Say I have this piece of text:
[quote=XXXXXX]ABC[quote=YYYYYY]DEF[/quote]GHI[/quote]JKL[quote=ZZZZZ]MNO[/quote]
How can I remove everything between each outermost pair of [quote] and [/quote] tags, tags included, so that the text above becomes JKL ([quote=XXXXXX]...[/quote] gets deleted, and so does [quote=ZZZZZ]...[/quote])? Note that it shouldn't remove the whole text just because it starts and ends with a quote tag, nor should it stop at the first closing tag and remove only [quote=XXXXXX]ABC[quote=YYYYYY]DEF[/quote]. Is this even possible with regex?
Thanks for answering! :)
To match a nested structure, you can write a recursive pattern (a pattern that refers to itself with (?R)):
$pattern = '~\[quote\b[^]]*][^[]*+(?:\[(?!/?quote\b)[^[]*|(?R)[^[]*)*+\[/quote]~i';
$txt = preg_replace($pattern, '', $txt);
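For readability, here is a sketch of the same recursive pattern spread over several lines with the x (extended) modifier, so each part can carry a comment. It should behave the same as the one-liner above; the sample string is the one from the question, and the $result variable name is only for illustration.

```php
<?php
// Same recursive pattern as above, laid out with the x flag
// (whitespace outside character classes is ignored, # starts a comment).
$pattern = '~
    \[quote\b [^]]* ]            # opening quote tag, with or without =name
    [^[]*+                       # plain text up to the next bracket
    (?:
        \[ (?!/?quote\b) [^[]*   # a bracket that is not part of a quote tag
      |
        (?R) [^[]*               # a whole nested quote block, via recursion
    )*+
    \[/quote]                    # the closing tag that balances the opening one
~xi';

$txt = '[quote=XXXXXX]ABC[quote=YYYYYY]DEF[/quote]GHI[/quote]JKL[quote=ZZZZZ]MNO[/quote]';
$result = preg_replace($pattern, '', $txt);

echo $result, PHP_EOL; // JKL
```

Because the pattern only succeeds when a closing tag is found, an opening tag that is never closed is left in place rather than swallowing the rest of the text.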
If that seems too complicated, you can instead write a pattern that matches only the innermost quoted parts and apply it repeatedly until the count parameter of preg_replace comes back as zero:
$pattern = '~\[quote\b[^]]*][^[]*+(?:\[(?!/?quote\b)[^[]*)*+\[/quote]~i';
do {
    $txt = preg_replace($pattern, '', $txt, -1, $count);
} while ($count);
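As a self-contained sketch, the loop can be wrapped in a small helper and run on the sample from the question. The function name stripQuotes and the null check are my additions, not part of the approach itself; preg_replace returns null on error, so the check simply keeps the last good text in that case.

```php
<?php
// Hypothetical helper: removes innermost quote blocks pass after pass
// until a pass makes no replacement.
function stripQuotes(string $txt): string
{
    // Matches a quote block that contains no other quote tag.
    $pattern = '~\[quote\b[^]]*][^[]*+(?:\[(?!/?quote\b)[^[]*)*+\[/quote]~i';

    do {
        $next = preg_replace($pattern, '', $txt, -1, $count);
        if ($next === null) {
            // Regex engine error: stop and return what we have so far.
            break;
        }
        $txt = $next;
    } while ($count);

    return $txt;
}

$sample = '[quote=XXXXXX]ABC[quote=YYYYYY]DEF[/quote]GHI[/quote]JKL[quote=ZZZZZ]MNO[/quote]';
echo stripQuotes($sample), PHP_EOL; // JKL
```

On this sample the first pass removes the YYYYYY and ZZZZZ blocks, the second removes the XXXXXX block that has now become innermost, and the third finds nothing and ends the loop.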