This question is about PCRE.
I have seen a recursive search for nested parentheses used with this construct:
\(((?>[^()]+)|(?R))*\)
The problem with this is that, while '[^()]+' can match any other character, including newline, the delimiters themselves are limited to single characters such as parentheses, braces or brackets, because a character class can only exclude individual characters.
What I am trying to do is replace the '(' and ')' characters with ANY kind of pattern (keywords such as 'BEGIN' and 'END', for example).
I have come up with the following construct:
(?xs) (?# <-- 'x' ignores whitespace in the pattern, and 's' allows '.'
to match newline )
(?P<pattern1>BEGIN)
(
(?> (?# <-- "once only" search )
(
(?! (?P=pattern1) | (?P<pattern2>END)).
)+
)
| (?R)
)*
END
This does work on input that looks like this:
BEGIN <<date>>
<<something>>
BEGIN
<<something>>
END <<comment>>
BEGIN <<time>>
<<more somethings>>
BEGIN(cause we can)END
BEGINEND
END
<<something else>>
END
This successfully matches any nested BEGIN..END pairs.
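For reference, the pairing I expect can be cross-checked outside PCRE with a stack-based scan. The following is only a minimal Python sketch, not the regex I am after: the function name find_nested_blocks and the shortened sample string are made up for illustration, and the delimiters are assumed to be plain regex fragments.

```python
import re

def find_nested_blocks(text, open_pat="BEGIN", close_pat="END"):
    """Return (start, end) spans of balanced blocks, in the order they close."""
    # Each delimiter is written once and reused to build the tokenizer.
    token = re.compile(f"(?P<open>{open_pat})|(?P<close>{close_pat})")
    stack = []  # start offsets of blocks that are still open
    spans = []  # completed (start, end) spans
    for m in token.finditer(text):
        if m.group("open") is not None:
            stack.append(m.start())
        elif stack:
            # A closer pairs with the most recent unmatched opener;
            # a closer with no opener on the stack is ignored.
            spans.append((stack.pop(), m.end()))
    return spans

sample = "BEGIN a BEGIN b END c BEGINEND END"
for start, end in find_nested_blocks(sample):
    print(repr(sample[start:end]))
```

On this sample the inner blocks 'BEGIN b END' and 'BEGINEND' are reported first, followed by the outermost block, which spans the whole string.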
I set up named groups pattern1 and pattern2 for BEGIN and END, respectively. Referring back to pattern1 with (?P=pattern1) works fine. However, I can't refer to pattern2 in the same way at the end of the pattern: I have to write out 'END' again.
Any idea how I can rewrite this regex so that I only have to specify each pattern once and can reuse it "everywhere" within the regex? In other words, so that I don't have to write END both in the middle of the pattern and at the very end.