Given a dummy function such as:
public function handle()
{
    if (isset($input['data'])) {
        switch($data) {
            ...
        }
    } else {
        switch($data) {
            ...
        }
    }
}
My intention is to get the contents of that function; the problem is matching nested patterns of curly braces {...}.
I've come across recursive patterns but couldn't get my head around a regex that would match the function's body.
I've tried the following (no recursion):
$pattern = "/function\shandle\([a-zA-Z0-9_\$\s,]+\)?". // match "function handle(...)"
'[\n\s]?[\t\s]*'. // regardless of the indentation preceding the {
'{([^{}]*)}/'; // find everything within braces.
preg_match($pattern, $contents, $match);
That pattern doesn't match at all. I am sure it is the last bit, '{([^{}]*)}/', that is wrong, since the pattern works when there are no other braces within the body.
By replacing it with:
'{([^}]*)}/';
It matched up to the closing } of the switch inside the if statement and stopped there (including the } of the switch but excluding that of the if).
This pattern gives the same result:
'{(\K[^}]*(?=)})/m';
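The behaviour of the two non-recursive attempts can be reproduced with a small script. This is only a sketch: the sample input, the `case` labels and the simplified `$prefix` are assumptions made for illustration, not the original code or pattern.

```php
<?php
// Hypothetical sample input, modelled on the dummy function above.
$contents = <<<'CODE'
public function handle()
{
    if (isset($input['data'])) {
        switch ($data) {
            case 1:
                break;
        }
    } else {
        switch ($data) {
            case 2:
                break;
        }
    }
}
CODE;

// Simplified prefix (an assumption): "function handle(...)" plus trailing whitespace.
$prefix = 'function\s+handle\s*\([^)]*\)\s*';

// Attempt 1: [^{}]* cannot cross the "{" of the if, so the closing "}" is never reached.
$strict = preg_match('/' . $prefix . '\{([^{}]*)\}/', $contents, $m1);

// Attempt 2: [^}]* crosses "{" but stops at the first "}", which closes the first switch.
$loose = preg_match('/' . $prefix . '\{([^}]*)\}/', $contents, $m2);

var_dump($strict);                    // int(0)
var_dump(substr_count($m2[1], '{'));  // int(2): the if and the first switch were opened
var_dump(strpos($m2[1], 'else'));     // bool(false): the else branch was never reached
```

Neither character class can tell which closing brace belongs to which opening brace, which is why a recursive pattern (or a brace counter) is needed.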
This works to generate a header file (.h) from inline function blocks (.c).
Find Regular expression:
Replace with:
For input:
will output:
Get the body of the function block with the second matched pattern:
will output:
Update #2
According to others' comments
Note: A short regex, i.e. {((?>[^{}]++|(?R))*)}, is enough if you know your input does not contain { or } outside of PHP syntax.
So in what evil cases does the long regex work?
[{}] in a string between quotation marks ["']
[{}] in a comment block: //... or /*...*/ or #...
[{}] in a heredoc or nowdoc: <<<STR or <<<['"]STR['"]
Otherwise, the input is expected to have balanced pairs of opening and closing braces; the depth of nesting does not matter.
Is there a case where it fails?
No, unless you have a Martian living inside your code.
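For the simple case covered by the note, where braces only appear as PHP syntax, the short recursive pattern can be tried as follows. This is a minimal sketch, not the long regex: the sample input, the `case` labels and the `function handle(...)` prefix are assumptions. Because a prefix is added, the recursion uses `(?1)` (recurse into group 1) rather than `(?R)`, which would recurse into the whole pattern, prefix included.

```php
<?php
// Hypothetical sample input, modelled on the function from the question.
$contents = <<<'CODE'
public function handle()
{
    if (isset($input['data'])) {
        switch ($data) {
            case 1:
                break;
        }
    } else {
        switch ($data) {
            case 2:
                break;
        }
    }
}
CODE;

// Group 1 is one balanced {...} block; (?1) re-enters it for every nested block.
// Group 2 captures everything between the outermost braces.
$pattern = '/function\s+handle\s*\([^)]*\)\s*(\{((?>[^{}]++|(?1))*)\})/';

$body = null;
if (preg_match($pattern, $contents, $match)) {
    $body = trim($match[2]);
    echo $body, PHP_EOL;
}
```

With this input, `$body` starts at the `if` and ends at the closing brace of the `else` block. It will still be fooled by braces inside strings, comments, heredocs or nowdocs, which are the cases the long regex is meant to handle.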
Formatting is done by @sln's RegexFormatter software.
What did I provide in the live demo?
Laravel's Eloquent Model.php file (~3500 lines), picked at random, is given as input. Check it out: Live demo