Perl regular expression: match nested brackets

Posted 2020-02-03 07:33

I'm trying to match nested {} brackets with a regular expression in Perl so that I can extract certain pieces of text from a file. This is what I have currently:

my @matches = $str =~ /\{(?:\{.*\}|[^\{])*\}|\w+/sg;

foreach (@matches) {
    print "$_\n";
}

Sometimes this works as expected. For instance, if $str = "abc {{xyz} abc} {xyz}" I obtain:

abc
{{xyz} abc}
{xyz}

as expected. But for other input strings it does not function as expected. For example, if $str = "{abc} {{xyz}} abc", the output is:

{abc} {{xyz}}
abc

which is not what I expected. I would have wanted {abc} and {{xyz}} to be on separate lines, since each is balanced on its own in terms of brackets. Is there an issue with my regular expression? If so, how would I go about fixing it?

Tags: regex perl
7 answers
别忘想泡老子
#2 · 2020-02-03 08:12

One way is to use the core module Text::Balanced.

Content of script.pl:

#!/usr/bin/env perl

use warnings;
use strict;
use Text::Balanced qw<extract_bracketed>;

while ( <DATA> ) { 

    ## Remove '\n' from input string.
    chomp;

    printf qq|%s\n|, $_; 
    print "=" x 20, "\n";


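    ## In list context extract_bracketed returns (extracted, remainder, prefix).
    ## The third argument is a prefix pattern: here it skips all characters
    ## just before the first curly bracket.
    my @str_parts = extract_bracketed( $_, '{}', '[^{}]*' );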

    if ( $str_parts[2] ) { 
        printf qq|%s\n|, $str_parts[2];
    }   

    my $str_without_prefix = "@str_parts[0,1]";


    ## Extract data of balanced curly brackets, remove leading and trailing
    ## spaces and print.
    while ( my $match = extract_bracketed( $str_without_prefix, '{}' ) ) { 
        $match =~ s/^\s+//;
        $match =~ s/\s+$//;
        printf qq|%s\n|, $match;

    }   

    print "\n";
}

__DATA__
abc {{xyz} abc} {xyz}
{abc} {{xyz}} abc

Run it like:

perl script.pl

That yields:

abc {{xyz} abc} {xyz}
====================
abc 
{{xyz} abc}
{xyz}

{abc} {{xyz}} abc
====================
{abc}
{{xyz}}
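
As for why the original regex fails: the class [^\{] also matches a closing }, and \{.*\} is greedy, so the match does not stop at the first balanced group. If a regex-only solution is preferred, a recursive pattern (available since Perl 5.10) is one option. The following is a minimal sketch rather than a definitive fix; the helper name balanced_tokens is hypothetical and not part of the original post.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns every balanced
# {...} group or bare word found in $str, in order of appearance.
sub balanced_tokens {
    my ($str) = @_;

    my $token = qr/
        (                  # group 1: one balanced {...} block
            \{
            (?: [^{}]++    # a run of characters that are not braces
              | (?1)       # or a nested block: recurse into group 1
            )*
            \}
        )
        | \w+              # or a bare word outside any braces
    /x;

    my @tokens;
    while ( $str =~ /$token/g ) {
        push @tokens, $&;  # $& holds the whole text of the current match
    }
    return @tokens;
}

print "$_\n" for balanced_tokens('{abc} {{xyz}} abc');
```

For the second input from the question this prints {abc}, {{xyz}} and abc on separate lines. Unlike the Text::Balanced script above, it also returns the trailing word. Unbalanced braces are simply skipped, which may or may not be what you want.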