I would like to iterate over matches in a text. The blocks I want to match start with a number followed by a tab character.
My beginning match is ^\d+\t, but is there a way to indicate that I want all text, including this match, up until the next match?
Input data:
1 111.111.111.111
111.111.111.111
Host IP 111.111.111.111
111.111.111.111
111.111.111.111 Host IP TCP app 11111, 11111, 11111, 11111 Allow
2 111.111.111.111
111.111.111.111
111.111.111.111 Host IP 111.111.111.111
111.111.111.111 Host IP TCP app 11111, 11111, 11111, 11111 Allow
3 111.111.111.111
111.111.111.111 Host IP 111.111.111.111
111.111.111.111
111.111.111.111
111.111.111.111 Host IP TCP app 11111, 11111, 11111, 11111 Allow
4 111.111.111.111
111.111.111.111
111.111.111.111
111.111.111.111 Host IP 111.111.111.111
111.111.111.111 Host IP TCP app 11111, 11111, 11111, 11111 Allow
I'm using Perl.
The following regex should do what you want:
^\d+\t(?:[^\d]+|[\d]+(?!\t))*
With the /m flag, this matches one or more digits followed by a tab at the start of a line, and then any number of runs of non-digits, or of digits that are not followed by a tab.
my @matches = $data =~ /^\d+\t(?:[^\d]+|[\d]+(?!\t))*/mg;
edit: Okay this one should work!
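To see what this pattern returns, here is a minimal sketch run against a small made-up sample. The sample string, and the assumption that a tab follows each record number, are for illustration only and are not taken from the real data.

```perl
use strict;
use warnings;

# Hypothetical sample: a tab is assumed after each leading record number.
my $data = "1\t111.111.111.111\n111.111.111.111\nHost IP\n"
         . "2\t111.111.111.111\nHost IP TCP app 11111, 11111 Allow\n";

# In list context, /g returns every whole match (the pattern has no capture groups).
my @matches = $data =~ /^\d+\t(?:[^\d]+|[\d]+(?!\t))*/mg;

print scalar(@matches), " blocks\n";
print "[$_]\n" for @matches;
```

As written, the pattern ends a block at any run of digits followed by a tab, not only one at the start of a line. With a multi-digit record number such as 10, the last alternative can also match the leading 1 on its own, which breaks the following record. It is therefore only safe when record numbers are single digits and no other digits inside a block are followed by a tab.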
Probably, this?
/^\d+\t.*?(?=^\d+\t|\z)/ms
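A minimal sketch of iterating over the blocks with a pattern of this shape, using a lookahead so that the next record's number is not consumed by the current match. The sample string and the @blocks array are hypothetical, and a tab is assumed after each record number.

```perl
use strict;
use warnings;

# Hypothetical sample; note the "11\t" in the middle of a line and the two-digit record 10.
my $data = "1\tfirst line\nmore 11\tdata\n2\tsecond\n10\tthird\nlast\n";

my @blocks;
# /m lets ^ match at every line start, /s lets . cross newlines,
# /g resumes each search where the previous match ended.
while ($data =~ /^(\d+)\t(.*?)(?=^\d+\t|\z)/msg) {
    push @blocks, [$1, $2];   # [record number, body]
}

print "$_->[0] => $_->[1]" for @blocks;
```

Because the lookahead is anchored with ^, digits followed by a tab in the middle of a line do not end a block, and multi-digit record numbers are handled.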
while (/
    \G
    ( \d+\t )                      # record number and tab
    ( (?: (?! ^\d+\t ) . )* )      # everything up to the next record start
/xmsg) {
    print("match: $1\n");
    print("buffer: $2\n");
}
Sample input and expected results would help. As it is, I'm not really sure I know what you're looking for.
If you're just matching on one pattern, you might be able to split the string:
my $string = "text\n1\ttest\n2\tend\n";
my @matches = split /^(\d+)\t/m, $string;
shift @matches; # remove the text before the first number
print "[$_]\n" for @matches;
__END__
Output:
[1]
[test
]
[2]
[end
]
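Since split returns the captured numbers and the text between them as one flat list, the items can be paired up afterwards. This is a minimal sketch under that assumption; the names @fields and @records are hypothetical.

```perl
use strict;
use warnings;

my $string = "text\n1\ttest\n2\tend\n";

# Drop the text before the first number by assigning it to undef.
my (undef, @fields) = split /^(\d+)\t/m, $string;

# Walk the list two items at a time: a record number, then its body.
my @records;
while (my ($number, $body) = splice(@fields, 0, 2)) {
    push @records, { number => $number, body => $body };
}

print "$_->{number}: $_->{body}" for @records;
```

The loop ends when splice returns an empty list, so every number stays paired with the text that followed it.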
If you're matching multiple patterns, Perl has special variables that let you find where a match starts and finishes. These can be used to extract what was between two matches.
use English qw(-no_match_vars);
my $string = "1\ttestEND\n2\ttextEND\n";
if ($string =~ /^\d+\t/) {
    my $last_match_end = $LAST_MATCH_END[0];
    if ($string =~ /END/cg) {
        my $last_match_start = $LAST_MATCH_START[0];
        my $len = $last_match_start - $last_match_end;
        print substr($string, $last_match_end, $len) . "\n";
    }
}
__END__
Output:
test
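The same offsets can also be used for the original problem, where the only pattern is the record start. The following is a minimal sketch, assuming each block runs from one record start to the next. It uses the built-in name $-[0], which is what $LAST_MATCH_START[0] refers to, and the @starts and @blocks arrays are hypothetical.

```perl
use strict;
use warnings;

my $string = "1\ttest\nmore\n2\ttext\n";

# Record the offset at which each record start matches.
my @starts;
push @starts, $-[0] while $string =~ /^\d+\t/mg;
push @starts, length $string;   # sentinel: end of the last block

# Each block spans from one record start to the next.
my @blocks = map { substr($string, $starts[$_], $starts[$_ + 1] - $starts[$_]) }
             0 .. $#starts - 1;

print "[$_]" for @blocks;
```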