Question:
I am trying to split a string on a comma delimiter:
my $string='ab,12,20100401,xyz(A,B)';
my @array=split(',',$string);
If I split as above, the array will contain these values:
ab
12
20100401
xyz(A,
B)
I need the values below instead:
ab
12
20100401
xyz(A,B)
(should not split xyz(A,B) into 2 values)
How do I do that?
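Answer 1:
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);
my $string = "ab,12,20100401,xyz(A,B(a,d))";
my @params = ();
while (length $string) {
    if ($string =~ /^([^(]*?),/) {
        # The next field has no parenthesis before its comma: take it as is.
        push @params, $1;
        $string =~ s/^\Q$1\E\s*,?\s*//;
    } else {
        # The next field contains parentheses: let Text::Balanced find the matching close.
        my ($ext, $rest, $pre) = extract_bracketed($string, '()', '[^()]+');
        unless (defined $ext) {
            # Nothing bracketed is left, so the remainder is the last field.
            push @params, $string;
            last;
        }
        push @params, "$pre$ext";
        $string = $rest;
        $string =~ s/^\s*,\s*//;
    }
}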
This one supports:
- nested parentheses;
- empty fields;
- strings of any length.
Answer 2:
Here is one way that should work.
use Regexp::Common;
my $string = 'ab,12,20100401,xyz(A,B)';
my @array = ($string =~ /(?:$RE{balanced}{-parens=>'()'}|[^,])+/g);
Regexp::Common can be installed from CPAN.
There is a bug in this code, coming from the depths of Regexp::Common. Be warned that it will (unfortunately) not return an empty field for two adjacent commas (,,).
Answer 3:
Limit the number of elements it can be split into:
split(',', $string, 4)
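A quick sketch of what the limit does, and of its main restriction: it only helps when the number of fields is known in advance and the parenthesized field comes last. The second input string is a made-up example, not taken from the question.

```perl
use strict;
use warnings;

# With a limit of 4, split stops after three commas; the rest stays in one piece.
my $string = 'ab,12,20100401,xyz(A,B)';
my @array  = split(/,/, $string, 4);
print "$_\n" for @array;     # ab / 12 / 20100401 / xyz(A,B)

# Hypothetical input where the parenthesized field is not last:
# the limit no longer protects it from being split.
my @broken = split(/,/, 'ab,f(A,B),12,20100401', 4);
print "$_\n" for @broken;    # ab / f(A / B) / 12,20100401
```

So this approach is the simplest one here, but it is tied to the exact layout of the input.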
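Answer 4:
Here's another way:
my $string='ab,12,20100401,xyz(A,B)';
my @array = ($string =~ /(
    [^,]*\([^)]*\)   # comma inside parens is part of the word
    |
    [^,]*            # split on comma outside parens
    )
    (?:,|$)/gx);
# The pattern also matches the empty string at the very end of the input,
# which appends a spurious empty element; drop it.
pop @array if @array && $array[-1] eq '';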
Produces:
ab
12
20100401
xyz(A,B)
Answer 5:
Here is my attempt. It should handle nested parentheses to any depth and could easily be extended to other bracket symbols (though it is harder to be sure that they MATCH). In general, this method will not work for quotation marks instead of brackets.
#!/usr/bin/perl
use strict;
use warnings;
my $string='ab,12,20100401,xyz(A(2,3),B)';
print "$_\n" for parse($string);
sub parse {
    my ($string) = @_;
    my @fields;
    my @comma_separated = split(/,/, $string);
    my @to_be_joined;
    my $depth = 0;
    foreach my $field (@comma_separated) {
        my @brackets = $field =~ /(\(|\))/g;
        foreach (@brackets) {
            $depth++ if /\(/;
            $depth-- if /\)/;
        }
        if ($depth == 0) {
            push @fields, join(",", @to_be_joined, $field);
            @to_be_joined = ();
        } else {
            push @to_be_joined, $field;
        }
    }
    return @fields;
}
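The extension to other bracket symbols mentioned in this answer could be sketched roughly as follows. This is not the answer author's code: it swaps the split-and-rejoin approach for a character-by-character scan with a stack, so that mismatched pairs can be detected. The name split_top_level and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas that are outside (), [] and {}.
sub split_top_level {
    my ($string) = @_;
    my %closer_for = ('(' => ')', '[' => ']', '{' => '}');
    my %is_closer  = map { $_ => 1 } values %closer_for;

    my @fields;
    my @expected;        # stack of closing brackets we are still waiting for
    my $current = '';

    foreach my $char (split //, $string) {
        if ($char eq ',' && !@expected) {
            # Top-level comma: the current field is complete.
            push @fields, $current;
            $current = '';
            next;
        }
        if (exists $closer_for{$char}) {
            push @expected, $closer_for{$char};
        } elsif ($is_closer{$char}) {
            my $want = pop @expected;
            die "Unbalanced '$char' in '$string'\n"
                unless defined $want && $want eq $char;
        }
        $current .= $char;
    }
    die "Unclosed bracket in '$string'\n" if @expected;

    push @fields, $current;
    return @fields;
}

print "$_\n" for split_top_level('ab,12,xyz(A[2,3],{B,C})');
```

Unlike the version above, this sketch dies on unbalanced input instead of silently dropping the fields collected so far, and it keeps trailing empty fields. Both are design choices, not requirements from the question.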
Answer 6:
This is an old question, but I wrestled with this problem all night and the question was never marked answered, so in case anyone else arrives here from Google as I did, here is what I finally got. It is a very short answer using only built-in Perl regex features (recursive patterns and (*SKIP) need Perl 5.10 or later):
my $string='ab,12,20100401,xyz(A,B)';
$string =~ s/((\((?>[^)(]*(?2)?)*\))|[^,()]*)(*SKIP)([,])/$1\n/g;
my @array=split(/\n/,$string);
Commas that are not inside parentheses are changed to newlines and then the array is split on them. This will ignore commas inside any level of nested parentheses, as long as they're properly balanced with a matching number of open and close parens.
This assumes you won't have newline (\n) characters in the initial value of $string. If you do, either temporarily replace them with something else before the substitution and restore them in a loop after the split, or just pick a different delimiter to split the array on.
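The "different delimiter" option might look like the sketch below. It assumes the ASCII unit separator ("\x1F") never occurs in the data; that choice and the sample string are assumptions for illustration only.

```perl
use strict;
use warnings;

# Assumption: "\x1F" (ASCII unit separator) never appears in the input.
my $sep    = "\x1F";
my $string = "ab,line one\nline two,xyz(A,B(c,d))";

# Same pattern as in the answer, but top-level commas become $sep instead of "\n".
# The substitution runs on a copy, so $string itself is left untouched.
(my $marked = $string) =~ s/((\((?>[^)(]*(?2)?)*\))|[^,()]*)(*SKIP)([,])/$1$sep/g;
my @array = split /\Q$sep\E/, $marked;

print scalar(@array), " fields\n";    # 3 fields
```

The embedded newline in the second field survives because it is no longer used as the field marker.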