I have the following Perl script counting the number of Fs and Ts in a string:
my $str = "GGGFFEEIIEETTGGG";
my $ft_count = 0;
$ft_count++ while($str =~ m/[FT]/g);
print "$ft_count\n";
Is there a more concise way to get the count (in other words, to combine line 2 and 3)?
my $ft_count = $str =~ tr/FT//;
See perlop.
If the REPLACEMENTLIST is empty, the
SEARCHLIST is replicated. This latter is useful for counting
characters in a class …
$cnt = $sky =~ tr/*/*/; # count the stars in $sky
$cnt = tr/0-9//; # count the digits in $_
Here's a benchmark:
use strict; use warnings;
use Benchmark qw( cmpthese );
my ($x, $y) = ("GGGFFEEIIEETTGGG" x 1000) x 2;
cmpthese -5, {
'tr' => sub {
my $cnt = $x =~ tr/FT//;
},
'm' => sub {
my $cnt = ()= $y =~ m/[FT]/g;
},
};
Rate tr m
Rate m tr
m 108/s -- -99%
tr 8118/s 7440% --
With ActiveState Perl 5.10.1.1006 on 32 Windows XP.
The difference seems to be starker with
C:\Temp> c:\opt\strawberry-5.12.1\perl\bin\perl.exe t.pl
Rate m tr
m 88.8/s -- -100%
tr 25507/s 28631% --
Yes, you can use the CountOf secret operator:
my $ft_count = ()= $str =~ m/[FT]/g;
When the "m" operator has the /g flag AND is executed in list context, it returns a list of matching substrings. So another way to do this would be:
my @ft_matches = $str =~ m/[FT]/g;
my $ft_count = @ft_matches; # count elements of array
But that's still two lines. Another weirder trick that can make it shorter:
my $ft_count = () = $str =~ m/[FT]/g;
The "() =" forces the "m" to be in list context. Assigning a list with N elements to a list of zero variables doesn't actually do anything. But then when this assignment expression is used in a scalar context ($ft_count = ...), the right "=" operator returns the number of elements from its right-hand side - exactly what you want.
This is incredibly weird when first encountered, but the "=()=" idiom is a useful Perl trick to know, for "evaluate in list context, then get size of list".
Note: I have no data on which of these are more efficient when dealing with large strings. In fact, I suspect your original code might be best in that case.
You can combine line 2, 3 and 4 into one like so:
my $str = "GGGFFEEIIEETTGGG";
print $str =~ s/[FT]//g; #Output 4;