In Perl, how do I sort by frequency of a value?

Posted 2019-06-21 11:43

Question:

I am trying to write a program that counts the different values occurring in a column of a data file. For example, if the possible values of the column are A, B, and C, the output would be something like:

A   456
B   234
C   344

I have been able to get the running counts of A, B and C easily by doing something like this:

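my %count;
for my $f (@ffile) {

    # "or" binds more loosely than "||", so die runs when open itself fails
    open(my $fh, '<', $f) or die "Cannot open $f: $!";

    while (<$fh>) {
       chomp;
       my @U = split / /;

       $count{$U[2]}++;   # index 2 is the third space-separated field
    }

    close $fh;
}
foreach my $w (sort keys %count) {
    printf "%s\t%d\n", $w, $count{$w};
}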

For instance, here I am counting the values in element 2 of the split line (the third space-separated field) of each file in the given paths.

How do I sort the output by the counts rather than by the keys (the values A, B, C) to get:

A   456
C   344
B   234

Answer 1:

for my $w (sort {$count{$b} <=> $count{$a}} keys %count) {
    print "$w\t$count{$w}\n";
}


Answer 2:

This is a FAQ:

perldoc -q sort

use warnings;
use strict;

my %count = (
    A => 456,
    B => 234,
    C => 344
);

for my $w (sort { $count{$b} <=> $count{$a} } keys %count) {
    print "$w\t$count{$w}\n";
}

__END__
A       456
C       344
B       234
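
When two keys have the same count, the comparison above treats them as equal and their relative order is not something to rely on. A minimal sketch of one way to make the order deterministic, assuming ties should fall back to alphabetical key order (the key D and its count are hypothetical, added only to create a tie):

```perl
use strict;
use warnings;

# Hypothetical counts; B and D tie on purpose.
my %count = (A => 456, B => 234, C => 344, D => 234);

# Highest count first; when the counts are equal, <=> returns 0
# and "or" falls through to a string comparison of the keys.
my @by_freq = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;

print "$_\t$count{$_}\n" for @by_freq;
```

The low-precedence "or" is used so that the second comparison only runs when the numeric one reports a tie.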


Answer 3:

Some additional comments:

"The output is something like ... by doing something like this"

You help us help you if you paste your actual code, abbreviated where possible. When people recreate their actual code, they often obscure or omit the very source of their problem.

   chomp;
   my @U = split / /;

This splits on each single space character and takes the field after the second space as the value to count, so consecutive spaces produce empty fields; it's often easier to do:

   my @U = split ' ';

split used with a single literal space string instead of a regex splits on any run of whitespace, like split /\s+/ except that it also skips leading whitespace, so no empty first field is produced. This is common enough that there is special syntax for it. Note that the chomp becomes unnecessary, because the trailing newline counts as whitespace.
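
The difference between the three forms is easiest to see side by side. The following is a small sketch on a made-up input line (the contents of $line are an assumption chosen to include leading spaces, a double space, and a trailing newline):

```perl
use strict;
use warnings;

# Hypothetical input line: two leading spaces, a double space, trailing newline.
my $line = "  x  A y\n";

my @single = split / /,   $line;   # splits at every single space
my @regex  = split /\s+/, $line;   # splits on whitespace runs, keeps a leading empty field
my @awk    = split ' ',   $line;   # splits on whitespace runs, skips leading whitespace

# Show each field in brackets so empty fields and the newline are visible.
print "/ /   : ", join('', map { "[$_]" } @single), "\n";
print "/\\s+/ : ", join('', map { "[$_]" } @regex),  "\n";
print "' '   : ", join('', map { "[$_]" } @awk),    "\n";
```

With this input, split / / yields six fields (three of them empty, and the last still carrying the newline), split /\s+/ yields four with an empty first field, and split ' ' yields just x, A and y. An index such as $U[2] therefore refers to a different field depending on which form is used.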