Selecting highest count of element except when…

Posted 2019-09-01 01:43

Question:

I have been working on a Perl script that analyzes and counts identical letters across different lines. I store the counts in a hash, but I am having trouble excluding the "-" character from the hash's output. I tried using delete and next if, but the "-" count still shows up in the output.

So with this input:

@extract = ------------------------------------------------------------------MGG-------------------------------------------------------------------------------------

And following code:

#Count selected amino acids.
my %counter = ();
foreach my $extract(@extract) {
#next if $_ =~ /\-/; #This line code does not function correctly.  
$counter{$_}++;

}


sub largest_value_mem (\%) {
my $counter   = shift;
my ($key, @keys) = keys   %$counter;
my ($big, @vals) = values %$counter;

for (0 .. $#keys) {
    if ($vals[$_] > $big) {
        $big = $vals[$_];
        $key = $keys[$_];
    }
}
$key

}

I expect the most common element to be G, and that is what the output should show. If there is a tie between elements, say G and M have the same count, displaying both would be great but is not necessary. Any tips on how to delete or remove the '-' are much appreciated. I am slowly learning Perl.

Please let me know if what I am asking is not clear or if more information is needed. Thanks again for all the comments.

Answer 1:

Your data doesn't entirely make sense, since it's not actually working Perl code. I'm guessing that it's a string divided into characters. After that, it sounds like you just want to find the highest-frequency character, which is essentially a sort by descending count.

Therefore the following demonstrates how to count your characters and then sort the results:

use strict;
use warnings;

my $str = '------------------------------------------------------------------MGG-------------------------------------------------------------------------------------';

my @chars = split '', $str;

# Count characters
my %count;
$count{$_}++ for @chars;
delete $count{'-'}; # Don't count -

# Sort keys by count descending
my @keys = sort {$count{$b} <=> $count{$a}} keys %count;

for my $key (@keys) {
    print "$key $count{$key}\n";
}

Outputs:

G 2
M 1
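
The question also asks about ties. The sort above lists every character, so one way to report all characters that share the top count is to find the maximum and then filter on it. This is only a sketch: most_common is a hypothetical helper name, and the sample string with a G/M tie is made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper (not part of the original answer): returns every key
# whose count equals the highest count in the hash, in alphabetical order.
sub most_common {
    my ($count) = @_;              # hash reference: character => count
    return () unless %$count;      # empty hash: nothing to report
    my $top  = max(values %$count);
    my @keys = grep { $count->{$_} == $top } keys %$count;
    return sort @keys;
}

# Made-up sample in which G and M tie at two occurrences each.
my $str = '--MGG--MA-';

my %count;
$count{$_}++ for grep { $_ ne '-' } split //, $str;   # skip '-' while counting

my @winners = most_common(\%count);
print "Most common: @winners\n";
```

Filtering out '-' before counting has the same effect here as deleting the key afterwards; either approach keeps the gap character out of the result.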


Answer 2:

foreach my $extract(@extract) {
#next if $_ =~ /\-/

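Because the loop names its variable $extract, $_ is not set by this loop. It keeps whatever value it had in the enclosing code (often undef), so both the commented-out test and $counter{$_}++ operate on the wrong variable. Use $extract in both places.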

Also, you can use a character class for better readability:

next if $extract=~/[-]/;
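
Putting these points together, a corrected counting loop might look like the sketch below. The question does not show how @extract is built, so the split into single characters and the shortened sample string are assumptions.

```perl
use strict;
use warnings;

# Assumption: @extract holds one character per element.
my @extract = split //, '---MGG---';

my %counter;
foreach my $extract (@extract) {
    next if $extract eq '-';   # skip gaps; test the named loop variable, not $_
    $counter{$extract}++;
}

# Print the counts in alphabetical order of the characters.
print "$_ => $counter{$_}\n" for sort keys %counter;
```

Since each element is a single character, a plain string comparison with eq is enough here; the regex form shown above would also work.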