String comparison for hash keys in Perl

Posted 2019-03-02 01:13

Question:

I have a hash map in Perl like this:

MAP_MESSAGE_TO_NUMBER => {
     'Hello World, I am XYZ'    => 11,
     'I am using Stack Overflow for Guidance'   => 12,
     'Programming is good!' => 13,
},

In my Perl code I am trying to match a string against the hash keys, and if a match happens I just return the corresponding hash value (a number).

My code is working fine.

my $Strtomatch = 'Hello World, I am XYZ!';
if ( some condition ) {
    my $val =   MAP_MESSAGE_TO_NUMBER->{$Strtomatch};
    # some code will use the return value 
    doSomethingWith $val;  
}

The problem: the value of the variable $Strtomatch has one extra character, '!', which is not present in the original hash key. Because of this, the hash lookup does not return any value.

My question is how I can make this more generic, so that the comparison succeeds even if only part of the string matches.

Even if just a few characters match, it is fine to return the value.

Let me know.

I am not really sure how I can use a regular expression here, because I am comparing a key in the hash map with a value coming from another function. I am very technical but not too good with programming, and I am trying things out to learn.

Answer 1:

Here's one way you could do it:

#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = ( 
     'Hello World, I am XYZ'    => 11,
     'I am using Stack Overflow for Guidance'   => 12,
     'Programming is good!' => 13,
);

my $str_to_match = 'Hello World, I am XYZ!';
#note - grep returns a list. We chuck any duplicate hits away. 
my ( $first_match ) = grep { $str_to_match =~ m/\Q$_\E/ } keys %msg_to_number;

print "$first_match   =  $msg_to_number{$first_match}\n";

Note - the pattern match in the grep is reversed. You check whether each hash key occurs inside your string, and return that key if it does. It'll only work if the key is a substring of (or exactly equal to) your primary string.

It also only takes a 'first' match, and hash keys come back in no defined order, so practically speaking - if more than one key matches, the result will be random. So make sure your hash keys are sufficiently unique.

E.g:

my $str_to_match = 'Hello World, I am XYZ!Programming is good!!!!!one';
my ( $first_match ) = grep { $str_to_match =~ m/\Q$_\E/ } keys %msg_to_number;
print "$first_match   =  $msg_to_number{$first_match}\n";

Will give you, at random, one of:

Programming is good!   =  13
Hello World, I am XYZ   =  11
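
If a random pick is not acceptable, one option is to choose among the hits by an explicit rule. The following is only a sketch: best_match is a hypothetical helper (not part of the answer above), and "longest key wins, ties broken alphabetically" is an assumed rule that may not suit every data set.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = (
    'Hello World, I am XYZ'                  => 11,
    'I am using Stack Overflow for Guidance' => 12,
    'Programming is good!'                   => 13,
);

# best_match is a hypothetical helper, not from the original answer.
# It returns the longest key that occurs inside $str, breaking ties
# alphabetically, or undef when no key is a substring of $str.
sub best_match {
    my ( $str, $map ) = @_;
    my @hits = grep { index( $str, $_ ) >= 0 } keys %$map;
    my ($best) = sort { length($b) <=> length($a) or $a cmp $b } @hits;
    return $best;
}

my $str_to_match = 'Hello World, I am XYZ!Programming is good!!!!!one';
my $key = best_match( $str_to_match, \%msg_to_number );
print defined $key ? "$key   =  $msg_to_number{$key}\n" : "no match\n";
```

With this rule the same input always yields the same key, regardless of the order in which keys returns them.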

As an alternative - one possibility is to perform a common transform on both the input string and the hash keys, which makes the comparison 'blind' to the differences.

E.g.

#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = ( 
     'Hello World, I am XYZ'    => 11,
     'I am using Stack Overflow for Guidance'   => 12,
     'Programming is good!' => 13,
);

my $str_to_match = 'Hello World, I am XYZ!!!!!';
my $transformed_match = $str_to_match =~ s/\W//gr;

my ( $first_match ) = grep { s/\W//gr =~ m/^\Q$transformed_match\E$/i } keys %msg_to_number;
print "$first_match   =  $msg_to_number{$first_match}\n";

This strips \W, which matches "non-word" characters (like punctuation and whitespace), from both sides and compares what is left. It means your matches are a bit fuzzier, and will allow arbitrary exclamation marks, spacing etc. Note that the /r modifier needs Perl 5.14 or later.

If you want to handle a default case, then the very handy // (defined-or) operator is what you want.
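
The same transform can also be applied once, up front, so that each lookup is a plain hash access again instead of a grep over every key. This is a sketch under assumptions: normalize and %normalized are hypothetical names, and it assumes no two keys collapse to the same normalized string.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = (
    'Hello World, I am XYZ'                  => 11,
    'I am using Stack Overflow for Guidance' => 12,
    'Programming is good!'                   => 13,
);

# normalize is a hypothetical helper: drop non-word characters and
# lower-case the rest, so punctuation, spacing and case are ignored.
sub normalize {
    my ($text) = @_;
    ( my $copy = $text ) =~ s/\W//g;    # copy first, so the caller's string is untouched
    return lc $copy;
}

# Build the normalized index once. Assumption: no two original keys
# normalize to the same string (a later key would overwrite an earlier one).
my %normalized =
    map { ( normalize($_), $msg_to_number{$_} ) } keys %msg_to_number;

my $str_to_match = 'hello world, I am XYZ!!!!!';
my $val = $normalized{ normalize($str_to_match) };
print defined $val ? "matched: $val\n" : "no match\n";
```

The trade-off is that this only finds whole-string matches after normalization, not substrings.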

return $msg_to_number{$first_match} // "default value here " ;

(or you can just test defined on $first_match)
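
Put together inside a sub, that could look like the sketch below. number_for and the default of -1 are hypothetical choices for illustration. The explicit defined test avoids an "uninitialized value" warning when nothing matched, and // needs Perl 5.10 or later.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = (
    'Hello World, I am XYZ'                  => 11,
    'I am using Stack Overflow for Guidance' => 12,
    'Programming is good!'                   => 13,
);

# number_for is a hypothetical wrapper; -1 is an assumed default value.
sub number_for {
    my ($str) = @_;
    my ($first_match) = grep { $str =~ m/\Q$_\E/ } keys %msg_to_number;

    # No key matched: return the default without touching the hash.
    return -1 unless defined $first_match;

    # Key matched: fall back to the default only if the stored value is undef.
    return $msg_to_number{$first_match} // -1;
}

print number_for('Hello World, I am XYZ!'), "\n";
print number_for('no such message'),        "\n";
```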

For case-insensitive matching, the i modifier to the regex will do the trick, as in the second example.



Answer 2:

You can compile a regex outside of an =~ operator using the qr quote-like operator. The downside to this approach is that now you have to iterate over the search keys to see if any patterns match. It will be much slower than a simple hash lookup.

use constant MAP_MESSAGE_TO_NUMBER => (
  [qr/Hello World, I am XYZ/,                  11],
  [qr/I am using Stack Overflow for Guidance/, 12],
  [qr/Programming is good!/,                   13],
);

my $Strtomatch = 'Hello World, I am XYZ!';
if ($some_condition) {
  foreach my $map (MAP_MESSAGE_TO_NUMBER) {
    my ($pattern, $val) = @$map;
    if ($Strtomatch =~ $pattern) {
      # some code will use the return value 
      doSomethingWith $val;
      # optionally exit the loop at this point with `last`, or store multiple match results 
    }
  }
  # optionally check if any match was found and print an error if not
}

We can't use a hash as the primary data structure because hash keys are always stored as plain strings, so our compiled regexes would become unblessed. That is why I used an array of arrays here. If you want to use a hash you can take a look at Tie::RegexpHash and/or Tie::Hash::Regex.
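
A small check of that stringification, using only core Perl. The variable names are made up for this sketch. It shows that the key no longer carries the Regexp class, although its text can still be interpolated into a new pattern (which is then recompiled).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $pattern  = qr/Programming is good!/;
my %by_regex = ( $pattern => 13 );    # the key is stored as a string
my ($key)    = keys %by_regex;

# A compiled regex is an object blessed into the class "Regexp".
print "pattern class: ", ref($pattern), "\n";

# The hash key is a plain string, so ref() returns the empty string.
print ref($key) eq '' ? "key is a plain string\n" : "key is still an object\n";

# The stringified text can still be interpolated into a pattern.
print "still matches\n" if 'Programming is good!!!' =~ /$key/;
```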



Tags: regex perl hash