Comparison of string against key-value pair in hash

Posted 2019-09-02 14:42

Question:

I have key-value pairs in a hash in Perl. Assume all keys are unique.

For example:

my %msg_to_number = ( 
 'Hello World, I am XYZ'    => 11,
 'I am using Stack Overflow for Guidance'   => 12,
 'Programming is good!' => 13,
);

Now suppose the input strings I want to compare against the keys look like this:

str1 = Hello World, I am XYZ;
str2 = Hello World, I am XYZ and ABC;

The code below maps str1 to the correct hash key, but it fails for str2.

My question is: how can I modify the code below so that it works in both cases? The lookup should return 11 for both str1 and str2. That is, a hash key should count as a match whether it matches the complete string or only part of it. (I am assuming a partial match will always occur at the beginning of the sentence being compared, which simplifies things a bit.)

Right now, the code below strips non-word characters such as ! and #, then compares the results case-insensitively.

#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = ( 
 'Hello World, I am XYZ'    => 11,
 'I am using Stack Overflow for Guidance'   => 12,
 'Programming is good!' => 13,
);

my $str_to_match = 'Hello World, I am XYZ!!!!!';
my $transformed_match = $str_to_match =~ s/\W//gr;

my ( $first_match ) = grep { s/\W//gr =~ m/^\Q$transformed_match\E$/i } keys %msg_to_number;
print "$first_match   =  $msg_to_number{$first_match}\n";

I have tried playing with the regex in the code above but was not able to make it work. It would be great if someone could suggest changes or a different method that does the same thing (the original logic the code performs now, plus the partial comparison). This is a follow-up question on Stack Overflow.

Thanks

Updated: Example of what should match and what should not match.

Assume the hash below:

my %msg_to_number = (
 'Hello World, I am XYZ'    => 11,
 'I am using Stack Overflow for Guidance'   => 12,
 'Programming is good!' => 13,
);

str1 = Hello World, I am XYZ
str2 = Hello World
str3 = Hello World, I am XYZ, ABC and EFG.

In the above, str1 and str2 should match, whereas str3 should not.

As I said, a partial match on the starting part should still count as a match.

Let me know if this clarifies the use case.

Answer 1:

I'm not sure what you want to do, but it seems that you need a three-way match. If you use regexes, you need to make sure that you either match all the cases you want or reject all the cases you do not want. The following script may be closer to what you need. It matches (1) your input string, (2) your hash key, and (3) what you appear to be looking for.

   use strict;
   use warnings;

   my %msg_to_number = ( 
    'Hello World, I am XYZ'                    => 11,
    'I am using Stack Overflow for Guidance'   => 12,
    'Programming is good!'                     => 13,
   );

   while(<DATA>)
   {
       chomp;
       foreach my $k (keys %msg_to_number)
       {
           print "$_, $msg_to_number{$k}\n" if $_ =~ /Hello World/ and $k =~ /Hello World/;
       }
   }
   exit(0);

   __DATA__
   Hello World
   Hello World, I am ABC
   I am using Stack Overflow for Guidance
   Programming is good
   Hello World, I am ABC, DEF, GHI

Here is the output:

Hello World, 11
Hello World, I am ABC, 11
Hello World, I am ABC, DEF, GHI, 11


Answer 2:

This may be as simple as:

my ( $first_match ) = grep { s/\W//gr =~ m/\Q$transformed_match\E/i } keys %msg_to_number;

Remove the pattern anchors; then, provided $transformed_match is a substring of the (transformed) key, it will match.
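Since the question assumes a partial match always starts at the beginning of the sentence, one possible middle ground is to keep the leading ^ and drop only the trailing $. The following is a minimal sketch of that idea, not a tested drop-in: find_key_by_prefix is a hypothetical helper name, the keys are sorted only to make "first match" deterministic, and the guard against an empty input is an added assumption.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = (
    'Hello World, I am XYZ'                  => 11,
    'I am using Stack Overflow for Guidance' => 12,
    'Programming is good!'                   => 13,
);

# Hypothetical helper: return the first key (in sorted order) whose
# normalized form starts with the normalized input, or nothing.
sub find_key_by_prefix {
    my ($input) = @_;

    # Strip non-word characters, as in the question's code.
    my $needle = $input =~ s/\W//gr;

    # Assumption: an empty needle would match every key, so reject it.
    return if $needle eq '';

    # Keep ^ (match at the start only), drop $ (allow a longer key).
    my ($key) = grep { s/\W//gr =~ m/^\Q$needle\E/i } sort keys %msg_to_number;
    return $key;
}

for my $str ( 'Hello World', 'Hello World, I am XYZ!!!!!', 'World' ) {
    my $key = find_key_by_prefix($str);
    print "$str => ", ( defined $key ? $msg_to_number{$key} : 'no match' ), "\n";
}
```

With this variant an input that is longer than the key (for example one that continues with "and ABC") does not match, because the whole normalized input has to appear at the start of the key.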

Or you can reverse that - so if the key is the substring, it matches:

#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = ( 
 'Hello World, I am XYZ'    => 11,
 'I am using Stack Overflow for Guidance'   => 12,
 'Programming is good!' => 13,
);

my $str_to_match = 'Hello World, I am XYZ and ABC!!!!!';
my $transformed_match = $str_to_match =~ s/\W//gr;

my ( $first_match ) = grep {
    my $tr_key = s/\W//gr;
    # Match if either normalized string contains the other, ignoring case.
    $transformed_match =~ m/\Q$tr_key\E/i or $tr_key =~ m/\Q$transformed_match\E/i
} keys %msg_to_number;
print "$first_match   =  $msg_to_number{$first_match}\n";

(There may be a way to do the transform-and-match within the regex - I'm not 100% sure. But it's probably not a great idea anyway!)
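If the match should go both ways but only at the start of the string, the same check can be written without a regex for the comparison itself. The sketch below swaps in lc and index for the case-insensitive regex match; normalize and lookup_number are hypothetical helper names, and sorting the keys is an assumption made so that the "first" match is predictable.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my %msg_to_number = (
    'Hello World, I am XYZ'                  => 11,
    'I am using Stack Overflow for Guidance' => 12,
    'Programming is good!'                   => 13,
);

# Hypothetical helper: strip non-word characters and lowercase the rest.
sub normalize {
    my ($text) = @_;
    return lc( $text =~ s/\W//gr );
}

# Hypothetical helper: return the number of the first key (in sorted
# order) where the key is a prefix of the input or the input is a
# prefix of the key. Returns nothing when no key qualifies.
sub lookup_number {
    my ($input) = @_;
    my $needle = normalize($input);

    # Assumption: an empty needle is a prefix of everything, so reject it.
    return if $needle eq '';

    for my $key ( sort keys %msg_to_number ) {
        my $norm_key = normalize($key);

        # index(...) == 0 means the second string sits at the very
        # start of the first one.
        return $msg_to_number{$key}
            if index( $norm_key, $needle ) == 0
            or index( $needle, $norm_key ) == 0;
    }
    return;
}

for my $str ( 'Hello World, I am XYZ', 'Hello World, I am XYZ and ABC', 'hello world', 'Goodbye' ) {
    my $number = lookup_number($str);
    printf "%-32s => %s\n", $str, defined $number ? $number : 'no match';
}
```

Dropping the second index test leaves only the "input is a prefix of the key" direction, which is closer to the behaviour described in the question's update.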