Ruby str.match(regex) returns MatchData containing

Posted 2019-06-24 08:26

Question:

Using Ruby 2.2

I have strings like the following:

  • Weekly on Tuesday and Friday
  • Weekly on Monday, Wednesday and Saturday
  • Monthly every 2 weeks on Monday

To extract the days of the week from the strings above, I have written the following regex:

/\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b/

When I use the String#match instance method, the resulting MatchData doesn't contain all matches. For example, in the irb output below, when the string Weekly on Tuesday and Friday is matched against the regex above, the MatchData contains just Tuesday. I expected it to contain Friday too.

  2.2.1 :001 > str = "Weekly on Tuesday and Friday"
  => "Weekly on Tuesday and Friday" 
  2.2.1 :002 > regex = /\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b/
  => /\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b/ 
  2.2.1 :003 > str.match(regex)
  => #<MatchData "Tuesday" 1:"Tuesday"> 
  2.2.1 :004 > match_data = str.match(regex)
  => #<MatchData "Tuesday" 1:"Tuesday"> 
  2.2.1 :005 > match_data.captures
  => ["Tuesday"] 

Can anybody please explain why the MatchData contains only the first matched term when I haven't used any start/end anchors in my regex? I am sure my regex is missing something, but I am unable to figure out what.

Note

Rubular shows correct match groups for the same regex as can be seen at http://rubular.com/r/XZmrHPkjEk

Answer 1:

The MatchData returned by the .match() method describes only the first match, along with any groups captured by that match. I have just tested it, and I could only ever get one match with .match().

See Regular-Expressions.info details:

To test if a particular regex matches (part of) a string, you can either use the =~ operator, call the regexp object's match() method, e.g.: print "success" if subject =~ /regex/ or print "success" if /regex/.match(subject).

Also, from here:

String.=~(Regexp) returns the starting position of the first match or nil if no match was found
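As a quick illustration of the quoted behaviour, here is a minimal sketch using the string and regex from the question. The variable names position, first_capture and no_match are only illustrative.

```ruby
str = "Weekly on Tuesday and Friday"
regex = /\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b/

# =~ returns the index at which the first match starts ("Tuesday" here).
position = str =~ regex      # => 10

# =~ also sets $~ and $1, but only for that first match.
first_capture = $1           # => "Tuesday"

# With no match at all, =~ returns nil.
no_match = "Daily" =~ regex  # => nil
```

Like .match(), =~ stops at the first hit, so neither one will ever report Friday for this string.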

To obtain all matches, you need to use the .scan() method.
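A minimal sketch of the difference, using the string and regex from the question (the variable names first_day, days and all_matches are only illustrative):

```ruby
str = "Weekly on Tuesday and Friday"
regex = /\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b/

# String#match stops at the first match.
first_day = str.match(regex)[1]   # => "Tuesday"

# String#scan walks the whole string. Because the regex has one capture
# group, each element is a one-element array of captures, hence flatten.
days = str.scan(regex).flatten    # => ["Tuesday", "Friday"]

# If a MatchData object is needed for every match, collect
# Regexp.last_match inside the scan block.
all_matches = []
str.scan(regex) { all_matches << Regexp.last_match }
```

If the capture group is not needed, dropping the parentheses (or using a non-capturing group, (?:...)) makes scan return a flat array of strings directly.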