Convert named matches in MatchData to Hash

Posted 2020-06-08 06:42

I have a rather simple regexp, but I wanted to use named captures to make it cleaner and then iterate over the results.

Testing string:

testing_string = "111x222b333"

My regexp:

regexp = %r{
                (?<width> [0-9]{3} ) {0}
                (?<height> [0-9]{3} ) {0}
                (?<depth> [0-9]+ ) {0}

                \g<width>x\g<height>b\g<depth>
            }x
dimensions = regexp.match(testing_string)

This works like a charm, but here's where the problem comes:

dimensions.each { |k, v| dimensions[k] = my_operation(v) }

# ERROR !

 undefined method `each' for #<MatchData "111x222b333" width:"111" height:"222" depth:"333">.

There is no .each method on a MatchData object, and I really don't want to monkey patch it.

How can I fix this problem?

I wasn't as clear as I thought: the point is to keep the names and a hash-like structure.

Tags: ruby regex hash
5 answers
Animai°情兽
#2 · 2020-06-08 07:12

@Phrogz's answer is correct if all of your captures have unique names, but you're allowed to give multiple captures the same name (the Regexp documentation shows an example of this). In that case a name maps to several values, so zipping names with captures loses data.

This code supports captures with duplicate names:

captures = Hash[
  dimensions.regexp.named_captures.map do |name, indexes|
    [
      name,
      indexes.map { |i| dimensions.captures[i - 1] }
    ]
  end
]

# Iterate over the captures
captures.each do |name, values|
  # name is a String
  # values is an Array of Strings
end
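
As a minimal, self-contained sketch of the approach above: the pattern and input string here are made up for illustration and are not from the question. It shows what Regexp#named_captures returns when a name is reused, and how the result differs from zipping names with captures.

```ruby
# Hypothetical pattern that uses the same name for two groups.
match = /(?<n>\d+)-(?<n>\d+)/.match("12-34")

# Regexp#named_captures maps each name to the 1-based indexes of its groups.
index_map = match.regexp.named_captures
#=> {"n"=>[1, 2]}

# Collect every captured value for each name.
captures = Hash[
  index_map.map do |name, indexes|
    [name, indexes.map { |i| match.captures[i - 1] }]
  end
]
#=> {"n"=>["12", "34"]}

# names lists each name only once, so zipping keeps just the first value.
zipped = Hash[match.names.zip(match.captures)]
#=> {"n"=>"12"}
```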
一夜七次
#3 · 2020-06-08 07:16

If you want to keep the names, you can do

new_dimensions = {}
dimensions.names.each { |k| new_dimensions[k] = my_operation(dimensions[k]) }
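
For a runnable version of the same idea, here is a rough sketch. It assumes a simplified pattern in place of the question's regexp, and my_operation is a hypothetical stand-in that converts each value to an Integer. Array#to_h needs Ruby 2.1 or later.

```ruby
# Simplified stand-in for the question's regexp (assumption).
regexp = /(?<width>\d{3})x(?<height>\d{3})b(?<depth>\d+)/
dimensions = regexp.match("111x222b333")

# Hypothetical my_operation: convert the captured string to an Integer.
def my_operation(value)
  value.to_i
end

# Build name/value pairs, then turn them into a Hash keyed by capture name.
new_dimensions = dimensions.names.map { |name| [name, my_operation(dimensions[name])] }.to_h
#=> {"width"=>111, "height"=>222, "depth"=>333}
```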
男人必须洒脱
#4 · 2020-06-08 07:26

So today a new Ruby version (2.4.0) was released which includes many new features, amongst them feature #11999, aka MatchData#named_captures. This means you can now do this:

h = '12'.match(/(?<a>.)(?<b>.)(?<c>.)?/).named_captures
#=> {"a"=>"1", "b"=>"2", "c"=>nil}
h.class
#=> Hash

So in your code change

dimensions = regexp.match(testing_string)

to

dimensions = regexp.match(testing_string).named_captures

Then dimensions is a plain Hash, so you can call each on it just like on any other Hash.
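
One thing to watch: match returns nil when the string doesn't match, so calling named_captures directly on the result would raise NoMethodError. A possible way to guard against that is sketched below. parse_dimensions is a hypothetical helper, the pattern is a simplified stand-in for the question's regexp, and Hash#transform_values (also Ruby 2.4+) plays the role of my_operation.

```ruby
# Simplified stand-in for the question's regexp (assumption).
regexp = /(?<width>\d{3})x(?<height>\d{3})b(?<depth>\d+)/

# Hypothetical helper: a Hash of Integer dimensions, or an empty Hash on no match.
def parse_dimensions(regexp, string)
  match = regexp.match(string)
  return {} if match.nil?

  # named_captures returns a Hash; transform_values keeps the keys and maps the values.
  match.named_captures.transform_values { |value| value.to_i }
end

parsed  = parse_dimensions(regexp, "111x222b333")
#=> {"width"=>111, "height"=>222, "depth"=>333}
missing = parse_dimensions(regexp, "no dimensions here")
#=> {}
```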

Fickle 薄情
#5 · 2020-06-08 07:26

I'd attack the whole problem of creating the hash a bit differently:

testing_string = "111x222b333"
hash = Hash[%w[width height depth].zip(testing_string.scan(/\d+/))]
#=> {"width"=>"111", "height"=>"222", "depth"=>"333"}

While regexes are powerful, their siren call can be too alluring, and we get sucked into using them when there are simpler, more straightforward ways of accomplishing something. It's just something to think about.


To keep track of the number of elements scanned, per the OP's comment:

hash = Hash[%w[width height depth].zip(scan_result = testing_string.scan(/\d+/))]
=> {"width"=>"111", "height"=>"222", "depth"=>"333"}
scan_result.size
=> 3

hash.size returns the same number, as does the size of the array containing the keys.
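
A caveat worth checking before relying on this: scan only collects runs of digits, so it doesn't verify the x and b separators or the number of fields. The sketch below, with made-up inputs, shows how zip behaves when the counts don't line up. The keys variable is just a name chosen for the example.

```ruby
keys = %w[width height depth]

# Three numbers, three keys: everything lines up.
complete = keys.zip("111x222b333".scan(/\d+/)).to_h
#=> {"width"=>"111", "height"=>"222", "depth"=>"333"}

# Only two numbers: zip pads the missing value with nil.
short = keys.zip("111x222".scan(/\d+/)).to_h
#=> {"width"=>"111", "height"=>"222", "depth"=>nil}

# Four numbers: zip stops at the length of keys, so "444" is silently dropped.
long = keys.zip("111x222b333c444".scan(/\d+/)).to_h
#=> {"width"=>"111", "height"=>"222", "depth"=>"333"}
```

Comparing the size of the scan result with keys.size is one way to detect both cases.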

贼婆χ
#6 · 2020-06-08 07:31

If you need a full Hash:

captures = Hash[ dimensions.names.zip( dimensions.captures ) ]
p captures
#=> {"width"=>"111", "height"=>"222", "depth"=>"333"}

If you just want to iterate over the name/value pairs:

dimensions.names.each do |name|
  value = dimensions[name]
  puts "%6s -> %s" % [ name, value ]
end
#=>  width -> 111
#=> height -> 222
#=>  depth -> 333

Alternatives:

dimensions.names.zip( dimensions.captures ).each do |name,value|
  # ...
end

[ dimensions.names, dimensions.captures ].transpose.each do |name,value|
  # ...
end

dimensions.names.each.with_index do |name,i|
  value = dimensions.captures[i]
  # ...
end
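
As a quick sanity check, the variants above should all produce the same name/value pairs as long as every capture name is unique. A small sketch, assuming a simplified pattern in place of the question's regexp:

```ruby
# Simplified stand-in for the question's regexp (assumption).
dimensions = /(?<width>\d{3})x(?<height>\d{3})b(?<depth>\d+)/.match("111x222b333")

# Look each value up by name.
by_lookup = dimensions.names.map { |name| [name, dimensions[name]] }

# Pair names with captures positionally.
by_zip = dimensions.names.zip(dimensions.captures)

# Same pairing via transpose (both arrays must have equal length).
by_transpose = [dimensions.names, dimensions.captures].transpose

# Index into captures explicitly.
by_index = dimensions.names.each_with_index.map { |name, i| [name, dimensions.captures[i]] }

expected = [["width", "111"], ["height", "222"], ["depth", "333"]]
all_equal = [by_lookup, by_zip, by_transpose, by_index].all? { |pairs| pairs == expected }
#=> true
```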