Find and keep duplicates in Ruby hashes

Posted 2019-02-17 09:37

Question:

I have an array of hashes where I need to find and store matches based on one matching value between the hashes.

a = [{:id => 1, :name => "Jim", :email => "jim@jim.jim"}, 
     {:id => 2, :name => "Paul", :email => "paul@paul.paul"}, 
     {:id => 3, :name => "Tom", :email => "tom@tom.tom"}, 
     {:id => 1, :name => "Jim", :email => "jim@jim.jim"}, 
     {:id => 5, :name => "Tom", :email => "tom@tom.tom"}, 
     {:id => 6, :name => "Jim", :email => "jim@jim.jim"}]

So I would want to return

b = [{:id => 1, :name => "Jim", :email => "jim@jim.jim"},  
     {:id => 3, :name => "Tom", :email => "tom@tom.tom"}, 
     {:id => 5, :name => "Tom", :email => "tom@tom.tom"}, 
     {:id => 6, :name => "Jim", :email => "jim@jim.jim"}]

Notes: I can sort the data (csv) by :name after the fact, so the results don't have to be nicely grouped, just accurate. Also, there won't necessarily be exactly two of the same; there could be 3 or 10 or more.

Also, the data is about 22,000 rows.

Answer 1:

I tested this and it will do exactly what you want:

b = a.group_by { |h| h[:name] }.values.select { |group| group.size > 1 }.flatten

However, you might want to look at some of the intermediate objects produced in that calculation and see if those are more useful to you.
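
For illustration, here is a minimal sketch of those intermediate objects, using the sample data from the question. The variable names groups, dupes and b_unique are my own and not part of the original answer. Note that the sample contains one row twice (id 1), so the flattened result has five rows; appending uniq is one possible way to drop exact repeats, assuming that is what the expected output in the question implies.

```ruby
a = [{:id => 1, :name => "Jim", :email => "jim@jim.jim"},
     {:id => 2, :name => "Paul", :email => "paul@paul.paul"},
     {:id => 3, :name => "Tom", :email => "tom@tom.tom"},
     {:id => 1, :name => "Jim", :email => "jim@jim.jim"},
     {:id => 5, :name => "Tom", :email => "tom@tom.tom"},
     {:id => 6, :name => "Jim", :email => "jim@jim.jim"}]

# Hash mapping each name to the array of rows that share it.
groups = a.group_by { |h| h[:name] }

# Keep only the names that occur more than once; rows stay grouped by name.
dupes = groups.select { |_name, rows| rows.size > 1 }

# Flatten back into a single array of rows, as the one-liner does.
b = dupes.values.flatten

# Optional (assumption): drop rows that are exact repeats of another row.
b_unique = b.uniq
```

The dupes hash may be the more useful object if the rows are going to be written out grouped by name anyway, since it already keeps each name's rows together.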



Tags: ruby arrays hash