Return similar elements of array in Ruby

Posted 2019-09-02 19:16

Question:

Say I have such an array:

arr = ['footballs_jumba_10', 'footballs_jumba_11', 'footballs_jumba_12',
       'footballs_jumba_14', 'alpha_romeo_11', 'alpha_romeo_12',
       'alpha_juliet_10', 'alpha_juliet_11']

If I wanted to return duplicates (assuming some of these strings in the array were exactly identical), I would just

return arr.select { |a| arr.count(a) > 1 }.uniq

But what if I wanted only the duplicated first 10 characters of the elements, without knowing the variations beforehand? Like this:

['footballs_', 'alpha_rome', 'alpha_juli']

Answer 1:

This is quite straightforward with the method Array#difference that I proposed in my answer here (a custom method, not the Array#difference built into Ruby 2.6 and later):

arr << "Let's add a string that appears just once"
  #=> ["footballs_jumba_10", "footballs_jumba_11", "footballs_jumba_12",
  #    "footballs_jumba_14", "alpha_romeo_11", "alpha_romeo_12",
  #    "alpha_juliet_10", "alpha_juliet_11", "Let's add a string that appears just once"]

a = arr.map { |s| s[0,10] }
  #=> ["footballs_", "footballs_", "footballs_", "footballs_", "alpha_rome",
  #    "alpha_rome", "alpha_juli", "alpha_juli", "Let's add "] 
b = a.difference(a.uniq)
  #=> ["footballs_", "footballs_", "footballs_", "alpha_rome", "alpha_juli"] 
b.uniq
  #=> ["footballs_", "alpha_rome", "alpha_juli"] 
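
The built-in Array#difference in Ruby 2.6 and later behaves like Array#-, so a.difference(a.uniq) would return an empty array there. The output above implies a count-aware version that removes one occurrence from the receiver for each occurrence in the argument. A minimal sketch under that assumption follows; the name difference_by_count and the extra string 'appears_once_only' are hypothetical, and a standalone method is used so the built-in is not overridden:

```ruby
# Hypothetical helper: removes one occurrence from `list` for each
# occurrence in `other` (Array#- would remove every occurrence).
def difference_by_count(list, other)
  # Count how many copies of each element still have to be removed.
  remaining = other.each_with_object(Hash.new(0)) { |e, counts| counts[e] += 1 }
  list.reject do |e|
    if remaining[e] > 0
      remaining[e] -= 1
      true
    else
      false
    end
  end
end

arr = ['footballs_jumba_10', 'footballs_jumba_11', 'footballs_jumba_12',
       'footballs_jumba_14', 'alpha_romeo_11', 'alpha_romeo_12',
       'alpha_juliet_10', 'alpha_juliet_11', 'appears_once_only']

prefixes = arr.map { |s| s[0, 10] }
# Dropping one copy of every prefix leaves only the surplus copies.
extras = difference_by_count(prefixes, prefixes.uniq)
duplicated_prefixes = extras.uniq
```

Here duplicated_prefixes contains only the prefixes that occur more than once; the prefix of the string that appears once is dropped.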


Answer 2:

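Use Array#uniq. Note that this returns every distinct prefix, so it matches the desired output only because each prefix in the example array occurs more than once: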

arr.map {|e| e[0..9]}.uniq
# => ["footballs_", "alpha_rome", "alpha_juli"]
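
If some prefixes might occur only once, grouping by prefix and keeping the groups with more than one member is one way to filter them out. This is a sketch rather than part of the original answer; the extra string 'appears_once_only' is an assumption added to show the difference:

```ruby
arr = ['footballs_jumba_10', 'footballs_jumba_11', 'footballs_jumba_12',
       'footballs_jumba_14', 'alpha_romeo_11', 'alpha_romeo_12',
       'alpha_juliet_10', 'alpha_juliet_11', 'appears_once_only']

# Map each 10-character prefix to the strings that share it.
groups = arr.group_by { |s| s[0, 10] }

# Keep only the prefixes shared by more than one string.
duplicated_prefixes = groups.select { |_prefix, members| members.size > 1 }.keys
```

With this input, arr.map { |e| e[0..9] }.uniq would also include "appears_on", while duplicated_prefixes does not.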


Answer 3:

You could do something like this:

def partial_duplicates(elements)
  unique = {}
  duplicates = {}

  elements.each do |e|
    partial = e[0..9]

    # If the element is in the hash, it is a duplicate.
    if first_element = unique[partial]
      duplicates[first_element] = true
      duplicates[e] = true
    else
      # include the element as unique
      unique[partial] = e
    end
  end

  duplicates.keys
end

This returns each duplicate only once, because the results are stored as Hash keys. If you want every occurrence, you can collect them in an Array instead.

Also, this returns the full strings rather than just the shared prefixes, as that seems more useful and is probably what you want:

partial_duplicates(arr)
=> ["footballs_jumba_10", "footballs_jumba_11", "footballs_jumba_12", "footballs_jumba_14", "alpha_romeo_11", "alpha_romeo_12", "alpha_juliet_10", "alpha_juliet_11"]

If you want only the duplicated prefixes, you can change the condition to:

if unique[partial]
  duplicates[partial] = true
else
  unique[partial] = true
end

then:

partial_duplicates(arr)
=> ["footballs_", "alpha_rome", "alpha_juli"]
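
For reference, the modified condition can be placed in the full method as sketched below. The method name partial_duplicate_prefixes and the length parameter are hypothetical additions so the prefix size is not hard-coded:

```ruby
# Sketch: returns each prefix of `length` characters that is shared by
# more than one element, in order of first repetition.
def partial_duplicate_prefixes(elements, length = 10)
  unique = {}
  duplicates = {}

  elements.each do |e|
    partial = e[0, length]

    if unique[partial]
      # Prefix seen before, so record it as a duplicate.
      duplicates[partial] = true
    else
      # First time this prefix appears.
      unique[partial] = true
    end
  end

  duplicates.keys
end

arr = ['footballs_jumba_10', 'footballs_jumba_11', 'footballs_jumba_12',
       'footballs_jumba_14', 'alpha_romeo_11', 'alpha_romeo_12',
       'alpha_juliet_10', 'alpha_juliet_11']

result = partial_duplicate_prefixes(arr)
```

Each element is visited once and the hash lookups are constant time on average, so this avoids the repeated scans of an arr.count-based approach.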