Ruby : Choosing between each, map, inject, each_wi

Published 2019-04-22 13:15

Question:

When I started writing Ruby many years ago, it took me a while to understand the difference between each and map. It only got worse when I discovered all the other Enumerable and Array methods.

With the help of the official documentation and many StackOverflow questions, I slowly began to understand what those methods did.

Here is what took me even longer to understand, though:

  • Why should I use one method or another?
  • Are there any guidelines?

I hope this question isn't a duplicate: I'm more interested in the "Why?" than the "What?" or "How?", and I think it could help Ruby newcomers.

Answer 1:

A more tl;dr answer:

How to choose between each, map, inject, each_with_index and each_with_object?

  • Use #each when you want "generic" iteration and don't care about the result. Example - you have numbers, you want to print the absolute value of each individual number:

    numbers.each { |number| puts number.abs }
    
  • Use #map when you want a new list, where each element is somehow formed by transforming the original elements. Example - you have numbers, you want to get their squares:

    numbers.map { |number| number ** 2 }
    
  • Use #inject when you want to somehow reduce the entire list to one single value. Example - you have numbers, you want to get their sum:

    numbers.inject(&:+)
    
  • Use #each_with_index in the same situation as #each, except you also want the index with each element:

    numbers.each_with_index { |number, index| puts "Number #{number} is at position #{index}" }
    
  • Uses for #each_with_object are more limited. The most common case is when you need something similar to #inject, but want a new collection (as opposed to a single value) that is not a direct mapping of the original. Example - number histogram (frequencies):

    numbers.each_with_object({}) { |number, histogram| histogram[number] = histogram[number].to_i.next }
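
Two edge cases in the examples above are worth a quick sketch. The helper names sum_of and histogram_of are hypothetical, not part of the original answer. Without a starting value, inject(&:+) returns nil for an empty list, and a Hash with a default of 0 removes the need for to_i:

```ruby
# Hypothetical helpers, for illustration only.

# Passing 0 as the starting value makes the sum of an empty list 0 rather than nil.
def sum_of(numbers)
  numbers.inject(0) { |sum, number| sum + number }
end

# Hash.new(0) returns 0 for missing keys, so each count can simply be incremented.
def histogram_of(numbers)
  numbers.each_with_object(Hash.new(0)) { |number, histogram| histogram[number] += 1 }
end

[].inject(&:+)
#=> nil
sum_of([])
#=> 0
histogram_of([3, 1, 3])
#=> {3=>2, 1=>1}
```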
    


Answer 2:

Which object can I use?

First, the object you're working with should be an Array, a Hash, a Set, a Range or any other object that responds to each. If it doesn't, it can often be converted to something that does. You cannot call each directly on a String, for example, because you need to specify whether you'd like to iterate over each byte, character or line.

"Hello World".respond_to?(:each)
#=> false
"Hello World".each_char.respond_to?(:each) 
#=> true
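
As a small sketch of the point above (the sample string and the Countdown class are made up for illustration), a String offers several enumerators, and any class that defines each can include Enumerable to gain map, select and friends:

```ruby
text = "Hi\nyo"
text.each_char.to_a
#=> ["H", "i", "\n", "y", "o"]
text.each_line.to_a
#=> ["Hi\n", "yo"]
text.each_byte.to_a
#=> [72, 105, 10, 121, 111]

# Hypothetical class: defining each and including Enumerable is enough.
class Countdown
  include Enumerable

  def initialize(start)
    @start = start
  end

  # Yields start, start - 1, ..., 1 to the given block.
  def each
    @start.downto(1) { |i| yield i }
  end
end

Countdown.new(3).map { |i| i * 10 }
#=> [30, 20, 10]
```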

I want to calculate something with each element, just like with a for loop in C or Java.

If you want to iterate over each element, do something with it and not modify the original object, you can use each. Please keep reading though, in order to know if you really should.

array = [1,2,3]

#NOTE: i is a bound variable, it could be replaced by anything else (x, n, element). It's a good idea to use a descriptive name if you can
array.each do |i|
  puts "La"*i
end
#=> La
#   LaLa
#   LaLaLa

It is the most generic iteration method, and you could write any of the other mentioned methods with it. We will actually, for pedagogical purposes only. If you spot a similar pattern in your code, you could probably replace it with the corresponding method.

It is basically never wrong to use each, but it is almost never the best choice. It is verbose and not Ruby-ish.

Note that each returns the original object, but this is rarely (never?) used. The logic happens inside the block, and should not modify the original object.
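
A quick sketch of that return value, using made-up data: the block's results are discarded, and each hands back the receiver itself.

```ruby
array = [1, 2, 3]
result = array.each { |i| i * 2 }

result
#=> [1, 2, 3]
result.equal?(array) # same object, not a copy
#=> true
```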

The only times I use each are:

  • when no other method would do. The more I learn about Ruby, the less often it happens.
  • when I write a script for someone who doesn't know Ruby, has some programming experience (e.g. C, Fortran, VBA) and would like to understand my code.

I want to get an Array out of my String/Hash/Set/File/Range/ActiveRecord::Relation

Just call object.to_a.

(1..10).to_a
#=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
"Hello world".each_char.to_a
#=> ["H", "e", "l", "l", "o", " ", "w", "o", "r", "l", "d"]
{:a => 1, :b => 2}.to_a
#=> [[:a, 1], [:b, 2]]
Movie.all.to_a #NOTE: Probably very inefficient. Try to keep an ActiveRecord::Relation as Relation for as long as possible.
#=> [Citizen Kane, Trois couleurs: Rouge, The Grapes of Wrath, ....

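Some methods described below (e.g. compact, uniq) were historically defined only for Arrays. Recent Ruby versions also define them on Enumerable (uniq since 2.4, compact since 3.1), but calling to_a first works everywhere.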

I want to get a modified Array based on the original object.

If you want to get an Array based on the original object, you can use map. The returned object will have the same size as the original one.

array = [1,2,3]

new_array = array.map do |i|
  i**2
end
new_array
#=> [1, 4, 9]

#NOTE: map is often used in conjunction with other methods. Here is the corresponding one-liner, without creating a new variable :
array.map{|i| i**2}
#=> [1, 4, 9]

# EACH-equivalent (For pedagogical purposes only):
new_array = []
array.each do |i|
  new_array << i**2
end
new_array
#=> [1, 4, 9]

The returned Array will not replace the original object.

This method is very widely used. It should be the first one you learn after each.

collect is a synonym of map. Pick one of the two and use it consistently in your projects.
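
When the block only calls one method on each element, Ruby's &:method_name shorthand is a common way to write it. A brief sketch with made-up data:

```ruby
words = %w(apple fig kiwi)

# These two calls are equivalent.
words.map { |word| word.upcase }
#=> ["APPLE", "FIG", "KIWI"]
words.map(&:upcase)
#=> ["APPLE", "FIG", "KIWI"]

# map works on any Enumerable, for example a Range, and always returns an Array.
tens = (1..3).map { |i| i * 10 }
tens
#=> [10, 20, 30]
```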

I want to get a modified Hash based on the original Hash.

If your original object is a Hash, map will return an Array anyway. If you want a Hash back:

hash = {a: 1, b: 2}
hash.map{|key, value| [key, value*2]}.to_h
#=> {:a=>2, :b=>4}

# EACH-equivalent
hash = {a: 1, b: 2}
new_hash = {}
hash.each do |key,value|
  new_hash[key]=value*2
end
new_hash
#=> {:a=>2, :b=>4}
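
Newer Ruby versions offer more direct ways to do this. The sketch below assumes Ruby 2.4 or later for transform_values and Ruby 2.6 or later for to_h with a block; on older versions, the map followed by to_h shown above remains the way to go.

```ruby
hash = {a: 1, b: 2}

# Ruby 2.4+: change the values, keep the keys.
hash.transform_values { |value| value * 2 }
#=> {:a=>2, :b=>4}

# Ruby 2.6+: to_h accepts a block that returns [key, value] pairs.
hash.to_h { |key, value| [key.to_s, value * 2] }
#=> {"a"=>2, "b"=>4}

# Neither call modifies the original Hash.
hash
#=> {:a=>1, :b=>2}
```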

I want to filter some elements.

I want to remove nil elements

You can call compact. It will return a new Array without the nil elements.

array = [1,2,nil,4,5]

#NOTE: array.map{|i| i*2} would raise a NoMethodError because of the nil element
array.compact
# => [1, 2, 4, 5]

# EACH-equivalent
new_array = []
array.each do |integer_or_nil|
  new_array << integer_or_nil unless integer_or_nil.nil?
end
new_array

I want to write some logic to determine if an element should be kept in the new Array

You can use select or reject.

integers = (1..10)
integers.select{|i| i.even?}
# => [2, 4, 6, 8, 10]
integers.reject{|i| i.odd?}
# => [2, 4, 6, 8, 10]

# EACH-equivalent
new_array = []
integers.each do |i|
  new_array << i if i.even?
end
new_array
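
If both groups are needed, calling select and reject separately walks the collection twice. A minimal sketch using Enumerable#partition, which returns both Arrays in one pass:

```ruby
integers = (1..10)

# partition returns two Arrays: elements for which the block is truthy, then the rest.
evens, odds = integers.partition { |i| i.even? }
evens
#=> [2, 4, 6, 8, 10]
odds
#=> [1, 3, 5, 7, 9]
```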

I want to remove duplicate elements from my Array

You can use uniq:

letters = %w(a b a b c)
letters.uniq
#=> ["a", "b", "c"]

# EACH-equivalent
uniq_letters = []
letters.each do |letter|
  uniq_letters << letter unless uniq_letters.include?(letter)
end
uniq_letters

#TODO: Add find/detect/any?/all?/count
#TODO: Add group_by/sort/sort_by
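
The methods named in the TODO above could be sketched as follows. The word list is made up for illustration; detect is a synonym of find.

```ruby
words = %w(apple fig banana kiwi)

# find returns the first element for which the block is truthy (or nil).
words.find { |word| word.length > 4 }
#=> "apple"

# any? and all? answer yes/no questions about the collection.
words.any? { |word| word.length == 3 }
#=> true
words.all? { |word| word.length >= 3 }
#=> true

# count with a block counts the matching elements.
words.count { |word| word.include?("a") }
#=> 2

# group_by builds a Hash of Arrays, keyed by the block's result.
words.group_by { |word| word.length }
#=> {5=>["apple"], 3=>["fig"], 6=>["banana"], 4=>["kiwi"]}

# sort uses the elements' own ordering, sort_by uses the block's result.
words.sort
#=> ["apple", "banana", "fig", "kiwi"]
words.sort_by { |word| word.length }
#=> ["fig", "kiwi", "apple", "banana"]
```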

I want to iterate over all the elements while counting from 0 to n-1

You can use each_with_index:

letters = %w(a b c)
letters.each_with_index do |letter, i|
  puts "Letter ##{i} : #{letter}"
end
#=> Letter #0 : a
#   Letter #1 : b
#   Letter #2 : c

#NOTE: There's a nice Ruby syntax if you want to use each_with_index with a Hash
hash = {:a=>1, :b=>2}
hash.each_with_index{|(key,value),i| puts "#{i} : #{key}->#{value}"}
# => 0 : a->1
#    1 : b->2

# EACH-equivalent
i = 0
letters.each do |letter|
  puts "Letter ##{i} : #{letter}"
  i+=1
end

each_with_index returns the original object.
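
each_with_index always counts from 0 and hands back the original object. When a different starting index or a new Array is wanted, the enumerator form with with_index is one option. A small sketch:

```ruby
letters = %w(a b c)

# with_index accepts an optional offset, here counting from 1.
letters.each.with_index(1) do |letter, i|
  puts "Letter ##{i} : #{letter}"
end
# prints:
#   Letter #1 : a
#   Letter #2 : b
#   Letter #3 : c

# Combined with map, the block's results are collected into a new Array.
letters.map.with_index(1) { |letter, i| "#{i}:#{letter}" }
#=> ["1:a", "2:b", "3:c"]
```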

I want to iterate over all the elements while setting a variable during each iteration and using it in the next iteration.

You can use inject:

gauss = (1..100)
gauss.inject{|sum, i| sum+i}
#=> 5050
#NOTE: You can specify a starting value with gauss.inject(0){|sum, i| sum+i}

# EACH-equivalent
sum = 0
gauss.each do |i|
  sum = sum + i
end
puts sum

It returns the variable as defined by the last iteration.

reduce is a synonym. As with map/collect, choose one name and stick with it.
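
For simple cases, inject also accepts the name of a binary method as a Symbol, and Ruby 2.4 and later ship a dedicated sum method. A short sketch, reusing the gauss range from above:

```ruby
gauss = (1..100)

gauss.inject(:+)
#=> 5050
gauss.inject(0, :+) # with an explicit starting value
#=> 5050
gauss.sum # Ruby 2.4+
#=> 5050

# The accumulator does not have to be a sum: here it is a running product.
(1..5).inject(1) { |product, i| product * i }
#=> 120
```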

I want to iterate over all the elements while keeping a variable available to each iteration.

You can use each_with_object:

letter_ids = (1..26)

letter_ids.each_with_object({}){|i,alphabet| alphabet[("a".ord+i-1).chr]=i}
#=> {"a"=>1, "b"=>2, "c"=>3, "d"=>4, "e"=>5, "f"=>6, "g"=>7, "h"=>8, "i"=>9, "j"=>10, "k"=>11, "l"=>12, "m"=>13, "n"=>14, "o"=>15, "p"=>16, "q"=>17, "r"=>18, "s"=>19, "t"=>20, "u"=>21, "v"=>22, "w"=>23, "x"=>24, "y"=>25, "z"=>26}

# EACH-equivalent
alphabet = {}
letter_ids.each do |i|
  letter = ("a".ord+i-1).chr
  alphabet[letter]=i
end
alphabet

It returns the variable as modified by the last iteration. Note that the order of the two block variables is reversed compared to inject.

If your variable is a Hash, you should probably prefer this method to inject, because h["a"]=1 returns 1, and it would require one more line in your inject block to return a Hash.
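
To make that difference concrete, here is a sketch of the same Hash built both ways. The helper names are hypothetical:

```ruby
# With inject, the block's return value becomes the next accumulator,
# so the Hash has to be returned explicitly on the last line.
def alphabet_with_inject(letter_ids)
  letter_ids.inject({}) do |alphabet, i|
    alphabet[("a".ord + i - 1).chr] = i
    alphabet
  end
end

# With each_with_object, the same object is passed to every iteration,
# whatever the block returns.
def alphabet_with_object(letter_ids)
  letter_ids.each_with_object({}) do |i, alphabet|
    alphabet[("a".ord + i - 1).chr] = i
  end
end

alphabet_with_inject(1..3)
#=> {"a"=>1, "b"=>2, "c"=>3}
alphabet_with_object(1..3)
#=> {"a"=>1, "b"=>2, "c"=>3}
```

Leaving out the final alphabet line in the inject version would make the second iteration receive an Integer instead of the Hash, and the assignment would then fail.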

I want something that hasn't been mentioned yet.

Then it's probably okay to use each ;)

Notes:

It's a work in progress, and I would appreciate any feedback. If it's interesting enough and fits on one page, I might extract a flowchart out of it.