Ruby method Array#<< not updating the array

Published 2019-01-07 13:56

Question:

Inspired by "How can I marshal a hash with arrays?", I wonder why Array#<< doesn't work as expected in the following code:

h = Hash.new{Array.new}
#=> {}
h[0]
#=> []
h[0] << 'a'
#=> ["a"]
h[0]
#=> [] # why?!
h[0] += ['a']
#=> ["a"]
h[0]
#=> ["a"] # as expected

Does it have to do with the fact that << changes the array in-place, while Array#+ creates a new instance?

Answer 1:

If you create a Hash using the block form of Hash.new, the block gets executed every time you try to access a key which doesn't actually exist in the hash. So, let's just look at what happens:

h = Hash.new { [] }
h[0] << 'a'

The first thing that gets evaluated here is the expression

h[0]

What happens when it gets evaluated? Well, the block gets run:

[]

That's not very exciting: the block simply creates an empty array and returns it. It doesn't do anything else. In particular, it doesn't change h in any way: h is still empty.

Next, the message << with one argument 'a' gets sent to the result of h[0] which is the result of the block, which is simply an empty array:

[] << 'a'

What does this do? It adds the element 'a' to an empty array, but since the array never gets assigned to a variable or stored in h, nothing holds a reference to it once the expression has been evaluated. It becomes eligible for garbage collection and is effectively gone.

Now, if you evaluate h[0] again:

h[0] # => []

h is still empty, since nothing ever got assigned to it, therefore the key 0 is still non-existent, which means the block gets run again, which means it again returns an empty array (but note that it is a completely new, different empty array now).
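The steps so far can be checked directly. This is only a small sketch; the variable name result is introduced here for illustration and is not part of the original code:

```ruby
# A hash whose default block returns a fresh array but never stores it.
h = Hash.new { [] }

# The block's array receives 'a', but nothing keeps a reference to it
# except the local variable we deliberately capture here.
result = h[0] << 'a'

p result      # => ["a"]  (the temporary array)
p h.key?(0)   # => false  (key 0 was never stored)
p h.size      # => 0
p h           # => {}
```

Capturing the return value in result is the only reason the mutated array is still reachable here; in the original snippet it is simply dropped.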

h[0] += ['a']

What happens here? First, the operator assignment gets desugared to

h[0] = h[0] + ['a']

Now, the h[0] on the right side gets evaluated. And what does it return? We already went over this: h[0] doesn't exist, therefore the block gets run, the block returns an empty array. Again, this is a completely new, third empty array now. This empty array gets sent the message + with the argument ['a'], which causes it to return yet another new array which is the array ['a']. This array then gets assigned to h[0].

Lastly, at this point:

h[0] # => ['a']

Now you have finally actually put something into h[0] so, obviously, you get out what you put in.
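One way to observe this is to count how often the default block runs. The counter calls below is a hypothetical helper added purely for illustration:

```ruby
# Count invocations of the default block (calls is an illustrative name).
calls = 0
h = Hash.new { calls += 1; [] }

h[0] += ['a']   # roughly: h[0] = h[0] + ['a']
p calls         # => 1  (the block ran once, for the read on the right-hand side)

h[0]            # key 0 now exists, so the block is not consulted
p calls         # => 1
p h             # => {0=>["a"]}
```

After the assignment, the key exists, so later lookups return the stored array without running the block again.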

So, to answer the question you probably had, why don't you get out what you put in? You didn't put anything in in the first place!

If you actually want to assign to the hash inside the block, you have to, well, assign to the hash inside the block:

h = Hash.new {|this_hash, nonexistent_key| this_hash[nonexistent_key] = [] }
h[0] << 'a'
h[0] # => ['a']

It's actually fairly easy to see what is going on in your code example if you look at the identities of the objects involved. Then you can see that every time you call h[0], you get a different array.
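A minimal sketch of that identity check, using equal? (which compares object identity, not contents). The names first, second and storing are made up for this example:

```ruby
# Non-storing default block: every lookup of a missing key builds a new array.
h = Hash.new { [] }
first  = h[0]
second = h[0]
p first == second        # => true   (same contents: both are empty)
p first.equal?(second)   # => false  (two distinct objects)

# Storing default block: the first lookup saves the array under the key.
storing = Hash.new { |hash, key| hash[key] = [] }
p storing[0].equal?(storing[0])   # => true  (same stored object both times)
```

Comparing first.object_id with second.object_id shows the same thing; the concrete id values vary from run to run.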



Answer 2:

The problem in your code is that h[0] creates a new Array and hands it out each time you index a missing key, but h[0] << 'a' doesn't store the modified Array anywhere, because there is no assignment.

Meanwhile h[0] += ['a'] works because it's equivalent to h[0] = h[0] + ['a']. It's the assignment ([]=) that makes the difference.

The first case may seem confusing, but it is useful when you just want to receive some unchanging default element from a Hash when the key is not found. Otherwise you could end up populating the Hash with a great number of unused values just by indexing it.
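The trade-off can be sketched by reading a few missing keys from both kinds of hash. The names plain and storing are illustrative only:

```ruby
plain   = Hash.new { [] }                           # returns a default, stores nothing
storing = Hash.new { |hash, key| hash[key] = [] }   # stores a default on first read

# Merely read three keys from each hash; nothing is appended.
(1..3).each do |k|
  plain[k]
  storing[k]
end

p plain.size     # => 0  (reads left the hash untouched)
p storing.size   # => 3  (each read created an entry)
```

Which behavior is preferable depends on whether lookups of missing keys should leave the hash untouched or create entries on demand.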



Answer 3:

h = Hash.new{ |a,b| a[b] = Array.new }
h[0] << "hello world"
#=> ["hello world"]
h[0]
#=> ["hello world"]


Tags: ruby arrays hash