I get different output when using a list comprehension versus a generator comprehension. Is this expected behavior or a bug?
Consider the following setup:
all_configs = [
{'a': 1, 'b':3},
{'a': 2, 'b':2}
]
unique_keys = ['a','b']
If I then run the following code, I get:
print(list(zip(*( [c[k] for k in unique_keys] for c in all_configs))))
>>> [(1, 2), (3, 2)]
# note the ( vs [
print(list(zip(*( (c[k] for k in unique_keys) for c in all_configs))))
>>> [(2, 2), (2, 2)]
This is on Python 3.6.0:
Python 3.6.0 (default, Dec 24 2016, 08:01:42)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Both arguments are generator expressions. The first is a single generator that yields lists; the second is a generator that yields generators.
When you use zip(* in the first expression, unpacking the outer generator produces complete lists, just as list() would, so you get the output you expect. In the second expression, unpacking produces two inner generators that have not run yet, and zip then consumes them. Evaluated on their own at that point, those generators give a different result from the lists in the first expression.
This would be the list comprehension:
To see what's going on, replace c[k] with a function with a side effect and compare the output of the two versions.
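A minimal sketch of that experiment, assuming a hypothetical helper named lookup that records every call in a list named calls before returning c[k] (neither name comes from the original post):

```python
all_configs = [
    {'a': 1, 'b': 3},
    {'a': 2, 'b': 2},
]
unique_keys = ['a', 'b']

calls = []  # records the order in which lookups actually happen


def lookup(c, k):
    """Hypothetical stand-in for c[k] that logs each call as a side effect."""
    calls.append((c['a'], k))
    return c[k]


# Inner list comprehension: each list is built while the outer loop
# is still on the matching c.
list_result = list(zip(*([lookup(c, k) for k in unique_keys] for c in all_configs)))
list_calls = calls[:]
calls.clear()

# Inner generator expression: lookups are deferred until zip pulls values,
# by which time the outer loop has already finished.
gen_result = list(zip(*((lookup(c, k) for k in unique_keys) for c in all_configs)))
gen_calls = calls[:]

print(list_result, list_calls)
print(gen_result, gen_calls)
```

In the list version the recorded keys alternate a, b, a, b and each call sees its own config. In the generator version the keys come out as a, a, b, b and every call sees the second config.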
In the generator expression case, c is evaluated after the outer loop has completed, so c bears the last value it took in the outer loop. In the list comprehension case, c is evaluated at once. (Note that the order of the calls also differs, aabb vs abab, because of execution when zipping vs execution at once.)
Note that you can keep the "generator" way of doing it (not creating the temporary list) by passing c to map, so that the current value is stored. In Python 3, map does not create a list, but the result is still OK: [(1, 2), (3, 2)]
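One way the map approach could be written is sketched below. The lambda and the names rows and result are assumptions for illustration, not the original answer's code:

```python
all_configs = [{'a': 1, 'b': 3}, {'a': 2, 'b': 2}]
unique_keys = ['a', 'b']

# Each call to the lambda receives its own c as a parameter, so every
# inner generator closes over a separate binding instead of a shared
# loop variable.
rows = map(lambda c: (c[k] for k in unique_keys), all_configs)

result = list(zip(*rows))
print(result)  # [(1, 2), (3, 2)]
```

The inner generators are still lazy here; only the binding of c is fixed at the moment each one is created.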
This happens because the zip(*) call evaluates the outer generator, and the outer generator returns two more generators. Evaluating the outer generator moves c to the second dict: {'a': 2, 'b':2}. When these generators are then evaluated individually, they look up c, and since its value is now {'a': 2, 'b':2}, you get [(2, 2), (2, 2)] as the output.
Demo:
Output:
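A sketch of how this could be demonstrated, assuming a hypothetical helper make_outer that rebuilds the nested generator from the question. It compares exhausting the outer generator first (which is what zip(*...) does) with consuming each inner generator before the outer one advances:

```python
all_configs = [{'a': 1, 'b': 3}, {'a': 2, 'b': 2}]
unique_keys = ['a', 'b']


def make_outer():
    """Hypothetical helper: builds the generator-of-generators from the question."""
    return ((c[k] for k in unique_keys) for c in all_configs)


# Exhaust the outer generator first, then read the inner generators.
# By then c already refers to the last dict.
inners = list(make_outer())
late = [tuple(inner) for inner in inners]

# Read each inner generator while the outer one is still paused on
# the matching c.
interleaved = [tuple(inner) for inner in make_outer()]

print(late)         # [(2, 2), (2, 2)]
print(interleaved)  # [(1, 3), (2, 2)]
```

These are rows per config rather than the zipped columns, but they show that the inner generators only go wrong once the outer loop has moved on.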
The list comprehension, on the other hand, is evaluated right away and can fetch the current value of c, not its last value.
How to force it to use the correct value of c? Use an inner function together with a generator function. The inner function can help us remember c's value using a default argument.
In a list comprehension, expressions are evaluated eagerly. In a generator expression, they are only looked up as needed.
Thus, as the generator expression iterates over for c in all_configs, it refers to c[k] but only looks up c after the loop is done, so it uses the latest value for both tuples. By contrast, the list comprehension is evaluated immediately, so it creates a tuple with the first value of c and another tuple with the second value of c.
Consider this small example:
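A minimal sketch of such an example, using the names a, b and i from the explanation that follows. The exact statements and the helper name b_values are assumptions:

```python
i = 1
a = [i for _ in range(3)]  # list comprehension: i is looked up right now
b = (i for _ in range(3))  # generator expression: nothing is looked up yet
i = 2

print(a)             # [1, 1, 1]
b_values = list(b)   # iterating b is what finally looks up i
print(b_values)      # [2, 2, 2]
```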
When creating a, the interpreter created that list immediately, looking up the value of i as soon as it was evaluated. When creating b, the interpreter just set up the generator and didn't actually iterate over it or look up the value of i. The print calls told the interpreter to evaluate those objects. a already existed as a full list in memory with the old value of i, but b was only evaluated at that point, and when it looked up the value of i, it found the new value.
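The inner-function and default-argument suggestion made earlier could be sketched roughly as follows. The function names rows and inner are hypothetical and not taken from the original answer:

```python
all_configs = [{'a': 1, 'b': 3}, {'a': 2, 'b': 2}]
unique_keys = ['a', 'b']


def rows():
    """Generator function yielding one lazy inner generator per config."""
    for c in all_configs:
        # The default argument is evaluated when inner is defined, so
        # each inner function remembers the c of its own iteration.
        def inner(c=c):
            return (c[k] for k in unique_keys)
        yield inner()


result = list(zip(*rows()))
print(result)  # [(1, 2), (3, 2)]
```

This keeps the inner generators lazy while giving each of them its own c, so the result matches the list comprehension version.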