If I do something with list comprehensions, it writes to a local variable:
i = 0
test = any([i == 2 for i in xrange(10)])
print i
This prints "9". However, if I use a generator, it doesn't write to a local variable:
i = 0
test = any(i == 2 for i in xrange(10))
print i
This prints "0".
Is there any good reason for this difference? Is this a design decision, or just a random byproduct of the way that generators and list comprehensions are implemented? Personally, it would seem better to me if list comprehensions didn't write to local variables.
One of the subtle consequences of the dirty secret described by poke above is that list(...) and [...] do not have the same side effects in Python 2: a generator expression inside the list constructor has no side effect, but a direct list comprehension does.
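A minimal sketch of that difference, written so it runs on both Python 2 and Python 3. The function name leak_check and the starting value 0 are made up for illustration; the second element of the result is where the two versions should disagree.

```python
def leak_check():
    """Return the value of i after each construct has run."""
    i = 0
    # Generator expression passed to the list constructor:
    # it runs in its own scope, so i is left alone.
    from_generator = list(i for i in range(3))
    after_generator = i
    # Direct list comprehension: on Python 2 the loop variable
    # is written into the enclosing scope; on Python 3 it is not.
    from_comprehension = [i for i in range(3)]
    after_comprehension = i
    return after_generator, after_comprehension


print(leak_check())
```

On Python 2 this should print (0, 2); on Python 3 it prints (0, 0).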
As a by-product of wondering how list comprehensions are actually implemented, I found a good answer to your question.
In Python 2, take a look at the byte-code generated for a simple list comprehension:
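One way to produce such a listing yourself is sketched below. The source string and the <demo> filename are arbitrary choices for illustration, and the opcodes printed depend on the interpreter version.

```python
import dis

# Compile a one-line list comprehension as module-level code.
code = compile("a = [i for i in [1, 2, 3]]", "<demo>", "exec")

# Print the disassembly of that code object.
dis.dis(code)
```

On CPython 2.7 the whole loop, including the store into i, should appear in this single listing.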
It essentially translates to a simple for-loop; the comprehension is just syntactic sugar for it. As a result, the same semantics as for for-loops apply.
In the list-comprehension case, (C)Python uses a "hidden list name" and a special instruction, LIST_APPEND, to handle creation, but really does nothing more than that.
So your question should generalize to why Python writes to the loop variable in for-loops; that is nicely answered by a blog post from Eli Bendersky.
Python 3, as mentioned by others, has changed the list-comprehension semantics to better match those of generators (by creating a separate code object for the comprehension), so a comprehension is essentially syntactic sugar for the following:
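A rough sketch of both translations side by side, runnable on Python 2 and 3. The wrapper names py2_style and py3_style are invented for illustration, and __f stands in for the hidden function; the real compiler-generated code is anonymous and differs in detail.

```python
def py2_style():
    """Python 2 semantics: the loop runs directly in the current scope."""
    i = 0
    result = []
    for i in [1, 2, 3]:
        result.append(i)
    return result, i  # i was overwritten by the loop


def py3_style():
    """Python 3 semantics: the loop runs inside a hidden helper function."""
    i = 0

    def __f(iterable):  # hypothetical stand-in for the hidden code object
        result = []
        for i in iterable:  # this i is local to __f
            result.append(i)
        return result

    a = __f(iter([1, 2, 3]))
    return a, i  # the outer i is untouched


print(py2_style())  # ([1, 2, 3], 3)
print(py3_style())  # ([1, 2, 3], 0)
```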
This won't leak because it doesn't run in the uppermost scope as the Python 2 equivalent does. The variable i is leaked only inside __f and is then destroyed as a local variable of that function.
If you want, take a look at the byte-code generated for Python 3 by running dis('a = [i for i in [1, 2, 3]]'). You'll see how a "hidden" code object is loaded and then a function call is made at the end.
Because... because.
No, really, that's it. Quirk of the implementation. And arguably a bug, since it's fixed in Python 3.
You are correct. This is fixed in Python 3.x. The behavior is unchanged in 2.x so that it doesn't impact existing code that (ab)uses this hole.
As PEP 289 (Generator Expressions) explains:
It appears to have been done for implementation reasons.
PEP 289 clarifies this as well:
In other words, the behaviour you describe indeed differs in Python 2 but it has been fixed in Python 3.
Python’s creator, Guido van Rossum, mentioned this when he wrote about generator expressions, which were uniformly built into Python 3: (emphasis mine)
So in Python 3 you won’t see this happen anymore.
Interestingly, dict comprehensions in Python 2 don’t do this either; this is mostly because dict comprehensions were backported from Python 3 and as such already had that fix in them.
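A quick check of that claim, as a sketch that should behave the same on Python 2.7 and Python 3. The function name and the sentinel string are made up for illustration.

```python
def dict_comprehension_check():
    """Return the dict and the value of k after the comprehension ran."""
    k = "untouched"
    # The loop variable k of the dict comprehension lives in its own scope.
    squares = {k: k * k for k in range(3)}
    return squares, k


print(dict_comprehension_check())
```

If the outer k still holds the sentinel string afterwards, the comprehension did not write to the enclosing scope.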
There are some other questions that cover this topic too, but I’m sure you have already seen those when you searched for the topic, right? ;)