In Python 3, is a list comprehension simply syntactic sugar for a generator expression fed into the list function?
e.g. is the following code:
squares = [x**2 for x in range(1000)]
actually converted in the background into the following?
squares = list(x**2 for x in range(1000))
I know the output is identical, and Python 3 fixes the surprising side effects on surrounding namespaces that list comprehensions had, but in terms of what the CPython interpreter does under the hood, is the former converted to the latter, or is there any difference in how the code gets executed?
Background
I found this claim of equivalence in the comments section to this question, and a quick google search showed the same claim being made here.
There was also some mention of this in the What's New in Python 3.0 docs, but the wording is somewhat vague:
Also note that list comprehensions have different semantics: they are closer to syntactic sugar for a generator expression inside a list() constructor, and in particular the loop control variables are no longer leaked into the surrounding scope.
You can actually show that the two can have different outcomes to prove they are inherently different:
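A minimal sketch of one way to see this, assuming a helper called stop (a name made up for this illustration) that raises StopIteration from inside the element expression. On Python 3.7 and later, PEP 479 turns a StopIteration escaping a generator expression into a RuntimeError, while earlier 3.x versions let list() treat it as the end of iteration and return a shortened list. In both cases the comprehension simply lets the StopIteration propagate, so the two forms disagree.

```python
def stop(x):
    # Hypothetical helper: raises StopIteration from inside the element expression.
    raise StopIteration


def outcome(build):
    """Return the list built by build(), or the name of the exception it raised."""
    try:
        return build()
    except (StopIteration, RuntimeError) as exc:
        return type(exc).__name__


# Same element expression, written both ways.
comp_outcome = outcome(lambda: [stop(x) for x in range(3)])
genexp_outcome = outcome(lambda: list(stop(x) for x in range(3)))

print("comprehension:", comp_outcome)
print("list(genexp): ", genexp_outcome)
```

The exact result of the list(...) line depends on the interpreter version, so no expected output is given here; the point is only that the two outcomes differ.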
The expression inside the comprehension is not treated as a generator, since the comprehension does not handle the StopIteration, whereas the list constructor does.

Both forms create and call an anonymous function. However, the list(...) form creates a generator function and passes the returned generator-iterator to list, while with the [...] form, the anonymous function builds the list directly with LIST_APPEND opcodes.

The following code gets decompilation output of the anonymous functions for an example comprehension and its corresponding genexp-passed-to-list:

The output for the comprehension is

The output for the genexp is
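A minimal sketch of how such decompilation output can be obtained with the standard dis module; the helper name disassemble is made up here, and this is not necessarily the code the answer originally used. It assumes CPython 3.7 or later, where dis.dis also descends into nested code objects. The listing varies between CPython versions (from 3.12 on, PEP 709 inlines list comprehensions, so LIST_APPEND shows up in the enclosing code rather than in a separate anonymous function), so no output is reproduced.

```python
import dis
import io


def disassemble(source):
    """Compile an expression and return its disassembly as text.

    On CPython 3.7+ dis.dis also prints nested code objects, such as the
    anonymous function created for a generator expression.
    """
    buf = io.StringIO()
    dis.dis(compile(source, "<example>", "eval"), file=buf)
    return buf.getvalue()


comp_dis = disassemble("[x**2 for x in range(1000)]")
genexp_dis = disassemble("list(x**2 for x in range(1000))")

print(comp_dis)
print(genexp_dis)
```

Searching the two listings for LIST_APPEND and YIELD_VALUE is a quick way to tell which form builds the list directly and which one goes through a generator.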
The two work differently. The list comprehension version takes advantage of the special bytecode LIST_APPEND, which calls PyList_Append directly for us. Hence it avoids the attribute lookup of list.append and a function call at the Python level.

On the other hand, the list() version simply passes the generator object to list's __init__ method, which then calls its extend method internally. As the object is not a list or tuple, CPython gets its iterator first and then simply adds the items to the list until the iterator is exhausted:

Timing comparisons:
Plain loops are slightly slower because of the repeated attribute lookup for append. Cache the bound method and time again.
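A rough sketch of such a comparison using the standard timeit module. The function names, the input size N and the repeat counts are arbitrary choices made for this illustration, and the numbers will differ between machines and CPython versions, so none are quoted.

```python
import timeit

N = 1000  # arbitrary input size for this sketch


def with_comprehension():
    return [x**2 for x in range(N)]


def with_list_genexp():
    return list(x**2 for x in range(N))


def with_loop():
    result = []
    for x in range(N):
        result.append(x**2)  # looks up result.append on every iteration
    return result


def with_cached_append():
    result = []
    append = result.append  # look the bound method up once
    for x in range(N):
        append(x**2)
    return result


candidates = [with_comprehension, with_list_genexp, with_loop, with_cached_append]

# Best of 3 runs, 200 calls per run, for each variant.
timings = {f.__name__: min(timeit.repeat(f, number=200, repeat=3)) for f in candidates}

for name, seconds in timings.items():
    print(f"{name:20s} {seconds:.4f} s")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; all four functions return the same list, so only the construction cost differs.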
Apart from the fact that list comprehensions don't leak their variables anymore, one more difference is that something like this is not valid anymore:
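As an assumption about the kind of construct meant here, one example that fits the description is iterating over an unparenthesized tuple: Python 2 accepted it in a list comprehension, while Python 3 rejects it at compile time. The sketch below checks this with compile() so that it runs as ordinary code; the helper name compiles is made up for the illustration.

```python
def compiles(source):
    """Return True if source is a syntactically valid expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


bare_tuple = "[x**2 for x in 1, 2, 3]"        # accepted by Python 2 only
parenthesized = "[x**2 for x in (1, 2, 3)]"   # valid in both

print("bare tuple:   ", compiles(bare_tuple))      # False on Python 3
print("parenthesized:", compiles(parenthesized))   # True
```

Wrapping the tuple in parentheses is enough to make the comprehension valid again.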
They aren't the same: list() evaluates whatever is given to it after what is in the parentheses has finished executing, not before.

The [] in Python is a bit magical: it tells Python to wrap whatever is inside it as a list, more like a type hint for the language.