There are, as far as I know, three ways to create a generator through a comprehension[1].
The classical one:
def f1():
    g = (i for i in range(10))
The yield variant:
def f2():
    g = [(yield i) for i in range(10)]
The yield from variant (which raises a SyntaxError outside of a function):
def f3():
    g = [(yield from range(10))]
The three variants lead to different bytecode, which is not really surprising. It would seem logical that the first one is the best, since it's a dedicated, straightforward syntax to create a generator through comprehension. However, it is not the one that produces the shortest bytecode.
Disassembled in Python 3.6
Classical generator comprehension
>>> dis.dis(f1)
4 0 LOAD_CONST 1 (<code object <genexpr> at...>)
2 LOAD_CONST 2 ('f1.<locals>.<genexpr>')
4 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (range)
8 LOAD_CONST 3 (10)
10 CALL_FUNCTION 1
12 GET_ITER
14 CALL_FUNCTION 1
16 STORE_FAST 0 (g)
5 18 LOAD_FAST 0 (g)
20 RETURN_VALUE
yield variant
>>> dis.dis(f2)
8 0 LOAD_CONST 1 (<code object <listcomp> at...>)
2 LOAD_CONST 2 ('f2.<locals>.<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_GLOBAL 0 (range)
8 LOAD_CONST 3 (10)
10 CALL_FUNCTION 1
12 GET_ITER
14 CALL_FUNCTION 1
16 STORE_FAST 0 (g)
9 18 LOAD_FAST 0 (g)
20 RETURN_VALUE
yield from variant
>>> dis.dis(f3)
12 0 LOAD_GLOBAL 0 (range)
2 LOAD_CONST 1 (10)
4 CALL_FUNCTION 1
6 GET_YIELD_FROM_ITER
8 LOAD_CONST 0 (None)
10 YIELD_FROM
12 BUILD_LIST 1
14 STORE_FAST 0 (g)
13 16 LOAD_FAST 0 (g)
18 RETURN_VALUE
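The listings above can be reproduced with something like the following (a minimal sketch of my own; the exact opcodes shown are from Python 3.6 and differ on newer CPython versions):

```python
import dis

def f1():
    g = (i for i in range(10))
    return g

# Print the bytecode of f1; opcode names vary across CPython versions.
dis.dis(f1)
```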
In addition, a timeit comparison shows that the yield from variant is the fastest (still run with Python 3.6):
>>> timeit(f1)
0.5334039637357152
>>> timeit(f2)
0.5358906506760719
>>> timeit(f3)
0.19329123352712596
f3 is more or less 2.7 times as fast as f1 and f2.
As Leon mentioned in a comment, the efficiency of a generator is best measured by the speed at which it can be iterated over. So I changed the three functions so that they iterate over the generators and call a dummy function.
def f():
    pass

def fn():
    g = ...
    for _ in g:
        f()
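For concreteness, here is the harness with the g = ... placeholder filled in with the classical comprehension (my own instantiation; substitute the f2 and f3 bodies to reproduce the other timings, keeping in mind that the f2 body is a SyntaxError from Python 3.8 onward):

```python
from timeit import timeit

def f():
    pass

def fn():
    g = (i for i in range(10))  # the f1 variant of "g = ..."
    for _ in g:
        f()

# One timed run of the harness; absolute numbers depend on the machine.
print(timeit(fn, number=10_000))
```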
The results are even more striking:
>>> timeit(f1)
1.6017412817975778
>>> timeit(f2)
1.778684261368946
>>> timeit(f3)
0.1960603619517669
f3 is now 8.4 times as fast as f1, and 9.3 times as fast as f2.
Note: the results are more or less the same when the iterable is not range(10) but a static iterable, such as [0, 1, 2, 3, 4, 5]. Therefore, the difference in speed has nothing to do with range being somehow optimized.
So, what are the differences between the three ways? More specifically, what is the difference between the yield from variant and the other two? Is it normal behaviour that the natural construct (elt for elt in it) is slower than the tricky [(yield from it)]? Should I from now on replace the former with the latter in all of my scripts, or are there drawbacks to using the yield from construct?
Edit
This is all related, so I don't feel like opening a new question, but this is getting even stranger.
I tried comparing range(10) and [(yield from range(10))].
def f1():
    for i in range(10):
        print(i)

def f2():
    for i in [(yield from range(10))]:
        print(i)
>>> timeit(f1, number=100000)
26.715589237537195
>>> timeit(f2, number=100000)
0.019948781941049987
So now, iterating over [(yield from range(10))] is 186 times as fast as iterating over a bare range(10)? How do you explain that it is so much faster?
[1]: For the sceptical: the three snippets above do produce a generator object; try calling type on them.
This is what you should be doing:

g = (i for i in range(10))

It's a generator expression. It's equivalent to defining a generator function that yields each i in range(10) and calling it. But if you just wanted an iterable with the elements of range(10), you could have done

g = range(10)

You do not need to wrap any of this in a function.
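The equivalence can be sketched like this (a hand-written expansion of my own, not a snippet from the original post):

```python
# The generator expression...
g = (i for i in range(10))

# ...behaves like defining and calling a generator function:
def gen():
    for i in range(10):
        yield i

g2 = gen()

# Both are generators producing the same values,
# and range(10) alone is already an iterable of those values.
print(list(g) == list(g2) == list(range(10)))  # True
```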
If you're here to learn what code to write, you can stop reading. The rest of this post is a long and technical explanation of why the other code snippets are broken and should not be used, including an explanation of why your timings are broken too.
This:

g = [(yield i) for i in range(10)]

is a broken construct that should have been taken out years ago. Eight years after the problem was originally reported, the process to remove it is finally beginning. Don't do it.
While it's still in the language, on Python 3 it's equivalent to a generator function that yields each i and collects the values sent back into a list. List comprehensions are supposed to return lists, but because of the yield, this one doesn't. It acts kind of like a generator expression, and it yields the same things as your first snippet, but it builds an unnecessary list and attaches it to the StopIteration raised at the end. This is confusing and a waste of memory. Don't do it. (If you want to know where all those Nones are coming from, read PEP 342.)

On Python 2, g = [(yield i) for i in range(10)] does something entirely different. Python 2 doesn't give list comprehensions their own scope (specifically list comprehensions, not dict or set comprehensions), so the yield is executed by whatever function contains this line, making f a generator-based coroutine, in the pre-async sense. Again, if your goal was to get a generator, you've wasted a bunch of time building a pointless list.

This:

g = [(yield from range(10))]
is silly, but none of the blame is on Python this time. There is no comprehension or genexp here at all. The brackets are not a list comprehension; all the work is done by yield from, and then you build a 1-element list containing the (useless) return value of yield from. Your f3, when stripped of the unnecessary list-building, simplifies to

def f3():
    yield from range(10)

or, ignoring all the coroutine support stuff yield from does,

def f3():
    for i in range(10):
        yield i

Your timings are also broken.
In your first timing, f1 and f2 create generator objects that can be used inside those functions, though f2's generator is weird. f3 doesn't do that; f3 is a generator function. f3's body does not run in your timings, and if it did, its g would behave quite unlike the other functions' gs. A timing that would actually be comparable with f1 and f2 would be one whose body actually calls f3() to create a generator object.

In your second timing, f2 doesn't actually run, for the same reason f3 was broken in the previous timing: f2 is not iterating over a generator. Instead, the yield from turns f2
into a generator function itself.

This construct accumulates the data that is (or may be) passed back into the generator through its send() method, and returns it via the StopIteration exception raised when the iteration is exhausted[1]. No such thing happens with a plain generator comprehension, whose StopIteration carries no accumulated value.
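Since yield inside a comprehension is a SyntaxError from Python 3.8 onward, here is a hand-written generator with the same accumulate-and-return behaviour (my own illustration of the point above), showing the values passed to send() coming back on StopIteration:

```python
def acc(it):
    # Hand-written stand-in for [(yield x) for x in it] on old Python 3:
    # yields each item and accumulates whatever send() passes back.
    result = []
    for x in it:
        result.append((yield x))
    return result  # travels on StopIteration.value

g = acc(range(3))
print(next(g))       # 0: runs to the first yield
print(g.send('a'))   # 1: 'a' is stored in the list
print(g.send('b'))   # 2
try:
    g.send('c')
except StopIteration as e:
    print(e.value)   # ['a', 'b', 'c']
```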
As for the yield from version: in Python 3.5 (which I am using) it doesn't work outside functions, so the illustration is a little different. send() doesn't work for a generator yielding from range() (the range iterator has no send() method to delegate to), but we can at least look at what's at the end of the iteration: a plain StopIteration, with no accumulated values attached.

[1] Note that even if you don't use the send() method, send(None) is assumed; therefore a generator constructed in this way always uses more memory than a plain generator comprehension, since it has to accumulate the results of the yield expressions until the end of the iteration.

UPDATE
Regarding the performance differences between the three variants.
yield from beats the other two because it eliminates a level of indirection (which, to the best of my understanding, is one of the two main reasons why yield from was introduced). However, in this particular example yield from itself is superfluous: g = [(yield from range(10))] is actually almost identical to g = range(10).

This might not do what you think it does.
Call it: nothing is printed, and you get a generator object back. Because the yield from is not within a comprehension, it is bound to the f2 function instead of an implicit inner function, turning f2 into a generator function.

I remembered seeing someone point out that it was not actually iterating, but I can't remember where I saw that. I was testing the code myself when I rediscovered this. I didn't find the source searching through the mailing list post or the bug tracker thread. If someone finds the source, please tell me or add it to the post itself, so it can be credited.
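To illustrate the point (my own check, not part of the original answer): calling the f2 from the edit prints nothing and hands back a generator; only iterating it runs the body.

```python
def f2():
    # yield from binds to f2 itself, so f2 is a generator function
    for i in [(yield from range(10))]:
        print(i)

g = f2()                    # nothing printed: the body has not started
print(type(g).__name__)     # generator

# Iterating finally runs the body: the numbers 0-9 are yielded out,
# then the loop prints the single None that yield from returned.
print(list(g) == list(range(10)))  # True (and "None" is printed)
```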