Difference between generator expression and genera

Posted 2020-05-27 11:09

Is there any difference — performance or otherwise — between generator expressions and generator functions?

In [1]: def f():
   ...:     yield from range(4)
   ...:

In [2]: def g():
   ...:     return (i for i in range(4))
   ...:

In [3]: f()
Out[3]: <generator object f at 0x109902550>

In [4]: list(f())
Out[4]: [0, 1, 2, 3]

In [5]: list(g())
Out[5]: [0, 1, 2, 3]

In [6]: g()
Out[6]: <generator object <genexpr> at 0x1099056e0>

I'm asking because I want a rule for choosing between the two. Sometimes a generator function is clearer, and then the choice is easy. I am asking about the cases where code clarity does not make one option obviously better.

2 answers
你好瞎i
#2 · 2020-05-27 11:37

In addition to @Bakuriu's good point about how send(), throw(), and close() are handled, there is another difference I've run into. Sometimes you have setup code that runs before the first yield statement is reached. If that setup code can raise exceptions, then a regular function that returns a generator expression might be preferable to a generator function, because it raises the exception when it is called rather than when the result is first iterated. E.g.,

def f(x):
    if x < 0:
        raise ValueError
    for i in range(x):
        yield i * i

def g(x):
    if x < 0:
        raise ValueError
    return (i * i for i in range(x))

print(list(f(4)))
print(list(g(4)))
f(-1)  # no exception yet: the body runs only when the iterator is consumed
g(-1)  # raises ValueError immediately, at call time
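
The timing difference can be made visible without crashing the script. The following is a minimal sketch; the names lazy_squares, eager_squares, and where_it_fails are made up for this illustration and are not part of the answer above. It reports whether the ValueError surfaces at call time or only on the first next().

```python
def lazy_squares(x):
    # Generator function: none of the body runs until the first next().
    if x < 0:
        raise ValueError("x must be non-negative")
    for i in range(x):
        yield i * i


def eager_squares(x):
    # Ordinary function returning a generator expression:
    # the check runs as soon as the function is called.
    if x < 0:
        raise ValueError("x must be non-negative")
    return (i * i for i in range(x))


def where_it_fails(factory, arg):
    """Return 'call', 'first next()', or 'never' depending on when ValueError appears."""
    try:
        it = factory(arg)
    except ValueError:
        return "call"
    try:
        next(it)
    except ValueError:
        return "first next()"
    except StopIteration:
        pass
    return "never"


print(where_it_fails(lazy_squares, -1))
print(where_it_fails(eager_squares, -1))
```

The generator function defers the check because calling it only creates the generator object; the wrapper-style function runs the check as ordinary code.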

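If you want both behaviors (a generator function that supports send(), plus validation that fails at call time), I think wrapping the generator function in a checking function is best: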

def f(count):
    x = 0
    for i in range(count):
        x = yield i + (x or 0)

def protected_f(count):
    if count < 0:
        raise ValueError
    return f(count)

it = protected_f(10)
try:
    print(next(it))
    x = 0
    while True:
        x = it.send(x)
        print(x)
except StopIteration:
    pass

it = protected_f(-1)  # raises ValueError immediately, before any iteration
够拽才男人
#3 · 2020-05-27 12:01

The functions you provided have completely different semantics in the general case.

The first one, with yield from, delegates to the iterable. This means that calls to send() and throw() during the iteration are forwarded to the iterable and handled there, not by the function you are defining.

The second function only iterates over the elements of the iterable, so calls to send() and throw() are handled by the generator expression itself and never reach the underlying iterable. To see the difference, check this code:

In [8]: def action():
   ...:     try:
   ...:         for el in range(4):
   ...:             yield el
   ...:     except ValueError:
   ...:         yield -1
   ...:         

In [9]: def f():
   ...:     yield from action()
   ...:     

In [10]: def g():
    ...:     return (el for el in action())
    ...: 

In [11]: x = f()

In [12]: next(x)
Out[12]: 0

In [13]: x.throw(ValueError())
Out[13]: -1

In [14]: next(x)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-14-5e4e57af3a97> in <module>()
----> 1 next(x)

StopIteration: 

In [15]: x = g()

In [16]: next(x)
Out[16]: 0

In [17]: x.throw(ValueError())
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-1006c792356f> in <module>()
----> 1 x.throw(ValueError())

<ipython-input-10-f156e9011f2f> in <genexpr>(.0)
      1 def g():
----> 2     return (el for el in action())
      3 

ValueError: 
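
The same split shows up with send(). Here is a small sketch under the same idea; echo, via_yield_from, via_genexp, and probe are illustrative names, not from the session above. With yield from the sent value reaches the inner generator, while the genexp discards it and simply asks the inner generator for its next item.

```python
def echo():
    # Inner generator: yields back whatever was last sent to it
    # (None when it is advanced with plain next()).
    received = None
    while True:
        received = yield received


def via_yield_from():
    # send() on this generator is forwarded to echo().
    yield from echo()


def via_genexp():
    # The genexp only calls next() on echo(); a value sent to the
    # genexp is dropped and never reaches echo().
    return (value for value in echo())


def probe(gen):
    next(gen)              # prime the generator: run to the first yield
    return gen.send("hi")  # what comes back after sending a value?


print(probe(via_yield_from()))
print(probe(via_genexp()))
```

The first call returns the sent string because echo() received it; the second returns None because echo() was advanced with an implicit next().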

In fact, because of this delegation, yield from probably has a higher overhead than the genexp, although the difference is probably irrelevant in practice.
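
If the overhead matters for your case, it is easy to measure rather than guess. The following is a rough sketch using timeit; the function names and the range size of 1000 are arbitrary choices for illustration, and the numbers depend heavily on the Python version and machine, so no expected output is shown.

```python
import timeit


def with_yield_from():
    yield from range(1000)


def with_genexp():
    return (i for i in range(1000))


def with_loop():
    for i in range(1000):
        yield i


def measure(repeat=5, number=200):
    """Return the best time in seconds for fully consuming each variant."""
    candidates = {
        "yield from": with_yield_from,
        "genexp": with_genexp,
        "loop + yield": with_loop,
    }
    return {
        name: min(timeit.repeat(lambda fn=fn: list(fn()), repeat=repeat, number=number))
        for name, fn in candidates.items()
    }


for name, seconds in measure().items():
    print(f"{name:>12}: {seconds:.4f} s")
```

Taking the minimum over several repeats reduces noise from other processes; treat the result as a hint for your own interpreter, not a general ranking.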

Use yield from only when the above behaviour is what you want or if you are iterating over a simple iterable that is not a generator (so that yield from is equivalent to a loop + simple yields).

Stylistically speaking I'd prefer:

def h():
    for el in range(4):
        yield el

rather than returning a genexp or using yield from when dealing with generators.

In fact, the bytecode the generator expression uses to perform the iteration is almost identical to that of the above function:

In [21]: import dis

In [22]: dis.dis((i for i in range(4)).gi_code)
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                11 (to 17)
              6 STORE_FAST               1 (i)
              9 LOAD_FAST                1 (i)
             12 YIELD_VALUE
             13 POP_TOP
             14 JUMP_ABSOLUTE            3
        >>   17 LOAD_CONST               0 (None)
             20 RETURN_VALUE

As you can see, it does a FOR_ITER + YIELD_VALUE. Note that the argument (.0) is iter(range(4)). The bytecode of the function also contains the LOAD_GLOBAL and GET_ITER instructions that are required to look up range and obtain its iterator. However, these actions must be performed for the genexp too, just not inside its code but before calling it.
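
The comparison can be reproduced on your own interpreter. This is a sketch using dis.get_instructions; the helper opnames is a name invented here, and exact opcode names and offsets differ between Python versions, so the listing above will not match newer releases line for line.

```python
import dis


def h():
    for el in range(4):
        yield el


genexp = (i for i in range(4))


def opnames(code):
    # Collect just the instruction names of a code object.
    return [ins.opname for ins in dis.get_instructions(code)]


genexp_ops = opnames(genexp.gi_code)
h_ops = opnames(h.__code__)

# The genexp's code object takes the already-created iterator as its
# first local, which CPython names ".0".
print(genexp.gi_code.co_varnames)

# Both loop with FOR_ITER; only h() looks up range inside its own code.
print("FOR_ITER" in genexp_ops, "FOR_ITER" in h_ops)
print("LOAD_GLOBAL" in genexp_ops, "LOAD_GLOBAL" in h_ops)
```

The lookup of range for the genexp still happens, but in the enclosing scope at the point where the expression is created.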
