Yield from coroutine vs yield from task

Posted 2019-02-01 08:14

Question:

In his 2014 talk on Tulip/asyncio, Guido van Rossum shows this slide:

Tasks vs coroutines

  • Compare:

    • res = yield from some_coroutine(...)
    • res = yield from Task(some_coroutine(...))
  • Task can make progress without waiting for it

    • As long as you wait for something else
      • i.e. yield from

And I'm completely missing the point.

From my point of view both constructs are identical:

In the case of a bare coroutine: it gets scheduled, so a task is created anyway, because the scheduler operates on Tasks. The calling coroutine is then suspended until the callee is done, after which it is free to continue execution.

In the case of a Task: all the same. A new task is scheduled and the calling coroutine waits for its completion.

What is the difference in how the code is executed in the two cases, and what impact should a developer consider in practice?

P.S.
Links to authoritative sources (GvR, PEPs, docs, core devs' notes) would be much appreciated.

Answer 1:

From the calling coroutine's side, yield from coroutine() feels like a function call (i.e. the caller regains control only when coroutine() finishes).

Task(coroutine()), on the other hand, feels more like creating a new thread. Task() returns almost instantly, and the caller very likely regains control before coroutine() finishes. Applying yield from to the task then waits for it, much like joining the thread.

The difference between f() and th = threading.Thread(target=f, args=()); th.start(); th.join() is obvious, right?
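The analogy can be made visible by logging the order of execution. This is only a sketch, assuming Python 3.7+, where async def / await are the modern spelling of @asyncio.coroutine / yield from and asyncio.create_task() builds the Task; the names worker, call_style, task_style and events are hypothetical.

```python
import asyncio

events = []  # hypothetical log recording the order in which things happen


async def worker(name):
    events.append(name + " start")
    await asyncio.sleep(0)  # hand control back to the event loop once
    events.append(name + " end")


async def call_style():
    # Like a function call: the caller resumes only after worker() ends.
    await worker("call")
    events.append("caller resumed after call")


async def task_style():
    # Like starting a thread: create_task() returns immediately.
    task = asyncio.create_task(worker("task"))
    events.append("caller resumed before task ran")
    await task  # comparable to th.join()
    events.append("caller resumed after task")


asyncio.run(call_style())
asyncio.run(task_style())
print(events)
```

In the task variant the caller logs its line before the worker has even started, which is the "thread-like" part; the final await plays the role of join().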



Answer 2:

The point of using asyncio.Task(coro()) is for cases where you don't want to explicitly wait for coro, but you want coro to be executed in the background while you wait for other tasks. That is what Guido's slide means by

[A] Task can make progress without waiting for it...as long as you wait for something else

Consider this example:

import asyncio

@asyncio.coroutine
def test1():
    print("in test1")


@asyncio.coroutine
def dummy():
    yield from asyncio.sleep(1)
    print("dummy ran")


@asyncio.coroutine
def main():
    test1()
    yield from dummy()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Output:

dummy ran

As you can see, test1 was never actually executed, because we didn't explicitly call yield from on it.
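The same effect can be reproduced on current Python versions, where @asyncio.coroutine no longer exists. The following is a sketch assuming Python 3.7+ and async def / await in place of the decorator and yield from; the list ran is a hypothetical stand-in for the print call.

```python
import asyncio

ran = []  # hypothetical record of which coroutine bodies actually executed


async def test1():
    ran.append("test1")


async def main():
    coro = test1()       # only creates a coroutine object; the body has not run
    before = list(ran)   # snapshot taken before anything waits on coro
    await coro           # the body executes only now
    return before, list(ran)


before, after = asyncio.run(main())
print(before, after)
```

The snapshot taken right after the call is still empty, so calling a coroutine function by itself does no work; something has to drive the resulting object.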

Now, if we use asyncio.async to wrap a Task instance around test1, the result is different:

import asyncio

@asyncio.coroutine
def test1():
    print("in test1")


@asyncio.coroutine
def dummy():
    yield from asyncio.sleep(1)
    print("dummy ran")


@asyncio.coroutine
def main():
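    # asyncio.async() was renamed asyncio.ensure_future() in Python 3.4.4;
    # the old name is a syntax error from Python 3.7 on.
    asyncio.ensure_future(test1())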
    yield from dummy()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Output:

in test1
dummy ran

So there is really no practical reason to use yield from asyncio.async(coro()): it is slower than yield from coro() and brings no benefit. It adds the overhead of registering coro with asyncio's internal scheduler, which is unnecessary because yield from already guarantees that coro will execute. If you just want to call a coroutine and wait for it to finish, yield from the coroutine directly.
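One way to see the extra scheduling step is to ask which Task is driving a coroutine. This is a sketch under the assumption of Python 3.7+ (async def / await, asyncio.current_task()); which_task is a hypothetical helper.

```python
import asyncio


async def which_task():
    # Report the Task object that is currently driving this coroutine.
    return asyncio.current_task()


async def main():
    caller = asyncio.current_task()

    # Awaited directly: the coroutine runs inside the caller's own task.
    direct = await which_task()

    # Wrapped first: the event loop gets a separate Task to schedule.
    wrapped = await asyncio.ensure_future(which_task())

    return direct is caller, wrapped is caller


same_when_direct, same_when_wrapped = asyncio.run(main())
print(same_when_direct, same_when_wrapped)  # True False
```

The directly awaited coroutine never gets a task of its own, which also bears on the question's assumption that a task is created in both cases.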

Side note:

I'm using asyncio.async* instead of Task directly because the docs recommend it:

Don’t directly create Task instances: use the async() function or the BaseEventLoop.create_task() method.

* Note that as of Python 3.4.4, asyncio.async is deprecated in favor of asyncio.ensure_future.
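For readers on current Python versions, here is a small sketch of how the two recommended entry points behave, assuming Python 3.7+ where asyncio.create_task() and asyncio.run() are available; job is a hypothetical coroutine.

```python
import asyncio


async def job():
    return "done"


async def main():
    # A coroutine object gets wrapped in a new Task...
    task = asyncio.ensure_future(job())
    # ...while an existing Task (or Future) is returned unchanged.
    same = asyncio.ensure_future(task)
    # create_task() accepts only coroutines and always returns a new Task.
    other = asyncio.create_task(job())
    results = await asyncio.gather(task, other)
    return isinstance(task, asyncio.Task), same is task, results


is_task, same_object, results = asyncio.run(main())
print(is_task, same_object, results)
```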



Answer 3:

As described in PEP 380, the accepted PEP that introduced yield from, the statement yield from f() is, in the simplest case, equivalent to the following loop:

for v in f():
    yield v

In addition, the value of the expression in res = yield from f() is whatever f() returns when it finishes.

With this, things become clear: if f() is some_coroutine(), the coroutine is executed directly by the caller. If f() is Task(some_coroutine()), Task.__init__ is executed instead. The body of some_coroutine() is not run at that point; the newly created generator object is only passed as the first argument to Task.__init__, which schedules it to run on the event loop.

Conclusion:

  • res = yield from some_coroutine() => the coroutine runs inside the caller, and res is its return value.
  • res = yield from Task(some_coroutine()) => a new task is created that holds the not-yet-started some_coroutine() generator object; the event loop runs it, and res is the task's result.
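The PEP 380 semantics can be checked with plain generators, without asyncio at all. A minimal sketch; inner, outer and log are hypothetical names.

```python
def inner():
    yield 1
    yield 2
    return "inner result"  # becomes the value of the `yield from` expression


def outer(log):
    res = yield from inner()  # forwards 1 and 2 to whoever iterates outer()
    log.append(res)
    yield 3


log = []
values = list(outer(log))
print(values, log)  # [1, 2, 3] ['inner result']
```

The values yielded by inner() pass straight through outer(), while the return value of inner() is what ends up in res.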