Let's say we have a dummy function:
async def foo(arg):
    result = await some_remote_call(arg)
    return result.upper()
What's the difference between:
from asyncio import get_event_loop, wait

coros = []
for i in range(5):
    coros.append(foo(i))
loop = get_event_loop()
loop.run_until_complete(wait(coros))
And:
from asyncio import ensure_future, get_event_loop, wait

futures = []
for i in range(5):
    futures.append(ensure_future(foo(i)))
loop = get_event_loop()
loop.run_until_complete(wait(futures))
Note: The example returns a result, but this isn't the focus of the question. When the return value matters, use gather() instead of wait().
Regardless of return value, I'm looking for clarity on ensure_future(). wait(coros) and wait(futures) both run the coroutines, so when and why should a coroutine be wrapped in ensure_future?
Basically, what's the Right Way (tm) to run a bunch of non-blocking operations using Python 3.5's async?
For extra credit, what if I want to batch the calls? For example, I need to call some_remote_call(...) 1000 times, but I don't want to crush the web server/database/etc with 1000 simultaneous connections. This is doable with a thread or process pool, but is there a way to do this with asyncio?
A coroutine is a generator function that can both yield values and accept values from the outside. The benefit of using a coroutine is that we can pause the execution of a function and resume it later. In case of a network operation, it makes sense to pause the execution of a function while we're waiting for the response. We can use the time to run some other functions.
A future is like the Promise objects from JavaScript. It is a placeholder for a value that will be materialized in the future. In the above-mentioned case, while waiting on network I/O, a function can give us a container, a promise that it will fill the container with the value when the operation completes. We hold on to the future object, and when it's fulfilled, we can call a method on it to retrieve the actual result.
Direct Answer: You don't need ensure_future if you don't need the results. It is useful when you need the results or want to retrieve any exceptions that occurred.
Extra Credits: I would choose run_in_executor and pass an Executor instance to control the maximum number of workers.
Explanations and Sample codes
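As a sample for the extra-credit suggestion, here is a minimal sketch of run_in_executor with a bounded thread pool. It assumes Python 3.7+ (for asyncio.run); blocking_call is a hypothetical stand-in for a blocking remote call, and the pool size of 3 is arbitrary. The peak counter exists only to show that no more than MAX_WORKERS calls run at once.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 3          # arbitrary cap chosen for this sketch
_lock = threading.Lock()
_active = 0              # calls currently running in worker threads
peak = 0                 # highest number of simultaneous calls observed


def blocking_call(arg):
    """Hypothetical blocking call; sleeps instead of doing real I/O."""
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.01)
    with _lock:
        _active -= 1
    return str(arg).upper()


async def main():
    loop = asyncio.get_running_loop()
    # The executor's max_workers is what limits the concurrency.
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        futures = [loop.run_in_executor(executor, blocking_call, i)
                   for i in range(10)]
        # gather preserves the order of the submitted calls.
        return await asyncio.gather(*futures)


results = asyncio.run(main())
print(results, "peak concurrency:", peak)
```

This approach suits blocking functions; for calls that are already coroutines, a thread pool is not needed to limit concurrency.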
In the first example, you are using coroutines. The wait function takes a bunch of coroutines and combines them together. So wait() finishes when all the coroutines are exhausted (completed/finished returning all the values).
The run_until_complete method makes sure that the loop stays alive until the execution is finished. Please notice that you are not getting the results of the async execution in this case.
In the second example, you are using the ensure_future function to wrap a coroutine and return a Task object, which is a kind of Future. The coroutine is scheduled to be executed in the main event loop when you call ensure_future. The returned future/task object doesn't yet have a value, but over time, when the network operations finish, the future object will hold the result of the operation.
So in this example, we're doing the same thing except we're using futures instead of just using coroutines.
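The difference can be sketched as follows: holding on to the Task objects returned by ensure_future lets us read each result after wait() finishes. This assumes Python 3.7+ (for asyncio.run), and foo here is a simplified stand-in whose "remote call" is just a sleep. As far as I know, Python 3.11 and later no longer accept bare coroutines in asyncio.wait, so wrapping them first is also the portable form.

```python
import asyncio


async def foo(arg):
    # Stand-in for `await some_remote_call(arg)`; yields to the loop once.
    await asyncio.sleep(0)
    return str(arg).upper()


async def main():
    # ensure_future wraps each coroutine in a Task and schedules it.
    tasks = [asyncio.ensure_future(foo(c)) for c in "abc"]
    all_tasks = all(isinstance(t, asyncio.Task) for t in tasks)
    done, pending = await asyncio.wait(tasks)
    # Because we kept the Task objects, the results are retrievable.
    return [t.result() for t in tasks], pending, all_tasks


results, pending, all_tasks = asyncio.run(main())
print(results)  # ['A', 'B', 'C']
```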
Let's look at an example of how to use asyncio/coroutines/futures:
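A minimal sketch of such an example, assuming Python 3.7+ (for asyncio.run and get_running_loop). The names slow_operation and got_result are hypothetical, and the sleep stands in for real network I/O.

```python
import asyncio

callback_results = []  # filled in by the done-callback


async def slow_operation():
    """Hypothetical coroutine; the sleep stands in for network I/O."""
    await asyncio.sleep(0.01)
    return "done"


def got_result(task):
    # Called by the event loop once the task has finished.
    callback_results.append(task.result())


async def main():
    loop = asyncio.get_running_loop()
    task = loop.create_task(slow_operation())   # schedule on this loop
    task.add_done_callback(got_result)          # register a callback
    await task
    # Methods for inspecting how the task ended.
    return task.done(), task.cancelled(), task.exception()


state = asyncio.run(main())
print(state, callback_results)
```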
Here, we have used the create_task method on the loop object. ensure_future would schedule the task in the main event loop, whereas create_task lets us schedule a coroutine on a loop we choose.
We also see the concept of adding a callback using the add_done_callback method on the task object.
A Task is done when the coroutine returns a value, raises an exception, or gets cancelled. There are methods to check for each of these outcomes (done(), cancelled(), and exception()).
You can find more details in the official manual: https://docs.python.org/3/library/asyncio.html
From the BDFL [2013]
Tasks
With this in mind, ensure_future makes sense as a name for creating a Task, since the Future's result will be computed whether or not you await it (as long as you await something). This allows the event loop to complete your Task while you're waiting on other things. Note that in Python 3.7 create_task is the preferred way to ensure a future.
Note: I changed "yield from" in Guido's slides to "await" here for modernity.
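The point that a Task makes progress without being awaited directly can be sketched like this. It assumes Python 3.7+ (for asyncio.run); compute is a hypothetical coroutine and the sleep durations are arbitrary.

```python
import asyncio


async def compute():
    """Hypothetical coroutine that yields to the loop once, then returns."""
    await asyncio.sleep(0)
    return 42


async def main():
    task = asyncio.ensure_future(compute())  # scheduled, never awaited below
    done_before = task.done()                # nothing has run yet
    await asyncio.sleep(0.01)                # we await something else
    # The loop completed the task while main was waiting on the sleep.
    return done_before, task.done(), task.result()


outcome = asyncio.run(main())
print(outcome)  # (False, True, 42)
```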
A comment by Vincent linked to https://github.com/python/asyncio/blob/master/asyncio/tasks.py#L346, which shows that wait() wraps the coroutines in ensure_future() for you!
In other words, we do need a future, and coroutines will be silently transformed into them.
I'll update this answer when I find a definitive explanation of how to batch coroutines/futures.
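One commonly used way to cap concurrency, offered as a sketch rather than a definitive answer, is to guard each call with asyncio.Semaphore. It assumes Python 3.7+ (for asyncio.run). Here some_remote_call is a hypothetical stand-in that just sleeps, limited_call is a helper invented for this sketch, and the limit of 10 is arbitrary.

```python
import asyncio

MAX_CONCURRENT = 10   # arbitrary limit chosen for this sketch
active = 0            # calls currently inside the semaphore
peak = 0              # highest number of simultaneous calls observed


async def some_remote_call(arg):
    """Hypothetical remote call; sleeps instead of doing real I/O."""
    await asyncio.sleep(0.001)
    return str(arg)


async def limited_call(sem, arg):
    global active, peak
    async with sem:               # at most MAX_CONCURRENT holders at a time
        active += 1
        peak = max(peak, active)
        try:
            return await some_remote_call(arg)
        finally:
            active -= 1


async def main():
    # Create the semaphore inside the running loop.
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    return await asyncio.gather(*(limited_call(sem, i) for i in range(100)))


results = asyncio.run(main())
print(len(results), "calls, peak concurrency:", peak)
```

All 100 coroutines are scheduled at once, but only MAX_CONCURRENT of them are inside some_remote_call at any moment; the rest wait at the semaphore.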
The simple answer is:
Invoking a coroutine function (async def) does NOT run it. It just returns a coroutine object, like a generator function returns a generator object.
await retrieves values from coroutines, i.e. it "calls" the coroutine.
ensure_future/create_task schedule the coroutine to run on the event loop on the next iteration (although without waiting for it to finish, like a daemon thread).
Some code examples
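The first point, that calling a coroutine function only creates a coroutine object, can be checked with a small sketch. It assumes Python 3.7+ (for asyncio.run); greet is a hypothetical coroutine function.

```python
import asyncio
import inspect

ran = []  # records whether the coroutine body has executed


async def greet():
    ran.append("ran")
    return "hello"


coro = greet()                       # creates a coroutine object only
is_coro = inspect.iscoroutine(coro)
ran_before_await = list(ran)         # still empty: the body has not run


async def main():
    return await coro                # awaiting is what runs the body


value = asyncio.run(main())
print(is_coro, ran_before_await, value, ran)
```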
Let's first clear up some terms: a coroutine function is what you define with async def, and calling it returns a coroutine object. See the comments below.
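Case 1, await on a coroutine
We create two coroutines, await one, and use create_task to run the other one. In the result, the output of the awaited coroutine appears first and the output of the task created with create_task appears second.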
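A minimal sketch of this case, assuming Python 3.7+ (for asyncio.run and get_running_loop). The coroutine function p is hypothetical; it records its argument in a list so the execution order is visible.

```python
import asyncio

order = []  # records the order in which the coroutine bodies run


async def p(word):
    """Hypothetical coroutine function; just records and prints its argument."""
    order.append(word)
    print(word)


async def main():
    loop = asyncio.get_running_loop()
    coro = p("await")                            # coroutine object, not running
    task2 = loop.create_task(p("create_task"))   # scheduled for the next iteration
    await coro     # runs the coroutine right here, inside main's current step
    await task2    # main suspends, so the loop gets to run task2


asyncio.run(main())
print(order)  # ['await', 'create_task']
```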
Explain:
task1 was executed directly, and task2 was executed in the following iteration.
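Case 2, yielding control to event loop
If we replace the main function so that it sleeps before awaiting the coroutine, we can see a different result: the output of the task created with create_task now appears before the output of the awaited coroutine.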
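A sketch of the replaced main function, under the same assumptions as before (Python 3.7+, a hypothetical coroutine function p that records its argument). It uses asyncio.sleep(0) rather than a one-second sleep to keep the run short; either one hands control back to the loop.

```python
import asyncio

order = []  # records the order in which the coroutine bodies run


async def p(word):
    """Hypothetical coroutine function; just records and prints its argument."""
    order.append(word)
    print(word)


async def main():
    loop = asyncio.get_running_loop()
    coro = p("await")                            # created, but not started
    task2 = loop.create_task(p("create_task"))   # scheduled for the next iteration
    await asyncio.sleep(0)   # yield to the loop, which now runs task2
    await coro               # only now does the first coroutine run
    await task2              # already finished; returns immediately


asyncio.run(main())
print(order)  # ['create_task', 'await']
```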
Explain:
When calling asyncio.sleep(1), control was yielded back to the event loop; the loop checks for tasks to run, and then runs the task created by create_task.
Note that we first invoke the coroutine function but do not await it, so we have only created a coroutine object and not started it running. Then we call the coroutine function again and wrap it in a create_task call; create_task actually schedules the coroutine to run on the next iteration. So, in the result, "create task" is executed before "await".
Actually, the point here is to give control back to the loop; you could use asyncio.sleep(0) to see the same result.
Under the hood
loop.create_task actually calls asyncio.tasks.Task(), which will call loop.call_soon. And loop.call_soon will put the task in loop._ready. During each iteration of the loop, it checks for every callback in loop._ready and runs it.
asyncio.wait, asyncio.ensure_future and asyncio.gather actually call loop.create_task directly or indirectly.
Also note in the docs: