Why does python asyncio loop.call_soon overwrite d

Posted 2019-08-27 04:23

Question:

I created a hard-to-track-down bug in our code, but I do not understand why it occurs. The problem appears when the same async function is scheduled multiple times with call_soon. It does not happen with synchronous functions.

Here is a running example of the issue:

import asyncio
import sys

class TestObj(object):

    def __init__(self):

        self.test_data = {'a': 1, 'b': 2, 'c': 3}
        self.loop = asyncio.get_event_loop()
        self.loop.call_later(1, lambda: asyncio.ensure_future(self.calling_func()))
        self.loop.call_later(2, self.calling_func_sync)
        self.loop.call_later(4, sys.exit)
        self.loop.run_forever()

    async def do_something(self, k, v):
        print("Values", k, v)

    async def calling_func(self):
        for k, v in self.test_data.items():
            print("Sending", k, v)
            self.loop.call_soon(lambda: asyncio.ensure_future(self.do_something(k, v)))

    def do_something_sync(self, k, v):
        print("Values_sync", k, v)

    def calling_func_sync(self):
        for k, v in self.test_data.items():
            print("Sending_sync", k, v)
            self.loop.call_soon(self.do_something_sync, k, v)


if __name__ == "__main__":
    a = TestObj()

The output is:

Sending a 1
Sending b 2
Sending c 3
Values c 3
Values c 3
Values c 3
Sending_sync a 1
Sending_sync b 2
Sending_sync c 3
Values_sync a 1
Values_sync b 2
Values_sync c 3

Why is this happening? Only the asynchronous function is being stomped on. I would have expected every call to call_soon to push a new pointer onto the stack, but it seems there is a pointer to self.do_something that is getting overwritten.

Answer 1:

This has nothing to do with the async code; the cause is the lambda you create in your loop. When you write lambda: asyncio.ensure_future(self.do_something(k, v)), you create a closure that reads the variables k and v from the enclosing namespace (and self too, but that's not a problem). When the lambda is called, it uses whatever values those names are bound to in the outer scope at the time of the call, not the values they had when the lambda was defined. Since k and v are rebound on each iteration of the loop, and the callbacks only run after the loop has finished, every lambda sees the same values (the last ones).

A common way to avoid this issue is to make the current values of the variables default values for arguments to the lambda function:

self.loop.call_soon(lambda k=k, v=v: asyncio.ensure_future(self.do_something(k, v)))
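As a rough, self-contained sketch of that fix outside the original class: the names results and main are hypothetical, do_something is a plain function rather than a method, and asyncio.run (Python 3.7+) with a short sleep stands in for the question's run_forever setup.

```python
import asyncio

# Hypothetical list, used only to record what each call actually received.
results = []

async def do_something(k, v):
    results.append((k, v))

async def main():
    loop = asyncio.get_running_loop()
    test_data = {'a': 1, 'b': 2, 'c': 3}
    for k, v in test_data.items():
        # The defaults k=k, v=v are evaluated here, at definition time,
        # so each lambda keeps its own pair instead of reading the loop
        # variables later.
        loop.call_soon(lambda k=k, v=v: asyncio.ensure_future(do_something(k, v)))
    # Yield to the loop long enough for the callbacks and their tasks to run.
    await asyncio.sleep(0.1)

asyncio.run(main())
print(results)
```

With the defaults in place, each scheduled call should record its own key and value rather than three copies of the last pair.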


Answer 2:

Your problem actually has nothing to do with asyncio. The k and v in lambda: asyncio.ensure_future(self.do_something(k, v)) still refer to the variables in your outer scope. Their values change by the time you call your function:

i = 1
f = lambda: print(i)

f()  # 1
i = 2
f()  # 2

A common solution is to define your function and (ab)use default arguments to create a variable local to your function that holds the value of i at the time the function was created, not called:

i = 1
f = lambda i=i: print(i)

f()  # 1
i = 2
f()  # 1

You can use f = lambda x=i: print(x) if the naming confuses you.
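The same effect inside a loop, which is closer to the situation in the question, might look like the sketch below. The names build_callbacks, late and bound are made up for illustration.

```python
def build_callbacks():
    late, bound = [], []
    for i in range(3):
        # Looks up i when called, so it sees whatever i is by then.
        late.append(lambda: i)
        # Captures the current value of i as a default argument.
        bound.append(lambda i=i: i)
    return late, bound

late, bound = build_callbacks()
late_values = [f() for f in late]
bound_values = [f() for f in bound]
print(late_values)   # [2, 2, 2]
print(bound_values)  # [0, 1, 2]
```

All of the closures in late share the single variable i, which is why they agree on its final value once the loop has ended.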



Answer 3:

In addition to the correct explanations by others concerning the error in the lambda, also note that you don't even need the lambda. Since do_something is a coroutine function, calling it only creates a coroutine object and does not execute any of its code; once that object is wrapped in a task, it starts running on the next iteration of the event loop, so you automatically get the effect of call_soon. (This is analogous to how calling a generator function doesn't start executing it until you start exhausting the returned iterator.)

In other words, you can replace

self.loop.call_soon(lambda: asyncio.ensure_future(self.do_something(k, v)))

with the simpler

self.loop.create_task(self.do_something(k, v))

create_task is preferable to ensure_future when you are dealing with a coroutine.
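A minimal sketch of the create_task variant, assuming Python 3.7+ and using module-level functions instead of the original class (received and started_before_yield are hypothetical names added only to observe the ordering):

```python
import asyncio

# Hypothetical list, used only to record what each call actually received.
received = []

async def do_something(k, v):
    received.append((k, v))

async def calling_func():
    loop = asyncio.get_running_loop()
    test_data = {'a': 1, 'b': 2, 'c': 3}
    # The arguments are evaluated at call time, so each coroutine object
    # already holds its own k and v; no lambda is involved.
    tasks = [loop.create_task(do_something(k, v)) for k, v in test_data.items()]
    # Nothing has run yet: tasks only start once this coroutine yields.
    started_before_yield = list(received)
    await asyncio.gather(*tasks)
    return started_before_yield

started_before_yield = asyncio.run(calling_func())
print(started_before_yield)  # []
print(received)
```

Keeping the returned tasks and awaiting them, as gather does here, also gives the caller a place to see any exception raised inside do_something.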