Update: Edited title to focus on the main problem. See my answer for full update.
In the following code, a() and b() are identical. Each counts from 0 to 9 concurrently while releasing and reacquiring a lock every 2 counts.
```python
import asyncio

lock = asyncio.Lock()

def a():
    yield from lock.acquire()
    for i in range(10):
        print('a: ' + str(i))
        if i % 2 == 0:
            lock.release()
            yield from lock.acquire()
    lock.release()

def b():
    yield from lock.acquire()
    for i in range(10):
        print('b: ' + str(i))
        if i % 2 == 0:
            lock.release()
            yield from lock.acquire()
    lock.release()

asyncio.get_event_loop().run_until_complete(asyncio.gather(a(), b()))
print('done')
```
I expected interleaved output, but instead I get:
```
b: 0
b: 1
b: 2
b: 3
b: 4
b: 5
b: 6
b: 7
b: 8
b: 9
a: 0
a: 1
a: 2
a: 3
a: 4
a: 5
a: 6
a: 7
a: 8
a: 9
done
```
It seems that the second yield does not actually yield but instead reacquires the lock immediately and continues. That seems like a bug to me. Am I right, or is there another explanation?
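One way to confirm this (a sketch in terms of the current coroutine API) is to drive the acquire() coroutine by hand: on an uncontended lock it finishes on its very first step instead of suspending.

```python
import asyncio

results = {}

async def main():
    lock = asyncio.Lock()
    coro = lock.acquire()
    try:
        coro.send(None)  # step the coroutine by hand, outside the loop's control
        results['suspended'] = True
    except StopIteration as exc:
        # acquire() finished on its first step: the fast path never yields
        results['suspended'] = False
        results['returned'] = exc.value
    results['locked'] = lock.locked()

asyncio.run(main())
print(results)  # shows the lock was taken without a single suspension
```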
The following code, modified with an extra initial "no-op" yield, works as expected. That makes me believe that the lock is indeed fair and probably correct.
```python
import asyncio

lock = asyncio.Lock()

def a():
    yield from lock.acquire()
    yield from asyncio.sleep(0)
    for i in range(10):
        print('a: ' + str(i))
        if i % 2 == 0:
            lock.release()
            yield from lock.acquire()
    lock.release()

def b():
    yield from lock.acquire()
    yield from asyncio.sleep(0)
    for i in range(10):
        print('b: ' + str(i))
        if i % 2 == 0:
            lock.release()
            yield from lock.acquire()
    lock.release()

asyncio.get_event_loop().run_until_complete(asyncio.gather(a(), b()))
print('done')
```
Output:
```
a: 0
b: 0
a: 1
a: 2
b: 1
b: 2
a: 3
a: 4
b: 3
b: 4
a: 5
a: 6
b: 5
b: 6
a: 7
a: 8
b: 7
b: 8
a: 9
b: 9
done
```
Note that I am doing the no-op yield just once at the beginning, not every 2 counts. Yet doing so causes the interleaving every 2 counts that I expected from the first piece of code. There must be some optimization (a bug, in my opinion) in the scheduler that doesn't really yield when acquiring a lock that nobody else is waiting for. How else to explain the first output?
Update: The following is obsolete in light of my comment (link) on the GitHub issue. The comment observes that you can use Lock.locked() to predict whether Lock.acquire() will yield or not. It also observes that numerous other coroutines do not yield in the fast case, so it is a lost cause to even consider fixing all of them. Finally, it relates this to how a different issue was fixed and suggests that it could have been fixed better: that issue requested an asyncio.nop() method that would merely yield to the scheduler and do nothing else. Instead of adding that method, they decided to overload asyncio.sleep(0), "de-optimizing" it (in the context of this discussion of lock.acquire()) so that it yields to the scheduler when the argument is 0.
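Building on that Lock.locked() observation, here is a sketch of a conditional workaround: take the explicit sleep(0) trip through the scheduler only when acquire() would complete without suspending. (fair_acquire and worker are names made up for this illustration.)

```python
import asyncio

order = []

async def fair_acquire(lock):
    # acquire() returns immediately when the lock is free, so force an
    # explicit trip through the scheduler in exactly that case.
    if not lock.locked():
        await asyncio.sleep(0)
    await lock.acquire()

async def worker(lock, tag):
    for i in range(3):
        await fair_acquire(lock)
        order.append((tag, i))
        lock.release()

async def main():
    lock = asyncio.Lock()
    await asyncio.gather(worker(lock, 'a'), worker(lock, 'b'))

asyncio.run(main())
print(order)  # the two workers now alternate instead of one running to completion
```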
Original answer below, superseded by the paragraph above:
The root cause is that the implementation of asyncio.Lock tries to be too smart in its first three lines and does not give control back to the scheduler if there are no waiters:

```python
if not self._locked and all(w.cancelled() for w in self._waiters):
    self._locked = True
    return True
```

However, as my first example shows, this prevents other coroutines from even becoming waiters: they never get an opportunity to run to the point where they try to acquire the lock.
The inefficient workaround is to always yield from asyncio.sleep(0) immediately before acquiring a lock. This is inefficient because, in the common case, there will be other waiters and acquiring the lock will yield control back to the scheduler anyway. Therefore, in most cases you will be yielding control back to the scheduler twice, which sucks.
Also note that the documentation for Lock says, ambiguously: "This method blocks until the lock is unlocked, then sets it to locked and returns True." It certainly gives the impression that the method will yield control to the scheduler before acquiring the lock.
In my opinion, the right thing to do is for the lock implementation to always yield and not be too smart. Alternatively, the lock implementation should have a method that tells you whether acquiring it will yield, so that your code can yield manually when the acquisition won't. Another alternative is to have the acquire() call return a value that tells you whether it actually yielded. This is less preferable but still better than the status quo.
One might think a better workaround would be to manually yield at release() time. However, if you look at a tight loop that releases and reacquires around pieces of work, it amounts to the same thing: in the common case it will still yield twice, once at release time and again at acquire time, adding the same inefficiency.
It is unclear what you are trying to achieve, but it seems like Lock is not the tool you need. To interleave Python code, you can do something as simple as:
```python
def foo(tag, n):
    for i in range(n):
        print("inside", tag, i)
        yield (tag, i)

print('start')
for x in zip(foo('A', 10), foo('B', 10)):
    print(x)
print('done')
```
No need for asyncio or threading. Anyway, asyncio without IO does not make a lot of sense. threading.Lock is used to synchronize critical parts of a program which otherwise run in independent threads. asyncio.Lock would allow other coroutines to continue with IO while one coroutine waits:
```python
import asyncio
import random

lock = asyncio.Lock()

async def foo(tag):
    print(tag, "Start")
    for i in range(10):
        print(tag, '>', i)
        await asyncio.sleep(random.uniform(0.1, 1))
        print(tag, '<', i)

    async with lock:
        # only one coroutine can execute the critical section at once.
        # other coroutines can still use IO.
        print(tag, "CRITICAL START")
        await asyncio.sleep(1)
        print(tag, "STILL IN CRITICAL")
        await asyncio.sleep(1)
        print(tag, "CRITICAL END")

    for i in range(10, 20):
        print(tag, '>', i)
        await asyncio.sleep(random.uniform(0.1, 1))
        print(tag, '<', i)

    print(tag, "Done")

print('start')
loop = asyncio.get_event_loop()
tasks = asyncio.gather(foo('A'), foo('B'), foo('C'))
loop.run_until_complete(tasks)
loop.close()
print('done')
```
Keep in mind that the keyword yield does not always obey the English meaning of yield :-). You can see that it makes more sense for async with lock to acquire the lock immediately, without waiting for other coroutines to do more work: the first coroutine to reach the critical section should start running it. (That is, adding an await asyncio.sleep(0) before async with lock: just does not make any sense.)
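On that last point, a small sketch (worker is a name made up for this illustration) showing that under contention the lock is served in arrival order, at least in CPython's implementation: the first coroutine to reach the critical section runs it first, and release() hands the lock to the longest-waiting coroutine.

```python
import asyncio

entered = []

async def worker(lock, tag):
    async with lock:               # later arrivals queue up as waiters
        entered.append(tag)
        await asyncio.sleep(0.01)  # hold the lock across a real await
    # leaving the block releases the lock and wakes the oldest waiter

async def main():
    lock = asyncio.Lock()
    # tasks are created, and therefore first scheduled, in a, b, c order
    await asyncio.gather(*(worker(lock, t) for t in 'abc'))

asyncio.run(main())
print(entered)  # the critical section is entered in arrival order
```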