I have an asynchronous application that is serving requests via aiohttp and doing other async tasks (interacting with a database, handling messages, making requests itself as an HTTP client). I'd like to monitor how busy the event loop is, perhaps how much time it's spending executing code versus waiting for a select
to complete.
Is there a way to measure this with standard library event loops or other third-party loops (uvloop)?
Specifically, I'd like an ongoing percentage measure of saturation, not just a binary "is it busy" that this question seems to address.
Digging source code shows following:
- Event loop is basically executing
_run_once
in while True
loop
_run_once
does all stuff including waiting for a select
to complete
timeout
to wait for select
isn't stored anywhere outside _run_once
Nothing stops us to reimplement _run_once
to time we need.
Instead of coping full _run_once
implementation, we can assume time right before select
is when _run_once
started (because above select
nothing time consuming happens) and time right after select
is when _process_events
started.
From words to action:
import asyncio
class MeasuredEventLoop(asyncio.SelectorEventLoop):
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self._total_time = 0
self._select_time = 0
self._before_select = None
# TOTAL TIME:
def run_forever(self):
started = self.time()
try:
super().run_forever()
finally:
finished = self.time()
self._total_time = finished - started
# SELECT TIME:
def _run_once(self):
self._before_select = self.time()
super()._run_once()
def _process_events(self, *args, **kwargs):
after_select = self.time()
self._select_time += after_select - self._before_select
super()._process_events(*args, **kwargs)
# REPORT:
def close(self, *args, **kwargs):
super().close(*args, **kwargs)
select = self._select_time
cpu = self._total_time - self._select_time
total = self._total_time
print(f'Waited for select: {select:.{3}f}')
print(f'Did other stuff: {cpu:.{3}f}')
print(f'Total time: {total:.{3}f}')
Let's test it:
import time
async def main():
await asyncio.sleep(1) # simulate I/O, will be handled by selectors
time.sleep(0.01) # CPU job, executed here, outside event loop
await asyncio.sleep(1)
time.sleep(0.01)
loop = MeasuredEventLoop()
asyncio.set_event_loop(loop)
try:
loop.run_until_complete(main())
finally:
loop.close()
Result:
Waited for select: 2.000
Did other stuff: 0.032
Total time: 2.032
Let's test it against code with real I/O:
import aiohttp
async def io_operation(delay):
async with aiohttp.ClientSession() as session:
async with session.get(f'http://httpbin.org/delay/{delay}') as resp:
await resp.text()
async def main():
await asyncio.gather(*[
io_operation(delay=1),
io_operation(delay=2),
io_operation(delay=3),
])
Result:
Waited for select: 3.250
Did other stuff: 0.016
Total time: 3.266
One way of measuring how much time the CPU was busy (applicable for all sorts of IO-bound situations, not just async) is to periodically check time.process_time()
.
For example, here's how I would implement something which occasionally prints the percentage of time that the loop was "busy" because the process was running code on the CPU:
import asyncio
import time
async def monitor():
while True:
before = time.process_time()
await asyncio.sleep(10)
after = time.process_time()
print('{:.2%}'.format((after - before) / 10))
async def busy_wait():
while True:
await asyncio.sleep(.001)
loop = asyncio.get_event_loop()
loop.create_task(monitor())
loop.create_task(busy_wait())
loop.run_forever()
The monitor
coroutine measures the amount of process time elapsed every 10 seconds, and prints it as a percentage of the full 10 seconds.
The busy_wait
coroutine generates artificial load on the CPU by repeatedly sleeping for short periods. By increasing the length of the sleep you can make the process time ratio decrease arbitrarily, and by decreasing the length of the sleep to 0 you can make the process time approach 100% of the wall-clock time.
One caveat: this will only tell you how busy the python event loop is in terms of time spent "running python code" (in the CPU-bound sense). If someone is blocking the event loop with calls to time.sleep()
(rather than asyncio.sleep()
), my method will show the loop as being free but it is really "busy" in the sense that it is blocked by a system-level sleep. Most correctly written async code shouldn't call time.sleep()
or do blocking IO, but if it does, it will not be reflected correctly by this type of monitoring.