Python asyncio subprocess write stdin and read std

Posted 2020-07-13 09:20

Question:

I'm currently working on a task that uses subprocesses with asyncio in Python 3. My code simply writes to stdin and reads stdout/stderr simultaneously:

import asyncio


async def read_stdout(stdout):
    print('read_stdout')
    while True:
        buf = await stdout.read(10)
        if not buf:
            break

        print(f'stdout: { buf }')


async def read_stderr(stderr):
    print('read_stderr')
    while True:
        buf = await stderr.read()
        if not buf:
            break

        print(f'stderr: { buf }')


async def write_stdin(stdin):
    print('write_stdin')
    for i in range(100):
        buf = f'line: { i }\n'.encode()
        print(f'stdin: { buf }')

        stdin.write(buf)
        await stdin.drain()
        await asyncio.sleep(0.5)


async def run():
    proc = await asyncio.create_subprocess_exec(
        '/usr/bin/tee',
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)

    await asyncio.gather(
        read_stderr(proc.stderr),
        read_stdout(proc.stdout),
        write_stdin(proc.stdin))


asyncio.run(run())

It works pretty well, but I see a warning in the Python 3 documentation:

Warning: Use the communicate() method rather than process.stdin.write(), await process.stdout.read() or await process.stderr.read(). This avoids deadlocks due to streams pausing reading or writing and blocking the child process.

Does that mean the above code will deadlock in some scenarios? If so, how can I write to stdin and read stdout/stderr continuously in Python 3 asyncio without deadlocking?

Thank you very much.

Answer 1:

The warning was carried over from the regular subprocess module, and warns against naive code that tries to implement simple communication that appears perfectly correct, such as:

# write the request to the subprocess
# (write() is not a coroutine; drain() is what waits for the buffer to flush)
proc.stdin.write(request)
await proc.stdin.drain()
# read the response
response = await proc.stdout.readline()

This can cause a deadlock if the subprocess starts writing the response before it has read the whole request. If the response is large enough, the subprocess will block, waiting for the parent to read some of it and make room in the pipe buffer. However, the parent cannot do so because it is still writing the request and waiting for that write to complete before it starts reading. So the child waits for the parent to read (some of) its response, and the parent waits for the child to finish accepting the request. As each is waiting for the other's current operation to complete, it's a deadlock.
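For a one-shot request/response exchange, communicate() sidesteps the problem. The following is a minimal sketch, not a definitive implementation: the child is a hypothetical stand-in (a Python one-liner that upper-cases each line as it arrives), and the name `ask` is made up for illustration. The request is deliberately much larger than a typical pipe buffer, which is the situation where the write-then-read sequence above could stall.

```python
import asyncio
import sys

# Hypothetical child: upper-cases each line as soon as it arrives, so it
# starts producing output long before it has consumed the whole request.
CHILD = (
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line.upper())\n"
)


async def ask(request: bytes) -> bytes:
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    # communicate() feeds stdin, closes it, and reads stdout until EOF,
    # all concurrently, so neither side waits on a full pipe buffer.
    stdout, _ = await proc.communicate(request)
    return stdout


response = asyncio.run(ask(b"line\n" * 100_000))
print(len(response))
```

The trade-off is that communicate() buffers the whole output in memory and only returns once the child exits, so it does not fit a long-running, interactive exchange.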

Your code doesn't have that issue simply because your reads and writes run concurrently. Since the reader never waits for the writer and vice versa, there is no opportunity for (that kind of) deadlock. If you take a look at how communicate() is implemented, you will find that, barring some debug logging, it works pretty much like your code.
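Here is a condensed sketch of that concurrent pattern, under a few assumptions: a Python one-liner stands in for /usr/bin/tee so the example does not depend on the platform, and the helper names `read_all` and `write_lines` are made up for illustration. One detail differs from the code in the question: stdin is closed once the writer is done. Without that the child never sees EOF, so the readers would keep waiting after the last line.

```python
import asyncio
import sys

# Stand-in for /usr/bin/tee (an assumption): copies stdin to stdout.
CHILD = (
    "import sys\n"
    "for line in sys.stdin.buffer:\n"
    "    sys.stdout.buffer.write(line)\n"
    "    sys.stdout.buffer.flush()\n"
)


async def read_all(stream, chunks):
    # Keep reading until EOF so the child never blocks on a full pipe.
    while True:
        buf = await stream.read(4096)
        if not buf:
            break
        chunks.append(buf)


async def write_lines(stdin, count):
    for i in range(count):
        stdin.write(f"line: {i}\n".encode())
        # drain() pauses this task only; the reader tasks keep running.
        await stdin.drain()
    # Closing stdin delivers EOF to the child so it can exit.
    stdin.close()


async def run(count=1000):
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)
    out, err = [], []
    # Reads and writes are separate tasks, so neither waits on the other.
    await asyncio.gather(
        read_all(proc.stdout, out),
        read_all(proc.stderr, err),
        write_lines(proc.stdin, count))
    returncode = await proc.wait()
    return returncode, b"".join(out), b"".join(err)


returncode, out, err = asyncio.run(run())
print(returncode, len(out.splitlines()), err)
```

Collecting the chunks in lists is only for the sake of the example; in a streaming setting each chunk would be processed as it arrives, as the print calls in the question do.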