Why is piping output of subprocess so unreliable w

Posted 2019-07-07 06:34

Question:

(Windows)

I wrote some Python code that calls the program SoX via the subprocess module. SoX can print its progress on STDERR if you tell it to, and I want to read the percentage status from that output. If I run SoX directly rather than from the Python script, it starts immediately and the progress advances smoothly to 100%.

If I call it from the Python script, there is a delay of a few seconds before anything appears, and then the output alternates between slow and fast. Even though I read character by character, sometimes a large block rushes out at once, while at other times I can watch the characters trickle in one by one, and I don't understand why. (It generates about 15 KiB of data in my test, by the way.)

I have tested the same approach with mkvmerge and mkvextract, which also print percentages; reading their STDOUT is smooth.

This is so unreliable! How can I make the reading of sox's stderr stream smoother, and perhaps prevent the delay at the beginning?


How I call and read:

import subprocess, sys

process = subprocess.Popen('sox_call_dummy.bat', stderr=subprocess.PIPE, stdout=subprocess.PIPE)
while True:
    char = process.stderr.read(1).encode('string-escape')
    if not char:  # stop at end of stream
        break
    sys.stdout.write(char)

Answer 1:

As per this closely related thread: Unbuffered read from process using subprocess in Python

import subprocess

process = subprocess.Popen('sox_call_dummy.bat',
                stderr=subprocess.PIPE, bufsize=0)  # bufsize=0: unbuffered pipe
while True:
    line = process.stderr.readline()
    if not line:  # empty string means the process closed its stderr
        break
    print line

Since you aren't reading stdout, I don't think you need a pipe for it.
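
If the end goal is the percentage itself, here is a minimal sketch that pulls it out of each line, building on the loop above. The exact format of sox's progress output is an assumption here, so adjust the pattern to what you actually see:

import re
import subprocess

process = subprocess.Popen('sox_call_dummy.bat',
                stderr=subprocess.PIPE, bufsize=0)
while True:
    line = process.stderr.readline()
    if not line:
        break
    # look for a token like "12.3%" anywhere in the line
    match = re.search(r'(\d+(?:\.\d+)?)%', line)
    if match:
        print 'progress: %s%%' % match.group(1)

If sox rewrites a single status line using carriage returns instead of printing newline-terminated lines, readline() will not split the output as expected, and you may need to read raw chunks and split on '\r' yourself.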

If you want to read char by char as in your original example, add a flush after each write:

sys.stdout.write(char)
sys.stdout.flush()

Flushing stdout after every write is the manual equivalent of disabling output buffering for the Python process, which you can also do by running python.exe -u <script> or by setting the environment variable PYTHONUNBUFFERED=1.
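
For completeness, here is the original character-by-character loop with the flush added and a termination check, as a self-contained sketch (still using the sox_call_dummy.bat wrapper from the question):

import subprocess
import sys

process = subprocess.Popen('sox_call_dummy.bat',
                stderr=subprocess.PIPE, bufsize=0)
while True:
    char = process.stderr.read(1)
    if not char:  # empty string means sox closed its stderr
        break
    sys.stdout.write(char)
    sys.stdout.flush()  # show every character as soon as it arrives
process.wait()

Note that bufsize=0 only affects the pipe objects on the Python side; if sox itself buffers its stderr when it is not attached to a console, the output can still arrive in bursts.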