Python read() from stdout much slower than reading line by line

Posted 2020-08-10 08:14

Question:

I have a Python subprocess call that runs an executable and pipes its output to the subprocess's stdout.

In cases where the stdout data is relatively small (~2k lines), the performance of reading line by line and reading as a single chunk (stdout.read()) is comparable, with stdout.read() being slightly faster.

Once the data gets larger (say 30k+ lines), reading line by line performs significantly better.

This is my comparison script:

import subprocess
import time

# Read the child's stdout one line at a time.
proc = subprocess.Popen(executable, stdout=subprocess.PIPE)
tic = time.clock()
tmp = []
for line in iter(proc.stdout.readline, b''):
    tmp.append(line)
print("line by line = %.2f" % (time.clock() - tic))

# Read the child's stdout in a single call.
proc = subprocess.Popen(executable, stdout=subprocess.PIPE)
tic = time.clock()
fullFile = proc.stdout.read()
print("slurped = %.2f" % (time.clock() - tic))

And these are the results for a read of ~96k lines (about 50 MB on disk):

line by line = 5.48
slurped = 153.03

I am unclear why the performance difference is so extreme. My expectation was that the read() version should be faster than storing the results line by line. Of course, I did expect line-by-line reading to win in the practical case, where there is significant per-line processing that can be done during the read.

Can anyone give me insight into the read() performance cost?

Answer 1:

This is not just Python: reading character by character is always slower than reading in lines or big chunks.

Consider these two simple C programs:

[readchars.c]

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main(void) {
        FILE* fh = fopen("largefile.txt", "r");
        if (fh == NULL) {
                perror("Failed to open file largefile.txt");
                exit(1);
        }

        int c;
        c = fgetc(fh);
        while (c != EOF) {
                c = fgetc(fh);
        }

        return 0;
}

[readlines.c]

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

int main(void) {
        FILE* fh = fopen("largefile.txt", "r");
        if (fh == NULL) {
                perror("Failed to open file largefile.txt");
                exit(1);
        }

        /* Read up to 120 bytes per line; lines are discarded, we only time the reads. */
        char* s = malloc(120);
        while (fgets(s, 120, fh) != NULL) {
                /* keep reading until EOF or error */
        }

        free(s);

        return 0;
}

And their results (YMMV; largefile.txt was a ~200 MB text file):

$ gcc readchars.c -o readchars
$ time ./readchars            
./readchars  1.32s user 0.03s system 99% cpu 1.350 total
$ gcc readlines.c -o readlines
$ time ./readlines            
./readlines  0.27s user 0.03s system 99% cpu 0.300 total
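
The same pattern can be reproduced in Python with ordinary file reads. The sketch below is illustrative (it is not part of the original answer) and assumes the same largefile.txt; it times an unbuffered byte-by-byte loop against a buffered line-by-line loop:

import time

# Byte by byte with buffering disabled, roughly analogous to readchars.c.
tic = time.time()
with open("largefile.txt", "rb", buffering=0) as fh:
    while fh.read(1):
        pass
print("byte by byte = %.2f" % (time.time() - tic))

# Line by line with the default buffering, roughly analogous to readlines.c.
tic = time.time()
with open("largefile.txt", "rb") as fh:
    for line in fh:
        pass
print("line by line = %.2f" % (time.time() - tic))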


Answer 2:

Try adding a bufsize option to your Popen call and see if it makes a difference:

proc=subprocess.Popen(executable, bufsize=-1, stdout=subprocess.PIPE)

Popen includes an option to set the buffer size used when reading from the child process. On Python 2, bufsize defaults to 0, which means unbuffered I/O (since Python 3.3.1 the default is -1). A positive value means a buffer of approximately that size (1 means line-buffered). A negative value means to use the system default, which means fully buffered I/O.

The Python docs include this note:

Note: if you experience performance issues, it is recommended that you try to enable buffering by setting bufsize to either -1 or a large enough positive value (such as 4096).
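
For example, re-running the slurp test from the question with buffering enabled might look like this (a minimal sketch; cat largefile.txt is only a stand-in for the real executable):

import subprocess
import time

# "cat largefile.txt" stands in here for whatever executable the question runs.
executable = ["cat", "largefile.txt"]

# Same slurp test as in the question, but with buffering enabled on the pipe.
proc = subprocess.Popen(executable, bufsize=-1, stdout=subprocess.PIPE)
tic = time.time()
fullFile = proc.stdout.read()
print("slurped (bufsize=-1) = %.2f" % (time.time() - tic))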



Answer 3:

I don't get that behaviour at all.

import subprocess
import time


executable = ["cat", "data"]

proc=subprocess.Popen(executable,stdout=subprocess.PIPE)
tic=time.clock()
tmp = []
for line in (iter(proc.stdout.readline,b'')):
    tmp.append(line)
print("line by line = %.2f"%(time.clock()-tic))

proc=subprocess.Popen(executable,stdout=subprocess.PIPE)
tic=time.clock()
fullFile=proc.stdout.read()
print("slurped = %.2f"%(time.clock()-tic))

Data is text.

pts/0$ ll data
-rw-r--r-- 1 javier users 18M feb 21 20:53 data

pts/0$ wc -l data
169866 data

Result:

pts/0$ python3 a.py 
line by line = 0.08
slurped = 0.01

Python 2 is much slower than Python 3!

pts/0$ python2 a.py 
line by line = 4.45
slurped = 0.01

Perhaps it depends on the sub-process?



Answer 4:

I've had spotty results with bufsize. I run a continuous ping script that logs replies and needs to run non-stop; it would hang every few days, so my solution was to write a separate script that watches the task list and kills any ping task that has been running for more than 10 seconds. See below:

import subprocess
import time

CREATE_NO_WINDOW = 0x08000000
previous_id = ''

while True:
    # List running tasks without opening a console window.
    command = subprocess.Popen(['tasklist'], stdout=subprocess.PIPE,
              shell=False, creationflags=CREATE_NO_WINDOW)
    # 'Ko' is the memory-unit suffix in this system's tasklist output,
    # so splitting on it yields one chunk per process.
    reply = str(command.communicate()[0]).split('Ko')
    for item in reply:
        if 'PING.EXE' in item:
            print(item.split(' ')[0][4:] + ' ' + item.split(' ')[22])
            if item.split(' ')[22] != previous_id:
                previous_id = item.split(' ')[22]
                print('New ping detected, system is healthy')
            else:
                # Same PID seen on two consecutive checks: the ping is stuck, kill it.
                print('Same ping active for 10 seconds, killing')
                command = subprocess.Popen(['taskkill', '/f', '/im', 'PING.EXE'],
                          stdout=subprocess.PIPE, shell=False,
                          creationflags=CREATE_NO_WINDOW)
                err_log = open('errors.txt', 'w')
    time.sleep(10)

This runs in parallel, and there is very little chance of both processes hanging at the same time. All you need to do is catch, in your main script, any errors resulting from the loss of the pipe.
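
For completeness, a minimal sketch of what catching those errors in the main ping script might look like (the ping command line and the per-reply handling are assumptions, not taken from the answer):

import subprocess

# Continuous ping (Windows-style '-t' flag, matching the answer's environment).
proc = subprocess.Popen(['ping', '-t', '8.8.8.8'], stdout=subprocess.PIPE)
try:
    for line in iter(proc.stdout.readline, b''):
        print(line)                  # placeholder for the real reply logging
except (OSError, ValueError):        # raised if the pipe is lost mid-read
    pass
finally:
    proc.stdout.close()
    proc.wait()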