Basically, I want to learn how to use the stdout of one subprocess (say proc1) as the stdin of two or more other subprocesses (say proc2 and proc3) in Python.
Hi,
I need to zcat a .gz file and use the output sent to subprocess.PIPE for both cksum (a Unix utility) and a line count.
I can do it in bash like this...
[hashroot@dev_server 12]$ zcat ABC_C_TPM_26122014.data.gz | tee >(wc -l) >(cksum)| tail -2
2020090579 112180
586
I want to do the same in Python.
As soon as I do this...
>>> import subprocess
>>> import os
>>> fl123 = 'ABC_C_TPM_26122014.data.gz'
>>> pqr123 = subprocess.Popen(['zcat', fl123], stdout=subprocess.PIPE)
>>> subprocess.check_output(['cksum'], stdin=pqr123.stdout)
b'4286000649 256100 \n'
Now the PIPE is empty, so how can I get the line count without running zcat again?
I could do it by running zcat twice in subprocess, redirecting the first zcat's output to wc -l and the second zcat's output to cksum. But zcat is disk-I/O bound and slow, so I want to avoid running it twice.
A simple way to implement the tee command in Python is to write to the subprocesses manually: read the input once and write each line to the stdin of every consumer process.
If the lines in the input can be large, you could read the input in chunks:
chunk = input_file.read(chunk_size)
instead of reading it line by line (splitting on b'\n').
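The manual tee described above could be sketched roughly as follows. This is a minimal sketch, not necessarily the code the answer originally showed: the helper name tee_to_commands, its parameters, and the use of temporary files to collect each consumer's output are assumptions made for illustration. It reads the input in chunks and writes every chunk to the stdin of each consumer process.

```python
import subprocess
import tempfile


def tee_to_commands(source, commands, chunk_size=1 << 16):
    """Copy everything read from `source` to the stdin of each command.

    `source` is any binary file-like object with a read() method
    (hypothetical helper, written for illustration only).
    Returns the captured stdout of each command, in the same order.
    """
    procs = []
    for cmd in commands:
        # Send each consumer's stdout to a temporary file, so a consumer
        # can never block on a full stdout pipe while we are still
        # writing to its stdin.
        out = tempfile.TemporaryFile()
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=out)
        procs.append((proc, out))

    try:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # empty bytes means end of input
                break
            for proc, _ in procs:
                proc.stdin.write(chunk)
    finally:
        # Closing stdin flushes buffered data and signals EOF to each consumer.
        for proc, _ in procs:
            proc.stdin.close()

    results = []
    for proc, out in procs:
        proc.wait()
        out.seek(0)
        results.append(out.read())
        out.close()
    return results


# Possible use with the file from the question (decompressing with the
# gzip module instead of zcat, which is a substitution, not the asker's setup):
#
# import gzip
# with gzip.open("ABC_C_TPM_26122014.data.gz", "rb") as input_file:
#     line_count, checksum = tee_to_commands(
#         input_file, [["wc", "-l"], ["cksum"]])
```

The stdout of a zcat process started with subprocess.Popen(..., stdout=subprocess.PIPE) should also work as source, since it is a binary file object too. Either way the compressed file is read and decompressed only once. The sketch does not handle a consumer that exits early; in that case the write would raise BrokenPipeError.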