Difference between communicate() and .stdin.write

Posted 2019-05-13 20:45

I want to create a pipe between 3 commands:

import subprocess

# shell=True is needed because each command is passed as a single string
cat = subprocess.Popen("cat /etc/passwd", shell=True, stdout=subprocess.PIPE)
grep = subprocess.Popen("grep '<username>'", shell=True, stdin=cat.stdout, stdout=subprocess.PIPE)
cut = subprocess.Popen("cut -f 3 -d ':'", shell=True, stdin=grep.stdout, stdout=subprocess.PIPE)
for line in cut.stdout:
    # process each line here
    print(line)

But the Python documentation says:

Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

Then how should I use cut.stdout? Can someone explain the documentation?

2 answers
Animai°情兽
Reply #2 · 2019-05-13 20:52

The external process you've spawned may block forever if you use process.stdin.write without accounting for pipe buffering. For example, if the process responds to your one-line input by writing a large amount of data (say, 10-100 MB) to its stdout, and you keep writing to its stdin without reading that data, then the process will block on its write to stdout (stdout is an unnamed pipe, and the OS maintains a fixed-size buffer for it). Once the process stops reading its stdin, your own writes block as well, and both sides wait on each other forever.
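To illustrate how communicate() sidesteps this, here is a minimal sketch. The child command (ECHO_CHILD) and the helper name round_trip are made up for the example: the child is just a Python one-liner that copies its stdin to its stdout in chunks, which is exactly the kind of process that would stall if you wrote a large payload with stdin.write before reading anything back.

```python
import subprocess
import sys

# Hypothetical child: copies stdin to stdout chunk by chunk.
ECHO_CHILD = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]

def round_trip(payload: bytes) -> bytes:
    """Send payload to the child and return everything it wrote back."""
    proc = subprocess.Popen(ECHO_CHILD, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() feeds stdin and drains stdout at the same time, then
    # waits for the child, so neither pipe buffer can fill up and stall it.
    out, _ = proc.communicate(input=payload)
    return out

if __name__ == "__main__":
    data = b"0123456789\n" * 200000  # far larger than a typical pipe buffer
    print(len(round_trip(data)) == len(data))
```

The trade-off is that communicate() holds the whole input and output in memory, so it suits bounded data rather than endless streams.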

You can try the iterpipes library, which deals with these issues by running input and output tasks as separate threads.
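The same idea can be sketched with only the standard library. This is not iterpipes' API, just an assumed illustration of the threading approach: a helper thread does the writing while the main thread reads, so a blocked write never prevents the output from being drained. ECHO_CHILD and stream_through are hypothetical names.

```python
import subprocess
import sys
import threading

# Hypothetical child: copies stdin to stdout chunk by chunk.
ECHO_CHILD = [
    sys.executable, "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]

def stream_through(chunks):
    """Feed chunks to the child from a helper thread while reading its output here."""
    proc = subprocess.Popen(ECHO_CHILD, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def feed():
        # Runs in its own thread, so a blocked write cannot stop the reader below.
        for chunk in chunks:
            proc.stdin.write(chunk)
        proc.stdin.close()  # signals end of input to the child

    writer = threading.Thread(target=feed)
    writer.start()

    received = []
    for line in proc.stdout:  # drain output as it arrives
        received.append(line)

    writer.join()
    proc.wait()
    return b"".join(received)
```

Unlike communicate(), this lets you process output line by line while input is still being sent.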

不美不萌又怎样
Reply #3 · 2019-05-13 21:12

communicate is designed to prevent a deadlock that wouldn't occur in your application anyway: it is there primarily for the situation where both stdin and stdout on a Popen object are pipes to the calling process, i.e.

subprocess.Popen(["sometool"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

In your case, you can safely read from cut.stdout. You may use communicate if you find it convenient, but you don't need to.
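A minimal sketch of that pipeline, assuming a POSIX system with cat, grep and cut on the PATH. The function name third_fields and its parameters are made up for the example, and it passes argument lists instead of using shell=True. Closing the parent's copies of the intermediate pipes follows the shell-pipeline pattern from the subprocess documentation.

```python
import subprocess

def third_fields(pattern, path="/etc/passwd"):
    """Return field 3 of each line in path that matches pattern (cat | grep | cut)."""
    cat = subprocess.Popen(["cat", path], stdout=subprocess.PIPE)
    grep = subprocess.Popen(["grep", pattern], stdin=cat.stdout, stdout=subprocess.PIPE)
    cat.stdout.close()   # drop the parent's copy so cat sees SIGPIPE if grep exits early
    cut = subprocess.Popen(["cut", "-f", "3", "-d", ":"], stdin=grep.stdout,
                           stdout=subprocess.PIPE, text=True)
    grep.stdout.close()  # same for grep

    # Only cut.stdout is a pipe to this process, and nothing is written to any
    # child's stdin from here, so reading it line by line cannot deadlock.
    fields = [line.rstrip("\n") for line in cut.stdout]

    cut.stdout.close()
    for proc in (cat, grep, cut):
        proc.wait()
    return fields
```

Calling third_fields("<username>") with a real user name in place of the placeholder would give the UID column for the matching lines.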

(Note that subprocess.Popen("/etc/passwd") doesn't make sense; you seem to have forgotten a cat. Also, don't forget shell=True when you pass the whole command as a single string.)
