How to interact with python's subprocess as a

Posted 2019-08-10 05:48

Question:

I need to implement a filter in Python that picks out specific output from a Linux command-line dictionary tool. I need to:

  1. Get a set of words from a file
  2. Look up each word: 1) if the dictionary does not contain the word, skip it; 2) otherwise, if it is a verb, save the definition.

To test the code, I wrote two Python files:

# name.py
import sys
while True:
    print 'name => Q: what is your name?'
    sys.stdout.flush()
    name = raw_input()
    if name == 'Exit':
        break
    print 'name => A: your name is ' + name
    sys.stdout.flush()

# test.py
import subprocess
child = subprocess.Popen(r'python name.py', 
            stdin = subprocess.PIPE, 
            stdout = subprocess.PIPE,
            stderr = subprocess.STDOUT, 
            shell = True)
commandlist = ['Luke\n', 'Mike\n', 'Jonathan\n', 'Exit\n']
for command in commandlist:
    child.stdin.write(command)
    child.stdin.flush()
    output = child.stdout.readline()
    print 'From PIPE: ' + output
while child.poll() is None:
    print 'NOT POLL: ' + child.stdout.readline()
child.wait()

The output is

From PIPE: name => Q: what is your name?

From PIPE: name => A: your name is Luke

From PIPE: name => Q: what is your name?

From PIPE: name => A: your name is Mike
# these lines need to start with "From PIPE" ... 
NOT POLL: name => Q: what is your name?

NOT POLL: name => A: your name is Jonathan

NOT POLL: name => Q: what is your name?

NOT POLL: 

The later output is read in the while loop rather than in the for loop in test.py. Why?

I need to get the whole output each time I send a new command, as in a dialog session. subprocess.communicate() is useless here because it always terminates the current subprocess. How can I implement this?

Answer 1:

The basic reason subprocess insists you use .communicate() is that deadlock is possible otherwise. Suppose you're writing to the process's stdin while the process is writing to its stdout. If the pipe buffers fill up, the writes will block until a read occurs. Then you're both waiting for each other and no progress can be made. There are several ways to deal with this:

  1. Use separate threads. Assign one to stdin and the other to stdout. That way, if one pipe blocks, you're still servicing the other.
  2. Use select to multiplex over the pipes. Only interact with pipes which are ready for you. You should also enable O_NONBLOCK on the pipes using fcntl, so you don't accidentally fill the buffers. Used correctly, this will prevent the pipes from ever blocking, so you can't deadlock. This doesn't work under Windows, because you can only do select on sockets there.
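
Option 1 might look roughly like the sketch below (Python 3, standard library only). It is an illustration under assumptions, not a definitive implementation: CHILD_SRC is an inlined stand-in for the question's name.py, and pump and run_dialog are hypothetical helper names. Only stdout gets its own thread here; the main thread does the writing, so a blocked write can never stop the reader from draining the pipe.

```python
import queue
import subprocess
import sys
import threading

# Assumption: an inlined stand-in for name.py so the sketch is self-contained.
CHILD_SRC = r'''
import sys
while True:
    print('name => Q: what is your name?')
    sys.stdout.flush()
    name = sys.stdin.readline().strip()
    if name == 'Exit' or not name:
        break
    print('name => A: your name is ' + name)
    sys.stdout.flush()
'''


def pump(stream, sink):
    """Copy each line of the child's stdout into a queue, then a None sentinel."""
    for line in stream:
        sink.put(line.rstrip('\n'))
    sink.put(None)  # signals EOF to the consumer


def run_dialog(names, timeout=5.0):
    """Send every name to the child and return all lines it printed."""
    child = subprocess.Popen(
        [sys.executable, '-u', '-c', CHILD_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        universal_newlines=True, bufsize=1)
    lines = queue.Queue()
    reader = threading.Thread(target=pump, args=(child.stdout, lines),
                              daemon=True)
    reader.start()

    # The reader thread keeps draining stdout while we write, so the child
    # can never stall on a full stdout pipe.
    for name in names:
        child.stdin.write(name + '\n')
        child.stdin.flush()
    child.stdin.close()  # no more input

    transcript = []
    while True:
        line = lines.get(timeout=timeout)  # raises queue.Empty if the child hangs
        if line is None:
            break
        transcript.append(line)

    reader.join()
    child.stdout.close()
    child.wait()
    return transcript
```

Calling run_dialog(['Luke', 'Mike', 'Exit']) should return the three questions and two answers in the order the child printed them.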

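The select-based option could be sketched as follows, again only as an illustration (Python 3.5+, POSIX only). It uses the standard selectors module, which wraps select/poll/epoll, and os.set_blocking as a shortcut for setting O_NONBLOCK instead of calling fcntl directly. CHILD_SRC is an inlined stand-in for name.py, and run_dialog_select is a hypothetical name.

```python
import os
import selectors
import subprocess
import sys

# Assumption: an inlined stand-in for name.py so the sketch is self-contained.
CHILD_SRC = r'''
import sys
while True:
    print('name => Q: what is your name?')
    sys.stdout.flush()
    name = sys.stdin.readline().strip()
    if name == 'Exit' or not name:
        break
    print('name => A: your name is ' + name)
    sys.stdout.flush()
'''


def run_dialog_select(names):
    """Feed names to the child and collect its output from a single thread."""
    child = subprocess.Popen(
        [sys.executable, '-u', '-c', CHILD_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0)
    # Non-blocking pipes: a write or read that cannot proceed returns
    # immediately instead of stalling the whole loop.
    os.set_blocking(child.stdin.fileno(), False)
    os.set_blocking(child.stdout.fileno(), False)

    pending = ''.join(name + '\n' for name in names).encode()
    received = b''
    sel = selectors.DefaultSelector()
    sel.register(child.stdout, selectors.EVENT_READ)
    sel.register(child.stdin, selectors.EVENT_WRITE)

    stdout_open = True
    while stdout_open:
        for key, _events in sel.select():
            if key.fileobj is child.stdout:
                chunk = os.read(child.stdout.fileno(), 4096)
                if chunk:
                    received += chunk
                else:  # EOF: the child closed its stdout
                    sel.unregister(child.stdout)
                    stdout_open = False
            else:
                try:
                    written = os.write(child.stdin.fileno(), pending)
                except BrokenPipeError:  # child exited before reading it all
                    written = len(pending)
                pending = pending[written:]
                if not pending:
                    sel.unregister(child.stdin)
                    child.stdin.close()  # no more input

    sel.close()
    if not child.stdin.closed:
        child.stdin.close()
    child.stdout.close()
    child.wait()
    return received.decode().splitlines()
```

The loop touches a pipe only when the selector reports it ready, which is the property that rules out the mutual-wait deadlock described above.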

Answer 2:

In your specific case, the issue is that for every two lines the child process prints, your parent process reads only one. If you pass more names, your processes will eventually deadlock once the OS pipe buffers fill up, as @Kevin explained.

To fix it, just add a second child.stdout.readline() call to read the question before writing the name to the child process.

For example, here's the parent.py script:

#!/usr/bin/env python
from __future__ import print_function
import sys
from subprocess import Popen, PIPE

child = Popen([sys.executable, '-u', 'child.py'],
              stdin=PIPE, stdout=PIPE,
              bufsize=1, universal_newlines=True)
commandlist = ['Luke', 'Mike', 'Jonathan', 'Exit']
for command in commandlist:
    print('From PIPE: Q:', child.stdout.readline().rstrip('\n'))
    print(command, file=child.stdin)
    #XXX you might need it to workaround bugs in `subprocess` on Python 3.3
    #### child.stdin.flush()
    if command != 'Exit':
        print('From PIPE: A:', child.stdout.readline().rstrip('\n'))
child.stdin.close() # no more input
assert not child.stdout.read() # should be empty
child.stdout.close()
child.wait()

Output

From PIPE: Q: name => Q: what is your name?
From PIPE: A: name => A: your name is Luke
From PIPE: Q: name => Q: what is your name?
From PIPE: A: name => A: your name is Mike
From PIPE: Q: name => Q: what is your name?
From PIPE: A: name => A: your name is Jonathan
From PIPE: Q: name => Q: what is your name?

The code works, but it is still fragile: if the output of child.py changes, the deadlock may reappear. Many of the issues in controlling an interactive process are solved by the pexpect module. See also the code example linked in this comment.
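
One way to make the parent less dependent on counting lines is to read until the next prompt appears. The following is a minimal sketch of that idea (Python 3, standard library only), not what pexpect does internally. PROMPT, read_until_prompt and run_session are hypothetical names, and CHILD_SRC is an inlined stand-in for child.py. It assumes the child flushes its output and that the prompt is a full line ending with a newline; a prompt without a trailing newline would leave readline() blocked.

```python
import subprocess
import sys

# Assumption: an inlined stand-in for child.py so the sketch is self-contained.
CHILD_SRC = r'''
import sys
while True:
    print('name => Q: what is your name?')
    sys.stdout.flush()
    name = sys.stdin.readline().strip()
    if name == 'Exit' or not name:
        break
    print('name => A: your name is ' + name)
    sys.stdout.flush()
'''

# Assumption: the exact line the child prints when it is ready for input.
PROMPT = 'name => Q: what is your name?'


def read_until_prompt(child):
    """Collect lines up to (not including) the next prompt, or until EOF."""
    lines = []
    while True:
        line = child.stdout.readline()
        if not line:  # EOF: the child closed its stdout
            return lines
        line = line.rstrip('\n')
        if line == PROMPT:
            return lines
        lines.append(line)


def run_session(commands):
    """Return a dict mapping each command to the lines printed in reply."""
    child = subprocess.Popen(
        [sys.executable, '-u', '-c', CHILD_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        universal_newlines=True, bufsize=1)
    read_until_prompt(child)  # consume the first prompt
    replies = {}
    for command in commands:
        child.stdin.write(command + '\n')
        child.stdin.flush()
        replies[command] = read_until_prompt(child)
    child.stdin.close()  # no more input
    child.stdout.close()
    child.wait()
    return replies
```

With this shape, run_session(['Luke', 'Mike', 'Exit']) collects whatever the child prints between prompts, so a reply that grows to several lines no longer desynchronizes the parent.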

I've changed child.py to work on both Python 2 and 3:

#!/usr/bin/env python
try:
    raw_input = raw_input
except NameError: # Python 3
    raw_input = input

while True:
    print('name => Q: what is your name?')
    name = raw_input()
    if name == 'Exit':
        break
    print('name => A: your name is ' + name)