I need to implement a filter in Python that picks out specific outputs from a Linux command-line dictionary tool. I need to:
- Get a set of words from a file
- Look up each word: 1) if the dictionary does not contain the word, skip it; 2) otherwise, if it is a verb, save the definition.
To test the code, I wrote two Python files:
# name.py
import sys

while True:
    print 'name => Q: what is your name?'
    sys.stdout.flush()
    name = raw_input()
    if name == 'Exit':
        break
    print 'name => A: your name is ' + name
    sys.stdout.flush()
# test.py
import subprocess

child = subprocess.Popen(r'python name.py',
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT,
                         shell=True)
commandlist = ['Luke\n', 'Mike\n', 'Jonathan\n', 'Exit\n']
for command in commandlist:
    child.stdin.write(command)
    child.stdin.flush()
    output = child.stdout.readline()
    print 'From PIPE: ' + output
while child.poll() is None:
    print 'NOT POLL: ' + child.stdout.readline()
child.wait()
The output is
From PIPE: name => Q: what is your name?
From PIPE: name => A: your name is Luke
From PIPE: name => Q: what is your name?
From PIPE: name => A: your name is Mike
# these lines need to start with "From PIPE" ...
NOT POLL: name => Q: what is your name?
NOT POLL: name => A: your name is Jonathan
NOT POLL: name => Q: what is your name?
NOT POLL:
The later output is read during the while loop rather than the for loop in test.py. What is the reason?
Because of this requirement, I need to get the whole output each time I send a new command, as in a dialog session. So Popen.communicate() is useless here, because it always waits for the subprocess to terminate. How can I implement this?
The basic reason subprocess insists you use .communicate() is that deadlock can occur otherwise. Suppose you're writing to the process's stdin, while the process is writing to its stdout. If the pipe buffers fill up, the writes will block until a read occurs. Then you're both waiting for each other and no progress can be made. There are several ways to deal with this:
- Use separate threads. Assign one to stdin and the other to stdout. That way, if one pipe blocks, you're still servicing the other.
- Use select to multiplex over the pipes. Only interact with pipes which are ready for you. You should also enable O_NONBLOCK on the pipes using fcntl, so that a read or write returns immediately instead of blocking if a pipe turns out not to be ready. Used correctly, this will prevent the pipes from ever blocking, so you can't deadlock. This doesn't work under Windows, because you can only do select on sockets there.
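The separate-threads option could look roughly like the sketch below. It is only an illustration under some assumptions: it targets Python 3, the child is a hypothetical stand-in (CHILD_CODE) that echoes each input line in upper case, and the names start_reader and run_dialog are made up for this example. A background thread keeps draining the child's stdout into a queue, so the parent can write to stdin without the stdout pipe filling up.

```python
import subprocess
import sys
import threading
from queue import Queue

# Hypothetical stand-in child: echoes each stdin line back in upper case.
CHILD_CODE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(line.strip().upper(), flush=True)\n"
)


def start_reader(stream):
    """Drain `stream` in a background thread; return (queue, thread)."""
    lines = Queue()

    def pump():
        for line in iter(stream.readline, ""):
            lines.put(line.rstrip("\n"))
        lines.put(None)  # sentinel: the child closed its stdout

    thread = threading.Thread(target=pump, daemon=True)
    thread.start()
    return lines, thread


def run_dialog(words, timeout=5.0):
    """Send each word to the child and collect one reply line per word."""
    child = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_CODE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        universal_newlines=True, bufsize=1,
    )
    lines, thread = start_reader(child.stdout)
    replies = []
    for word in words:
        child.stdin.write(word + "\n")
        child.stdin.flush()
        # Raises queue.Empty if the child does not answer within `timeout`.
        replies.append(lines.get(timeout=timeout))
    child.stdin.close()   # EOF makes the stand-in child exit
    thread.join(timeout)  # reader finishes once stdout reaches EOF
    child.stdout.close()
    child.wait()
    return replies


if __name__ == "__main__":
    print(run_dialog(["luke", "mike"]))
```

Because the reply is fetched with a timeout, a child that stays silent shows up as a queue.Empty exception rather than a hang, which is the main practical gain over calling readline() directly in the parent.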
In your specific case, the issue is that for every two lines that the child process prints, your parent process reads only one. If you pass more names, your processes will eventually deadlock once the OS pipe buffers have filled up, as @Kevin explained.
To fix it, just add a second child.stdout.readline() to read the question before writing the name to the child process.
For example, here's the parent.py script:
#!/usr/bin/env python
from __future__ import print_function
import sys
from subprocess import Popen, PIPE

child = Popen([sys.executable, '-u', 'child.py'],
              stdin=PIPE, stdout=PIPE,
              bufsize=1, universal_newlines=True)
commandlist = ['Luke', 'Mike', 'Jonathan', 'Exit']
for command in commandlist:
    print('From PIPE: Q:', child.stdout.readline().rstrip('\n'))
    print(command, file=child.stdin)
    #XXX you might need it to workaround bugs in `subprocess` on Python 3.3
    #### child.stdin.flush()
    if command != 'Exit':
        print('From PIPE: A:', child.stdout.readline().rstrip('\n'))
child.stdin.close() # no more input
assert not child.stdout.read() # should be empty
child.stdout.close()
child.wait()
Output
From PIPE: Q: name => Q: what is your name?
From PIPE: A: name => A: your name is Luke
From PIPE: Q: name => Q: what is your name?
From PIPE: A: name => A: your name is Mike
From PIPE: Q: name => Q: what is your name?
From PIPE: A: name => A: your name is Jonathan
From PIPE: Q: name => Q: what is your name?
The code works, but it is still fragile: if the output of child.py changes, the deadlock may reappear. Many of the issues in controlling an interactive process are solved by the pexpect module. See also the code example linked in this comment.
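One way to make the parent a little less dependent on the exact number of lines the child prints is to read until the prompt line appears, instead of calling readline() a fixed number of times. The following is a minimal Python 3 sketch of that idea, not a replacement for pexpect. The helper names read_until_prompt and ask_all are hypothetical, and CHILD_CODE is an inline copy of the child's logic so that the example is self-contained.

```python
import subprocess
import sys

PROMPT = "name => Q: what is your name?"

# Inline stand-in for child.py so the sketch runs on its own.
CHILD_CODE = r'''
while True:
    print('name => Q: what is your name?')
    name = input()
    if name == 'Exit':
        break
    print('name => A: your name is ' + name)
'''


def read_until_prompt(stream, prompt=PROMPT):
    """Collect lines up to (not including) the next prompt line, or EOF."""
    lines = []
    for line in iter(stream.readline, ""):
        line = line.rstrip("\n")
        if line == prompt:
            break
        lines.append(line)
    return lines


def ask_all(names):
    """Send each name and return the list of answer lines for each one."""
    child = subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD_CODE],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        universal_newlines=True, bufsize=1,
    )
    answers = []
    read_until_prompt(child.stdout)  # consume the first question
    for name in names:
        child.stdin.write(name + "\n")
        child.stdin.flush()
        # Everything printed before the next question is the answer.
        answers.append(read_until_prompt(child.stdout))
    child.stdin.write("Exit\n")
    child.stdin.close()   # closing flushes the pending "Exit" line
    child.stdout.read()   # drain anything left until EOF
    child.stdout.close()
    child.wait()
    return answers


if __name__ == "__main__":
    for answer in ask_all(["Luke", "Mike", "Jonathan"]):
        print("From PIPE:", answer)
```

This still blocks forever if the child never prints the prompt, so it only addresses the case where the number of answer lines varies; a timeout needs threads, select, or pexpect.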
I've changed child.py to work on both Python 2 and 3:
#!/usr/bin/env python
try:
    raw_input = raw_input
except NameError: # Python 3
    raw_input = input

while True:
    print('name => Q: what is your name?')
    name = raw_input()
    if name == 'Exit':
        break
    print('name => A: your name is ' + name)