I'm using a Python script as a driver for a hydrodynamics code. When it comes time to run the simulation, I use subprocess.Popen to run the code and collect the output from stdout and stderr into a subprocess.PIPE; then I can print (and save to a log file) the output information, and check for any errors. The problem is, I have no idea how the code is progressing. If I run it directly from the command line, it gives me output about what iteration it's at, what time, what the next time-step is, etc.
Is there a way to both store the output (for logging and error checking), and also produce a live-streaming output?
The relevant section of my code:
ret_val = subprocess.Popen( run_command, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, shell=True )
output, errors = ret_val.communicate()
log_file.write(output)
print output
if( ret_val.returncode ):
    print "RUN failed\n\n%s\n\n" % (errors)
    success = False
if( errors ): log_file.write("\n\n%s\n\n" % errors)
Originally I was piping the run_command through tee so that a copy went directly to the log file, and the stream still output directly to the terminal -- but that way I can't store any errors (to my knowledge).
Edit:
Temporary solution:
ret_val = subprocess.Popen( run_command, stdout=log_file, stderr=subprocess.PIPE, shell=True )
while ret_val.poll() is None:   # poll() returns None while the run is still going
    log_file.flush()
    time.sleep(0.5)

then, in another terminal, run tail -f log.txt (where log_file is opened on 'log.txt').
A good but "heavyweight" solution is to use Twisted - see the bottom.
If you're willing to live with only stdout, something along those lines should work:
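A minimal sketch of that idea (the demo command stands in for the real simulation binary):

```python
import subprocess
import sys

log_lines = []  # stands in for writing to the real log file
proc = subprocess.Popen(
    [sys.executable, "-c", 'print("step 1"); print("step 2")'],
    stdout=subprocess.PIPE,
)
# readline() returns one line as soon as the child writes it, and b"" at EOF,
# so this loop echoes output live while keeping a copy for the log.
for line in iter(proc.stdout.readline, b""):
    text = line.decode()
    sys.stdout.write(text)   # live output to the terminal
    log_lines.append(text)   # stored copy for logging/error checking
proc.wait()
```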
(If you use read() it tries to read the entire "file", which isn't useful; what we really could use here is something that reads all the data that's in the pipe right now.)
One might also try to approach this with threading, e.g.:
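For example (a sketch: the pump helper and the demo command are illustrative, not from the original answer):

```python
import subprocess
import sys
import threading

out_lines = []

def pump(pipe):
    # Background thread: echo each line live and keep a copy for the log.
    for line in iter(pipe.readline, b""):
        text = line.decode()
        sys.stdout.write(text)
        out_lines.append(text)
    pipe.close()

proc = subprocess.Popen(
    [sys.executable, "-c", 'print("iteration 1")'],
    stdout=subprocess.PIPE,
)
reader = threading.Thread(target=pump, args=(proc.stdout,))
reader.start()
# ...the main thread is free to do other work here...
reader.join()
proc.wait()
```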
Now we could potentially add stderr as well by having two threads.
Note, however, that the subprocess docs discourage using these files directly and recommend communicate() (mostly out of concern for deadlocks, which I think isn't an issue above), and the solutions are a little clunky, so it really seems like the subprocess module isn't quite up to the job (also see: http://www.python.org/dev/peps/pep-3145/ ) and we need to look at something else.

A more involved solution is to use Twisted as shown here: https://twistedmatrix.com/documents/11.1.0/core/howto/process.html
The way you do this with Twisted is to create your process using reactor.spawnProcess() and providing a ProcessProtocol that then processes output asynchronously. The Twisted sample Python code is here: https://twistedmatrix.com/documents/11.1.0/core/howto/listings/process/process.py

I think that the
subprocess.communicate method is a bit misleading: it actually fills the stdout and stderr that you specify in the subprocess.Popen.

Yet, reading from the subprocess.PIPE that you can provide to subprocess.Popen's stdout and stderr parameters will eventually fill up OS pipe buffers and deadlock your app (especially if you have multiple processes/threads that must use subprocess).

My proposed solution is to provide stdout and stderr with files, and read the files' content instead of reading from the deadlocking PIPE. These files can be tempfile.NamedTemporaryFile(), which can also be accessed for reading while they're being written into by subprocess.communicate.

Below is a sample usage, and the source code which is ready to be used, with as many comments as I could provide to explain what it does:
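The original code was not preserved here; a sketch of the idea (the helper name run_with_live_output and the demo command are mine, not from the answer):

```python
import subprocess
import sys
import tempfile
import time

def run_with_live_output(cmd):
    """Run cmd with stdout/stderr sent to temp files; stream stdout live.

    Returns (returncode, stdout_text, stderr_text).
    """
    with tempfile.NamedTemporaryFile() as out_tmp, \
         tempfile.NamedTemporaryFile() as err_tmp:
        proc = subprocess.Popen(cmd, stdout=out_tmp, stderr=err_tmp)
        captured = []
        # A second handle on the same file lets us read what the child has
        # written so far without disturbing the child's write offset.
        with open(out_tmp.name, "rb") as reader:
            while proc.poll() is None:
                chunk = reader.read()
                if chunk:
                    sys.stdout.write(chunk.decode())  # live echo
                    captured.append(chunk)
                else:
                    time.sleep(0.1)
            tail = reader.read()   # drain whatever arrived after the last poll
            sys.stdout.write(tail.decode())
            captured.append(tail)
        err_tmp.seek(0)            # rewind before reading captured stderr
        return proc.returncode, b"".join(captured).decode(), err_tmp.read().decode()

rc, out, err = run_with_live_output(
    [sys.executable, "-c", 'print("t = 0.1, dt = 0.01")'])
```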
If you're using Python 2, please make sure to first install the latest version of the subprocess32 package from PyPI.
Solution 1: Log stdout AND stderr concurrently, in real time

A simple solution which logs both stdout AND stderr concurrently, in real time.

Solution 2: Create an iterator that returns stdout and stderr line by line, concurrently, in real time

A function read_popen_pipes() that allows you to iterate over both pipes (stdout/stderr), concurrently, in real time:
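A sketch of such a function along the lines described (the worker helper and the demo command are illustrative):

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from queue import Empty, Queue

def _enqueue_lines(pipe, queue):
    # Worker thread: move lines from a pipe into a queue until EOF.
    for line in iter(pipe.readline, ""):
        queue.put(line)
    pipe.close()

def read_popen_pipes(p):
    """Yield (out_line, err_line) pairs in real time; either may be ''."""
    with ThreadPoolExecutor(2) as pool:
        q_out, q_err = Queue(), Queue()
        f_out = pool.submit(_enqueue_lines, p.stdout, q_out)
        f_err = pool.submit(_enqueue_lines, p.stderr, q_err)
        while True:
            # Stop only after both pipes hit EOF and both queues are
            # drained, so no trailing output is lost.
            if f_out.done() and f_err.done() and q_out.empty() and q_err.empty():
                break
            out_line = err_line = ""
            try:
                out_line = q_out.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_err.get_nowait()
            except Empty:
                pass
            yield out_line, err_line

p = subprocess.Popen(
    [sys.executable, "-c", 'import sys; print("out"); print("err", file=sys.stderr)'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
out_log, err_log = [], []
for out_line, err_line in read_popen_pipes(p):
    if out_line:
        out_log.append(out_line)
    if err_line:
        err_log.append(err_line)
p.wait()
```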
Here is a class which I'm using in one of my projects. It redirects output of a subprocess to the log. At first I tried simply overwriting the write method, but that doesn't work, as the subprocess will never call it (redirection happens at the file-descriptor level). So I'm using my own pipe, similar to how it's done in the subprocess module. This has the advantage of encapsulating all logging/printing logic in the adapter, and you can simply pass instances of the logger to Popen:

subprocess.Popen("/path/to/binary", stderr = LogAdapter("foo"))
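The class itself was not preserved here; a reconstruction along the lines described (a private pipe plus a reader thread feeding the logging module) might look like:

```python
import logging
import os
import subprocess
import sys
import threading

class LogAdapter(threading.Thread):
    """Reconstruction sketch: a file-like object Popen can redirect into.

    fileno() hands Popen the write end of a private pipe; this background
    thread reads the other end and forwards each line to a logger.
    """

    def __init__(self, logname, level=logging.INFO):
        super().__init__(daemon=True)
        self.log = logging.getLogger(logname)
        self.level = level
        self._read_fd, self._write_fd = os.pipe()
        self.start()

    def fileno(self):
        # Popen duplicates this descriptor for the child's stream.
        return self._write_fd

    def run(self):
        # Read until EOF, i.e. until the write end is fully closed.
        with os.fdopen(self._read_fd) as reader:
            for line in reader:
                self.log.log(self.level, line.rstrip("\n"))

    def finished(self):
        # Close our copy of the write end so the reader thread sees EOF.
        os.close(self._write_fd)
        self.join()

logging.basicConfig(level=logging.INFO)
adapter = LogAdapter("foo")
proc = subprocess.Popen(
    [sys.executable, "-c", 'print("hello from child")'], stdout=adapter)
proc.wait()
adapter.finished()
```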
If you don't need logging but simply want to use print(), you can obviously remove large portions of the code and keep the class shorter. You could also expand it with an __enter__ and __exit__ method and call finished in __exit__, so that you could easily use it as a context manager.

You have two ways of doing this: either by creating an iterator from the read or readline functions and doing:
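A sketch of the iterator approach (the demo command is mine):

```python
import subprocess
import sys

with open("test.log", "wb") as log_file:
    process = subprocess.Popen(
        [sys.executable, "-c", 'print("line 1"); print("line 2")'],
        stdout=subprocess.PIPE,
    )
    # iter() turns readline into an iterator that stops at EOF (b"")
    for line in iter(process.stdout.readline, b""):
        sys.stdout.write(line.decode())  # live output
        log_file.write(line)             # stored copy
    process.wait()
```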
Or you can create a reader and a writer file. Pass the writer to the Popen and read from the reader:
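A sketch of the reader/writer variant (the demo command is mine):

```python
import subprocess
import sys
import time

writer = open("test.log", "wb")   # handed to Popen for the child's stdout
reader = open("test.log", "rb")   # independent handle for our own reads
process = subprocess.Popen(
    [sys.executable, "-c", 'print("tick"); print("tock")'],
    stdout=writer,
)
while process.poll() is None:
    # read() on the reader never blocks: it returns b"" if nothing new yet
    sys.stdout.write(reader.read().decode())
    time.sleep(0.1)
sys.stdout.write(reader.read().decode())  # drain whatever remains
writer.close()
reader.close()
```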
This way you will have the data written in test.log as well as on the standard output.

The only advantage of the file approach is that your code doesn't block. So you can do whatever you want in the meantime and read whenever you want from the reader in a non-blocking way. When you use PIPE, the read and readline functions will block until either one character is written to the pipe or a line is written to the pipe, respectively.

All of the above solutions I tried failed either to separate stderr and stdout output (multiple pipes), or blocked forever when the OS pipe buffer was full, which happens when the command you are running outputs too fast (there is a warning about this in the Python subprocess docs for poll()). The only reliable way I found was through select, but this is a POSIX-only solution:
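A sketch of the select-based loop (POSIX only; the demo command is mine):

```python
import os
import select
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable, "-c", 'import sys; print("out"); print("err", file=sys.stderr)'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
out_buf = b""
err_buf = b""
streams = {proc.stdout.fileno(): "out", proc.stderr.fileno(): "err"}
while streams:
    # Block until at least one pipe has data: no busy-waiting, no full-buffer
    # deadlock, and stdout/stderr stay separated.
    readable, _, _ = select.select(list(streams), [], [])
    for fd in readable:
        chunk = os.read(fd, 4096)
        if not chunk:                 # EOF on this pipe
            del streams[fd]
        elif streams[fd] == "out":
            out_buf += chunk
            sys.stdout.write(chunk.decode())  # live echo
        else:
            err_buf += chunk
            sys.stderr.write(chunk.decode())
proc.wait()
```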