Say I'm running an exe from a Python script using:
subprocess.call(cmdArgs, stdout=outf, stderr=errf)
where outf and errf are file descriptors of text files.
Is there any way I can also generate, on top of that, a merged and synced text file of both stdout and stderr?
It should be formatted with time and source (out/err).
thanks
It is a bit tricky, since you need to poll the stdout and stderr file descriptors of the subprocess while it's running to get accurate timestamps. You also need to chop the output up into lines so the final results can be merged and sorted easily. You could easily merge the two streams as they're read, but that wasn't part of the question.
I wrote this quickly, but it could be made cleaner and more compact:
import datetime
import os
import select
import subprocess

class Stream(object):
    "Wraps one pipe of the subprocess and collects timestamped lines from it."
    def __init__(self, name, impl):
        self._name = name
        self._impl = impl
        self._buf = ''
        self._rows = []

    def fileno(self):
        "Pass-through for the file descriptor, so select() can poll this object."
        return self._impl.fileno()

    def read(self, drain=0):
        "Read from the file descriptor. If 'drain' is set, read until EOF."
        while self._read() is not None:
            if not drain:
                break

    def _read(self):
        "Read one chunk from the file descriptor; return None at EOF."
        fd = self.fileno()
        buf = os.read(fd, 4096)
        if not buf:
            return None
        if '\n' not in buf:
            self._buf += buf
            return []
        # prepend any data previously read, then split into lines and format
        buf = self._buf + buf
        tmp, rest = buf.rsplit('\n', 1)
        self._buf = rest
        now = datetime.datetime.now().isoformat()
        rows = tmp.split('\n')
        self._rows += [(now, '%s %s: %s' % (self._name, now, r)) for r in rows]
        return rows
def run(cmd, timeout=0.1):
    """
    Run a command, read stdout and stderr, prefix each line with a timestamp
    and its source stream, and return a dict containing stdout, stderr and merged.
    """
    PIPE = subprocess.PIPE
    proc = subprocess.Popen(cmd, stdout=PIPE, stderr=PIPE)
    streams = [
        Stream('stdout', proc.stdout),
        Stream('stderr', proc.stderr),
    ]

    def _process(drain=0):
        # wait up to 'timeout' for either pipe to become readable
        res = select.select(streams, [], [], timeout)
        for stream in res[0]:
            stream.read(drain)

    # poll both pipes while the process is running, then drain whatever is left
    while proc.returncode is None:
        proc.poll()
        _process()
    _process(drain=1)

    # collect results, merge and return
    result = {}
    temp = []
    for stream in streams:
        rows = stream._rows
        temp += rows
        result[stream._name] = [r[1] for r in rows]
    temp.sort()
    result['merged'] = [r[1] for r in temp]
    return result
res = run(['ls', '-l', '.', 'xyzabc'])
for key in ('stdout', 'stderr', 'merged'):
    print
    print '\n'.join(res[key])
    print '-' * 40
Example output:
stdout 2011-03-03T19:30:44.838145: .:
stdout 2011-03-03T19:30:44.838145: total 0
stdout 2011-03-03T19:30:44.838338: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 bar
stdout 2011-03-03T19:30:44.838518: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 foo
----------------------------------------
stderr 2011-03-03T19:30:44.837189: ls: cannot access xyzabc: No such file or directory
----------------------------------------
stderr 2011-03-03T19:30:44.837189: ls: cannot access xyzabc: No such file or directory
stdout 2011-03-03T19:30:44.838145: .:
stdout 2011-03-03T19:30:44.838145: total 0
stdout 2011-03-03T19:30:44.838338: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 bar
stdout 2011-03-03T19:30:44.838518: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 foo
----------------------------------------
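To get the merged text file the question asks for, you could then write res['merged'] out yourself; a minimal sketch (the filename is just an example):
res = run(['ls', '-l', '.', 'xyzabc'])
# write the timestamped, source-tagged lines to a single merged log file
with open('merged.log', 'w') as f:
    f.write('\n'.join(res['merged']) + '\n')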
You can merge them by passing subprocess.STDOUT as the stderr argument to subprocess.Popen, but that alone won't format them with time and source.
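For example, a minimal sketch of that approach: both streams arrive interleaved on one pipe, but without the per-line timestamps and out/err tags from the answer above.
import subprocess

# redirect stderr into stdout so both streams share a single pipe
proc = subprocess.Popen(['ls', '-l', '.', 'xyzabc'],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT)
merged, _ = proc.communicate()
print merged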