Merge and sync stdout and stderr?

Posted 2019-02-10 15:59

Question:

Say I'm running an exe from a Python script using:

subprocess.call(cmdArgs, stdout=outf, stderr=errf)

where outf and errf are file descriptors of text files.

Is there any way to generate, on top of that, a merged and synced text file of both stdout and stderr? Each line should be formatted with the time and its source (out/err).

Thanks.

Answer 1:

It is a bit tricky, since you need to poll the stdout and stderr file descriptors of the subprocess while it's running, to get accurate timestamps. You also need to chop up the output into a list of lines so the final results can be merged and sorted easily. You could easily merge the two streams as they're read, but that wasn't part of the question.

I wrote it quickly but it could be made cleaner and more compact:

import datetime
import os
import select
import subprocess

class Stream(object):

    def __init__(self, name, impl):
        self._name = name
        self._impl = impl
        self._buf = b''   # bytes read so far that do not yet end in a newline
        self._rows = []   # (timestamp, formatted line) tuples

    def fileno(self):
        "Pass-through for file descriptor."
        return self._impl.fileno()

    def read(self, drain=0):
        "Read from the file descriptor. If 'drain' set, read until EOF."
        while self._read() is not None:
            if not drain:
                break

    def _read(self):
        "Read one chunk. Return None at EOF, else the list of rows added."
        buf = os.read(self.fileno(), 4096)
        if not buf:
            # EOF: flush a trailing line that had no final newline
            lines = [self._buf] if self._buf else []
            self._buf = b''
            self._add(lines)
            return None
        # prepend any data previously read, then split off complete lines
        lines = (self._buf + buf).split(b'\n')
        self._buf = lines.pop()
        return self._add(lines)

    def _add(self, lines):
        "Timestamp and format complete lines, store them, and return them."
        now = datetime.datetime.now().isoformat()
        rows = [(now, '%s %s: %s' % (self._name, now,
                                     line.decode('utf-8', 'replace')))
                for line in lines]
        self._rows += rows
        return rows

def run(cmd, timeout=0.1):
    """
    Run a command, read stdout and stderr, prefix with timestamp, and
    return a dict containing stdout, stderr and merged.
    """
    PIPE = subprocess.PIPE
    proc = subprocess.Popen(cmd, stdout=PIPE, stderr=PIPE)
    streams = [
        Stream('stdout', proc.stdout),
        Stream('stderr', proc.stderr)
        ]
    def _process(drain=0):
        res = select.select(streams, [], [], timeout)
        for stream in res[0]:
            stream.read(drain)

    while proc.returncode is None:
        proc.poll()
        _process()
    _process(drain=1)

    # collect results, merge and return
    result = {}
    temp = []
    for stream in streams:
        rows = stream._rows
        temp += rows
        result[stream._name] = [r[1] for r in rows]
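    # sort on the timestamp only; the sort is stable, so lines that share a
    # timestamp keep the order in which they were read
    temp.sort(key=lambda r: r[0])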
    result['merged'] = [r[1] for r in temp]
    return result

res = run(['ls', '-l', '.', 'xyzabc'])
for key in ('stdout', 'stderr', 'merged'):
    print('')
    print('\n'.join(res[key]))
    print('-' * 40)

Example output:

stdout 2011-03-03T19:30:44.838145: .:
stdout 2011-03-03T19:30:44.838145: total 0
stdout 2011-03-03T19:30:44.838338: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 bar
stdout 2011-03-03T19:30:44.838518: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 foo
----------------------------------------

stderr 2011-03-03T19:30:44.837189: ls: cannot access xyzabc: No such file or directory
----------------------------------------

stderr 2011-03-03T19:30:44.837189: ls: cannot access xyzabc: No such file or directory
stdout 2011-03-03T19:30:44.838145: .:
stdout 2011-03-03T19:30:44.838145: total 0
stdout 2011-03-03T19:30:44.838338: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 bar
stdout 2011-03-03T19:30:44.838518: -rw-r--r-- 1 pat pat 0 2011-03-03 19:30 foo
----------------------------------------
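
The simpler variant mentioned at the top of this answer, merging the two streams as they are read, might look roughly like the sketch below. It is not part of the code above: it swaps select for one reader thread per pipe, since select on pipes is not supported on Windows, and it writes the merged lines straight to a file. The names _pump, run_merged and merged.log are made up for illustration, and UTF-8 output from the child is assumed. The timestamps record when the parent read each line, so buffering inside the child can still shift the apparent order.

```python
import datetime
import queue
import subprocess
import sys
import threading

def _pump(name, pipe, out):
    """Timestamp each line read from one pipe and put it on the shared queue."""
    for raw in iter(pipe.readline, b''):
        now = datetime.datetime.now().isoformat()
        line = raw.decode('utf-8', 'replace').rstrip('\r\n')
        out.put('%s %s: %s' % (name, now, line))
    pipe.close()
    out.put(None)  # sentinel: this pipe reached EOF

def run_merged(cmd, merged_path):
    """Run cmd, write merged lines to merged_path, return the exit code."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out = queue.Queue()
    threads = [
        threading.Thread(target=_pump, args=('stdout', proc.stdout, out)),
        threading.Thread(target=_pump, args=('stderr', proc.stderr, out)),
    ]
    for t in threads:
        t.start()
    open_pipes = len(threads)
    with open(merged_path, 'w', encoding='utf-8') as merged:
        # lines are written in the order the reader threads queued them
        while open_pipes:
            item = out.get()
            if item is None:
                open_pipes -= 1
            else:
                merged.write(item + '\n')
    for t in threads:
        t.join()
    return proc.wait()

if __name__ == '__main__':
    # Demo child (hypothetical): a one-liner that writes to both streams.
    code = "import sys; print('hello'); sys.stderr.write('oops\\n')"
    run_merged([sys.executable, '-c', code], 'merged.log')
```

Because every line goes through a single queue, no sorting step is needed afterwards; the separate per-stream files can still be produced by having _pump write each line to its own file as well.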


Answer 2:

You can merge them by passing subprocess.STDOUT as the stderr argument to subprocess.Popen, but that alone will not format the lines with time and source.
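
A minimal sketch of that approach, with a timestamp added on the reading side, could look like this. The function name run_combined is made up for illustration and UTF-8 output is assumed. Once stderr is folded into stdout the two can no longer be told apart, so only the time can be recorded, not the source.

```python
import datetime
import subprocess

def run_combined(cmd):
    """Run cmd with stderr folded into stdout; return timestamped lines."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    lines = []
    # read until EOF; both streams arrive on the single stdout pipe
    for raw in iter(proc.stdout.readline, b''):
        now = datetime.datetime.now().isoformat()
        text = raw.decode('utf-8', 'replace').rstrip('\r\n')
        lines.append('%s: %s' % (now, text))
    proc.stdout.close()
    proc.wait()
    return lines
```

If the source label matters, the two pipes have to be kept separate and read individually, as in the first answer.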