log syntax errors and uncaught exceptions for a py

2019-01-20 01:03发布

问题:

The Problem

I've been trying to write a program that logs uncaught exceptions and syntax errors for a subprocess. Easy, right? Just pipe stderr to the right place.

However, the subprocess is another python program- I'll call it test.py- that needs to run as if its output/errors are not being captured. That is, running the logger program needs to seem like the user has just run python test.py as normal.

Further complicating the issue is the problem that raw_input actually gets sent to stderr if readline is not used. Unfortunately, I can't just import readline, since I don't have control over the files that are being run using my error logger.

Notes:

  • I am fairly restricted on the machines that this code will run on. I can't install pexpect or edit the *customize.py files (since the program will be run by a lot of different users). I really feel like there should be a stdlib solution anyway though...
  • This only has to work on macs.
  • The motivation for this is that I'm part of a team researching the errors that new programmers get.

What I've Tried

I've tried the following methods, without success:

  • just using tee as in the question How do I write stderr to a file while using "tee" with a pipe? (failed to produce raw_input prompts); python implementations of tee that I found in several SO questions had similar issues
  • overwriting sys.excepthook (failed to make it work for a subprocess)
  • this question's top answer seemed promising, but it failed to display raw_input prompts correctly.
  • the logging module seems useful for actually writing to a log file, but doesn't seem to get at the crux of the issue
  • custom stderr readers
  • endless googling

回答1:

The tee-based answer that you've linked is not very suitable for your task. Though you could fix "raw_input() prompts" issue by using -u option to disable buffering:

errf = open('err.txt', 'wb') # any object with .write() method
rc = call([sys.executable, '-u', 'test.py'], stderr=errf, 
          bufsize=0, close_fds=True)
errf.close()

A more suitable solution might be based on pexpect or pty, example.

running the logger program needs to seem like the user has just run python test.py as normal.

#!/usr/bin/env python
import sys
import pexpect

with open('log', 'ab') as fout:
    p = pexpect.spawn("python test.py")
    p.logfile = fout
    p.interact()

You don't need to install pexpect it is pure Python you could put it along-side your code.

Here's a tee-based analog (test.py is run non-interactively):

#!/usr/bin/env python
import sys
from subprocess import Popen, PIPE, STDOUT
from threading  import Thread

def tee(infile, *files):
    """Print `infile` to `files` in a separate thread."""
    def fanout(infile, *files):
        flushable = [f for f in files if hasattr(f, 'flush')]
        for c in iter(lambda: infile.read(1), ''):
            for f in files:
                f.write(c)
            for f in flushable:
                f.flush()
        infile.close()
    t = Thread(target=fanout, args=(infile,)+files)
    t.daemon = True
    t.start()
    return t

def call(cmd_args, **kwargs):
    stdout, stderr = [kwargs.pop(s, None) for s in 'stdout', 'stderr']
    p = Popen(cmd_args,
              stdout=None if stdout is None else PIPE,
              stderr=None if stderr is None else (
                   STDOUT if stderr is STDOUT else PIPE),
              **kwargs)
    threads = []
    if stdout is not None: 
        threads.append(tee(p.stdout, stdout, sys.stdout))
    if stderr is not None and stderr is not STDOUT: 
        threads.append(tee(p.stderr, stderr, sys.stderr))
    for t in threads: t.join() # wait for IO completion
    return p.wait()

with open('log','ab') as file:
    rc = call([sys.executable, '-u', 'test.py'], stdout=file, stderr=STDOUT,
              bufsize=0, close_fds=True)

It is necessary to merge stdout/stderr due to it is unclear where raw_input(), getpass.getpass() might print their prompts.

In this case the threads are not necessary too:

#!/usr/bin/env python
import sys
from subprocess import Popen, PIPE, STDOUT

with open('log','ab') as file:
    p = Popen([sys.executable, '-u', 'test.py'],
              stdout=PIPE, stderr=STDOUT,
              close_fds=True,
              bufsize=0)
    for c in iter(lambda: p.stdout.read(1), ''):
        for f in [sys.stdout, file]:
            f.write(c)
            f.flush()
    p.stdout.close()
    rc = p.wait()

Note: the last example and tee-based solution don't capture getpass.getpass() prompt, but pexpect and pty-based solution do:

#!/usr/bin/env python
import os
import pty
import sys

with open('log', 'ab') as file:
    def read(fd):
        data = os.read(fd, 1024)
        file.write(data)
        file.flush()
        return data

    pty.spawn([sys.executable, "test.py"], read)

I don't know whether pty.spawn() works on macs.



回答2:

Based on @nneonneo 's advice in the question comments, I made this program that seems to get the job done. (Note that currently, the name of the logger file has to by "pylog" for the errors to be printed correctly to the end user.)

#!/usr/bin/python

'''
This module logs python errors.
'''

import socket, os, sys, traceback

def sendError(err):
    # log the error (in my actual implementation, this sends the error to a database)
    with open('log','w') as f:
        f.write(err)


def exceptHandler(etype, value, tb):
    """An additional wrapper around our custom exception handler, to prevent errors in
       this program from being seen by end users."""
    try:
        subProgExceptHandler(etype, value, tb)
    except:
        sys.stderr.write('Sorry, but there seems to have been an error in pylog itself. Please run your program using regular python.\n')

def subProgExceptHandler(etype, value, tb):
    """A custom exception handler that both prints error and traceback information in the standard
       Python format, as well as logs it."""
    import linecache

    errorVerbatim = ''

    # The following code mimics a traceback.print_exception(etype, value, tb) call.
    if tb:
        msg = "Traceback (most recent call last):\n"
        sys.stderr.write(msg)
        errorVerbatim += msg

        # The following code is a modified version of the trackeback.print_tb implementation from
        # cypthon 2.7.3
        while tb is not None:
            f = tb.tb_frame                                                      
            lineno = tb.tb_lineno                                                  
            co = f.f_code                                                        
            filename = co.co_filename                                              
            name = co.co_name
            # Filter out exceptions from pylog itself (eg. execfile).
            if not "pylog" in filename:
                msg = '  File "%s", line %d, in %s\n' % (filename, lineno, name)
                sys.stderr.write(msg)       
                errorVerbatim += msg
                linecache.checkcache(filename)                                         
                line = linecache.getline(filename, lineno, f.f_globals)                
                if line: 
                    msg = '    ' + line.strip() + '\n'
                    sys.stderr.write(msg)
                    errorVerbatim += msg
            tb = tb.tb_next                                           

    lines = traceback.format_exception_only(etype, value)
    for line in lines:
        sys.stderr.write(line)
        errorVerbatim += line

    # Send the error data to our database handler via sendError.
    sendError(errorVerbatim)

def main():
    """Executes the program specified by the user in its own sandbox, then sends
       the error to our database for logging and analysis."""
    # Get the user's (sub)program to run.
    try:
        subProgName = sys.argv[1]
        subProgArgs = sys.argv[3:]
    except:
        print 'USAGE: ./pylog FILENAME.py *ARGS'
        sys.exit()

    # Catch exceptions by overriding the system excepthook.
    sys.excepthook = exceptHandler
    # Sandbox user code exeuction to its own global namespace to prevent malicious code injection.
    execfile(subProgName, {'__builtins__': __builtins__, '__name__': '__main__', '__file__': subProgName, '__doc__': None, '__package__': None})

if __name__ == '__main__':
    main()