Python Object Lifetime Characteristics

Posted 2019-04-11 17:03

Note: if you know of any (non-elaborate) library code that does what I want, please enlighten a C/C++ programmer; I'll accept that as an answer.

I have a global variable set to an instance of the following class. Its purpose is to let me set manual interruption points for quick and dirty printf-style debugging in a scrapy spider (I specifically need to break when certain criteria are met to tune a parser; there are some extremely rare input data anomalies). Adapted from this.

The OS is OS X 10.8.

import termios, fcntl, sys, os

class DebugWaitKeypress(object):
    def __init__(self):
        self.fd = sys.stdin.fileno()
        self.oldterm = termios.tcgetattr(self.fd)
        self.newattr = termios.tcgetattr(self.fd)
        self.newattr[3] = self.newattr[3] & ~termios.ICANON & ~termios.ECHO
        termios.tcsetattr(self.fd, termios.TCSANOW, self.newattr)

        self.oldflags = fcntl.fcntl(self.fd, fcntl.F_GETFL)
        fcntl.fcntl(self.fd, fcntl.F_SETFL, self.oldflags | os.O_NONBLOCK)

    def wait(self):
        sys.stdin.read(1)

    def __del__(self):
        print "called del"
        termios.tcsetattr(self.fd, termios.TCSAFLUSH, self.oldterm)
        fcntl.fcntl(self.fd, fcntl.F_SETFL, self.oldflags)

When I press Ctrl-C and the process is unwinding, I get the following exception:

Exception AttributeError: "'NoneType' object has no attribute 'tcsetattr'" in <bound method DebugWaitKeypress.__del__ of <hon.spiders.custom_debug.DebugWaitKeypress object at 0x108985e50>> ignored

I guess I'm missing something about the mechanics of object lifetimes. How do I remedy the situation? AFAIK any class instances should be destroyed before the imported code is, no? In reverse order of declaration/definition.

I would just ignore this if the terminal wasn't screwed up after the process exits :D

edit:

Delian's comment on seth's answer led me to understand that I need a C main()-like function, or any other function/generator that dominates as a root function, and to initialise the context there. This way, when the process is going down, the __exit__ method of the context manager will get called, and I won't have to reprogram the terminal stream on each wait() call.

Although the cost of the reprogramming is potentially immaterial, it is good to know how one would get these essential C/C++ semantics in Python.

edit 2:

Twisted (which scrapy uses) goes apeshit when messing with stdin. So I had to solve the problem with file IO.

2 Answers
爷的心禁止访问
#2 · 2019-04-11 17:26

Long story short: __del__ is useless for this purpose (and pretty much any other purpose; you should probably forget it exists). If you want deterministic cleanup, use a context manager.

AFAIK any class instances should be destroyed before the imported code is, no? In reverse order of declaration/definition.

That's C++. Forget it. Python does not care about this; in fact, it lacks most of the prerequisites for doing it. There is no such thing as a declaration in the entire Python language, and module-level variables are stored in what is essentially an unordered associative array. Variables do not store objects, they store references (which are not C++ references; they're basically pointers without pointer arithmetic). Objects live on the heap and don't know a thing about variables, bindings, statements, or the order of statements.

Moreover, when an object is garbage collected, and whether it is gc'd at all, is undefined. You get a mostly deterministic picture in CPython due to reference counting, but even there it falls down the second you have cycles. The consequence is that __del__ may be called at any point in time (including when half of the module is already torn down) or not at all. Multiple objects that define __del__ and reference each other are also trouble, although some GCs try hard to do the right thing.
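To make the difference between the reference-counting case and the cycle case concrete, here is a minimal sketch. It assumes CPython 3.4 or later, where the cycle collector is allowed to finalize objects that define __del__; the class Noisy and the function demo are made up for illustration, and other interpreters may behave differently.

```python
import gc


class Noisy:
    """Records its name in a shared log when __del__ runs."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.partner = None

    def __del__(self):
        self.log.append(self.name)


def demo():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cycle collector from running on its own
    try:
        a = Noisy("a", log)
        del a  # CPython: last reference gone, __del__ runs right here
        after_plain_del = list(log)

        b, c = Noisy("b", log), Noisy("c", log)
        b.partner, c.partner = c, b  # build a reference cycle
        del b, c  # the counts never reach zero, so nothing is finalized yet
        after_cycle_del = list(log)

        gc.collect()  # only now are b and c finalized, in no promised order
        after_collect = sorted(log)
    finally:
        if was_enabled:
            gc.enable()
    return after_plain_del, after_cycle_del, after_collect
```

The point of the sketch is only that the moment __del__ runs depends on the collector and not on where the name goes out of use in the source.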

Bottom line is, you can assume very little at the time __del__ runs, so you can't do very much. You get a last shot at disposing resources that should have been cleaned up via another method but weren't, and that's pretty much it. Rule of thumb: Never rely on it for anything mandatory.

Instead, create a context manager and use it via with. You get deterministic cleanup, without worrying about object lifetime. Because, truth be told, object lifetime and resource lifetime are two entirely different things, and they are only entangled in C++ because that's the best way to do resource management in that environment. In Python, RAII does not apply; instead we have this:

with <context manager> as var:
    # do something
# "context closed", whatever that means - for resources, usually cleanup
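As a small illustration of what "context closed" means in practice, the sketch below uses a toy class (Tracked, a made-up name) whose __exit__ runs both on normal completion and when the block is left by an exception such as the KeyboardInterrupt raised by Ctrl-C.

```python
class Tracked:
    """Toy resource that logs when it is acquired and released."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("acquire")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("release")
        return False  # do not swallow the exception


def run(log, fail):
    try:
        with Tracked(log):
            log.append("work")
            if fail:
                raise KeyboardInterrupt  # what Ctrl-C raises in the main thread
    except KeyboardInterrupt:
        log.append("interrupted")
    return log
```

In both cases "release" is logged before control leaves the with statement, which is the deterministic cleanup that __del__ cannot promise.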

By the way, you can define it far more conveniently via contextlib (quickly transliterated from your version, may contain errors or ugliness):

import termios, fcntl, sys, os
from contextlib import contextmanager


@contextmanager
def debug_wait_keypress():
    fd = sys.stdin.fileno()
    oldterm = termios.tcgetattr(fd)
    newattr = termios.tcgetattr(fd)
    newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
    termios.tcsetattr(fd, termios.TCSANOW, newattr)
    oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)
    try:
        yield
    finally:
        termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
        fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

Your wait method becomes a free function.
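A rough sketch of how that could look, with the context entered once in a root function. The names raw_keypress_mode, wait and main are made up for illustration. Unlike the code above, this version takes the file descriptor as a parameter and deliberately leaves it in blocking mode (no O_NONBLOCK), on the assumption that wait should actually block until a key arrives; it also sets VMIN/VTIME explicitly for that reason.

```python
import os
import sys
import termios
from contextlib import contextmanager


@contextmanager
def raw_keypress_mode(fd):
    """Turn off line buffering and echo on fd; always restore on exit."""
    old = termios.tcgetattr(fd)
    new = termios.tcgetattr(fd)
    new[3] &= ~(termios.ICANON | termios.ECHO)
    new[6][termios.VMIN] = 1   # read() returns after one byte
    new[6][termios.VTIME] = 0  # no read timeout
    termios.tcsetattr(fd, termios.TCSANOW, new)
    try:
        yield fd
    finally:
        termios.tcsetattr(fd, termios.TCSAFLUSH, old)


def wait(fd):
    """Block until one byte (one keypress) is available and return it."""
    return os.read(fd, 1)


def main():
    # Hypothetical root function: the terminal is reprogrammed once here,
    # and restored however the block is left, including via Ctrl-C.
    with raw_keypress_mode(sys.stdin.fileno()) as fd:
        for item in ("first", "second"):
            print("paused before", item, "item, press any key")
            wait(fd)
```

Calling main() from the program's entry point keeps the terminal settings scoped to the lifetime of that call, not to the lifetime of a global object.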

欢心
#3 · 2019-04-11 17:37

If __del__ is called, it happens sometime after the object's reference count is zero, and possibly not until the program ends, and not in any particular order. You also can't depend on anything external (especially globals) being available in __del__.

In your case, Python cleaned up the reference to the termios module before it called DebugWaitKeypress.__del__. That's why you're getting the 'NoneType' object has no attribute 'tcsetattr' message: termios is None by the time you try to use it.

I would guess you would be better off implementing a context manager and putting your __del__ code in __exit__.
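The mechanism can be imitated without shutting the interpreter down. In the sketch below, every name is hypothetical: helper stands in for an imported module such as termios, and simulate_teardown rebinds it to None by hand, the way the traceback suggests the interpreter did during shutdown. A method that looks the global up at call time fails, while one that captured the function earlier still works.

```python
import types

# Stand-in for an imported module such as termios (hypothetical).
helper = types.SimpleNamespace(restore=lambda log: log.append("restored"))


class LooksUpGlobal:
    def cleanup(self, log):
        helper.restore(log)  # 'helper' is looked up when this line runs


class KeepsReference:
    def __init__(self):
        self._restore = helper.restore  # function captured at construction

    def cleanup(self, log):
        self._restore(log)


def simulate_teardown():
    """Mimic interpreter shutdown rebinding a module global to None."""
    global helper
    late, early = LooksUpGlobal(), KeepsReference()
    saved, helper = helper, None
    log = []
    try:
        early.cleanup(log)
        try:
            late.cleanup(log)
        except AttributeError as exc:
            log.append(str(exc))
    finally:
        helper = saved
    return log
```

Capturing references like this only works around the symptom; it does not make the timing of __del__ any more predictable.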

Then you would be able to say something like:

with DebugWaitKeypress(...) as thing:
    do_something_with_it(thing)
# here, __exit__() is called to do cleanup

From the object.__del__ docs:

Due to the precarious circumstances under which __del__() methods are invoked, exceptions that occur during their execution are ignored, and a warning is printed to sys.stderr instead. Also, when __del__() is invoked in response to a module being deleted (e.g., when execution of the program is done), other globals referenced by the __del__() method may already have been deleted or in the process of being torn down (e.g. the import machinery shutting down). For this reason, __del__() methods should do the absolute minimum needed to maintain external invariants. Starting with version 1.5, Python guarantees that globals whose name begins with a single underscore are deleted from their module before other globals are deleted; if no other references to such globals exist, this may help in assuring that imported modules are still available at the time when the __del__() method is called.
