Python destructor based on try/finally + yield?

Posted 2020-06-25 04:01

Question:

I've been testing a dirty hack inspired by the contextlib documentation: http://docs.python.org/2/library/contextlib.html. The main idea is to bring the try/finally pattern to the class level and get a simple, reliable class destructor.

class Foo():
  def __init__(self):
    self.__res_mgr__ = self.__acquire_resources__()
    self.__res_mgr__.next()

  def __acquire_resources__(self):
    try:
      # Acquire some resources here
      print "Initialize"
      self.f = 1
      yield
    finally:
      # Release the resources here
      print "Releasing Resources"
      self.f = 0

f = Foo()
print "testing resources"
print f.f

But it always gives me:

Initialize
testing resources
1

and never prints "Releasing Resources". I'm basing my hope on this passage:

As of Python version 2.5, the yield statement is now allowed in the try clause of a try ... finally construct. If the generator is not resumed before it is finalized (by reaching a zero reference count or by being garbage collected), the generator-iterator’s close() method will be called, allowing any pending finally clauses to execute. Source link

But it seems that when the generator is garbage collected together with the instance that owns it, its reference count never drops to zero, so the generator's close() is never called and the finally clause never runs. As for the second part of the quote

"or by being garbage collected"

I just don't know why it's not true. Any chance to make this utopia work? :)

BTW this works at module level:

def f():
  try:
    print "ack"
    yield
  finally:
    print "release"

a = f()
a.next()
print "testing"

Output will be as I expect:

ack
testing
release

NOTE: In my task I can't use a with statement, because I release the resource inside the end_callback of a thread (which runs outside any with block). So I wanted a reliable destructor for the cases where the callback isn't called for some reason.

Answer 1:

The problem you are having is caused by a reference cycle and an implicit __del__ defined on your generator. (It's so implicit that CPython doesn't show __del__ when you introspect: only the C-level tp_del exists, and no Python-visible __del__ is created.) Basically, when a generator has a yield inside:

  • A try block, or equivalently
  • A with block

it has an implicit __del__-like implementation. On Python 3.3 and earlier, if a reference cycle contains an object whose class implements __del__ (technically, has tp_del in CPython), unless the cycle is manually broken, the cyclic garbage collector cannot clean it up, and just sticks it in gc.garbage (import gc to gain access), because it doesn't know which objects (if any) must be collected first to clean up "nicely".

Because your class's __acquire_resources__(self) contains a reference to the instance's self, you form a reference cycle:

self -> self.__res_mgr__ (generator object) -> generator frame (referencing locals which includes) -> self

Because of this reference cycle, and the fact that the generator has a try/finally in it (creating tp_del equivalent to __del__), the cycle is uncollectable, and your finally block never gets executed unless you manually advance self.__res_mgr__ (which defeats the whole purpose).
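The cycle described above can be inspected directly. The following is a minimal sketch in Python 3 syntax (so next(gen) replaces gen.next() and the prints are dropped); it reuses the names from the question and assumes CPython, where generators expose their frame as gi_frame. It also shows that explicitly closing the generator runs the finally clause.

```python
class Foo:
    def __init__(self):
        self.__res_mgr__ = self.__acquire_resources__()
        next(self.__res_mgr__)  # run the generator up to the yield

    def __acquire_resources__(self):
        try:
            self.f = 1  # acquire
            yield
        finally:
            self.f = 0  # release


foo = Foo()

# The suspended generator frame keeps `self` alive as a local variable...
frame_holds_self = foo.__res_mgr__.gi_frame.f_locals["self"] is foo
# ...and the instance keeps the generator alive through its attribute dict.
instance_holds_generator = "__res_mgr__" in vars(foo)

value_while_open = foo.f

# Closing the generator by hand raises GeneratorExit at the yield,
# so the finally clause runs without waiting for any garbage collection.
foo.__res_mgr__.close()
value_after_close = foo.f

print(frame_holds_self, instance_holds_generator, value_while_open, value_after_close)
```

Both checks come out true, which is the self -> generator -> frame -> self loop: neither object's reference count can reach zero on its own.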

Your experiment happens to display this problem automatically because the reference cycle is implicit/automatic, but any accidental reference cycle where an object in the cycle has a class with __del__ will trigger the same problem, so even if you just did:

class Foo():
    def __init__(self):
        # Acquire some resources here
        print "Initialize"
        self.f = 1

    def __del__(self):
        # Release the resources here
        print "Releasing Resources"
        self.f = 0

if the "resources" involved could conceivably lead to a reference cycle with an instance of Foo, you'd have the same problem.
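As a rough illustration of that point, here is a sketch in Python 3 syntax where the instance ends up in a cycle with itself. The attribute me is a hypothetical stand-in for a resource that points back at the instance, and the events list replaces the prints so the order can be checked. It assumes CPython 3.4 or later; on the older versions discussed here, the gc.collect() call would not run __del__ and the object would end up in gc.garbage instead.

```python
import gc

events = []


class Foo:
    def __init__(self):
        events.append("Initialize")
        # Hypothetical stand-in for a resource that refers back to the instance.
        self.me = self

    def __del__(self):
        events.append("Releasing Resources")


gc.disable()  # keep the automatic cycle collector out of the demo
foo = Foo()
del foo  # the cycle keeps the reference count above zero

released_by_refcount = "Releasing Resources" in events

gc.collect()  # CPython 3.4+: the cycle collector finalizes the object
released_by_gc = "Releasing Resources" in events
gc.enable()

print(released_by_refcount, released_by_gc)
```

Dropping the last name does nothing by itself here; cleanup happens only once the cyclic collector runs.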

The solution here is one or both of:

  1. Make your class a context manager, so users provide the information necessary for deterministic finalization (by using with blocks), and also provide an explicit cleanup method (e.g. close) for when with blocks aren't feasible (for example, when the object is part of another object's state that is cleaned up through its own resource management). This is also the only way to get deterministic cleanup on most non-CPython interpreters, which have never used reference counting semantics (so all finalizers are called non-deterministically, if at all).
  2. Move to Python 3.4 or higher, where PEP 442 resolves the issue with uncollectable cyclic garbage (it's technically still possible to produce such cycles on CPython, but only via third party extensions that continue to use tp_del instead of updating to use the tp_finalize slot that allows cyclic garbage to be cleaned properly). It's still non-deterministic cleanup (if a reference cycle exists, you're waiting on the cyclic gc to run, sometime), but it's possible, where pre-3.4, cyclic garbage of this sort could not be cleaned up at all.
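Option 1 might look roughly like the sketch below, written in Python 3 syntax. The close method, the closed flag, and the idea of calling close from a callback are illustrative assumptions, not part of the original code; the "resource" is still just the attribute f from the question.

```python
class Foo:
    def __init__(self):
        self.f = 1  # acquire some resources here
        self.closed = False

    def close(self):
        # Idempotent, so a with block and a callback can both call it safely.
        if self.closed:
            return
        self.closed = True
        self.f = 0  # release the resources here

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions raised inside the block


# Deterministic cleanup where a with block is feasible.
with Foo() as foo:
    value_inside = foo.f
value_after_with = foo.f

# Explicit cleanup where it is not, e.g. from a thread's end callback.
other = Foo()
other.close()
other.close()  # a second call is harmless

print(value_inside, value_after_with, other.f)
```

Because close is safe to call more than once, a __del__ that simply calls close could be layered on top as a last-resort fallback, with the caveats about cycles and non-deterministic timing described above.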