Will a Python generator be garbage collected if it

Posted 2019-03-23 13:32

Question:

When a generator is no longer used, it should be garbage collected, right? I tried the following code, but I am not sure which part I got wrong.

import weakref
import gc

def countdown(n):
    while n:
        yield n
        n-=1

cd = countdown(10)
cdw = weakref.ref(cd)()
print cd.next()
gc.collect()
print cd.next()
gc.collect()
print cdw.next()

On the second-to-last line, I call the garbage collector, and since cd is not used after that point, gc should free cd, right? But when I call cdw.next(), it still prints 8. I tried a few more cdw.next() calls, and they printed all the remaining values until StopIteration.

I tried this because I wanted to understand how generators and coroutines work. On slide 28 of David Beazley's PyCon presentation "A Curious Course on Coroutines and Concurrency", he says that a coroutine might run indefinitely and that we should use .close() to shut it down. He then says that the garbage collector will also call .close(). As I understand it, if we have already called .close() ourselves, gc will call .close() again. Will gc get a warning that it can't call .close() on an already closed coroutine?

Thanks for any input.

Answer 1:

Due to the dynamic nature of Python, the reference to cd isn't released until you reach the end of the current routine, because (at least) the CPython implementation of Python doesn't "read ahead". (If you don't know which Python implementation you're using, it's almost certainly CPython.) There are a number of subtleties that make it virtually impossible, in the general case, for the interpreter to determine that an object can be freed while it still exists in the current namespace (e.g. you can still reach it through a call to locals()).

In some less general cases, other Python implementations may be able to free an object before the end of the current stack frame, but CPython doesn't bother.
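The locals() point can be illustrated with a small sketch. It is written for Python 3 (where next(gen) replaces gen.next()), and the function name lookup_by_name is made up for this example. The generator is never mentioned by name again after the first next() call, yet it can still be reached through a name computed at run time, which is why the interpreter cannot safely free it early.

```python
def countdown(n):
    while n:
        yield n
        n -= 1

def lookup_by_name():
    # Hypothetical helper, only for illustration.
    gen = countdown(3)
    first = next(gen)          # 3

    # From here on the source never spells `gen` again, but the object
    # is still reachable through the function's namespace.
    name = "g" + "en"          # name built at run time
    same = locals()[name]
    return first, next(same)   # the generator resumes where it stopped

result = lookup_by_name()
print(result)  # (3, 2)
```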

Try the following variant of your code instead. It demonstrates that the generator is cleaned up in CPython once the function that holds it returns (the final print shows None):

import weakref
def countdown(n):
    while n:
        yield n
        n-=1

def func():
    a = countdown(10)
    b = weakref.ref(a)
    print next(a)
    print next(a)
    return b

c = func()
print c()

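Objects (including generators) are freed when their reference count reaches 0 (in CPython; other implementations may work differently). In CPython, a reference count is decremented whenever a reference goes away: on a del statement, when a name is rebound to another object, when a container releases the object, or when a name goes out of scope because its namespace is discarded (for example when a function returns). gc.collect() only looks for reference cycles; it does not free an object that a live name still refers to.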
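As a rough illustration of the reference-counting behaviour described above, the sketch below (Python 3) attaches a weakref.finalize callback to a generator and then rebinds the only name that refers to it. The events list is just a hypothetical log for this example. On CPython the callback fires at the moment of rebinding, without any call to gc.collect(); other implementations may run it later.

```python
import weakref

def countdown(n):
    while n:
        yield n
        n -= 1

events = []  # hypothetical log, only for this example

gen = countdown(5)
# Run events.append(...) when the first generator is destroyed.
weakref.finalize(gen, events.append, "first generator freed")
next(gen)  # the generator is suspended, not exhausted

events.append("before rebinding")
gen = countdown(2)  # rebinding drops the only reference to the first generator
events.append("after rebinding")

print(events)
# On CPython:
# ['before rebinding', 'first generator freed', 'after rebinding']
```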

The important thing is that once there are no more references to an object, it is free to be cleaned up by the garbage collector. The details of how it determines that there are no more references are left to the implementers of the particular Python implementation you're using.



Answer 2:

In your example, the generator won't get garbage collected until the end of the script. Python doesn't know if you're going to be using cd again, so it can't throw it away. To put it precisely, there's still a reference to your generator in the global namespace.

A generator will get GCed when its reference count drops to zero, just like any other object, even if the generator is not exhausted.

This can happen under lots of normal circumstances: if it's bound to a local name that falls out of scope, if it's deleted with del, or if its owner gets GCed. But if any live objects (including namespaces) hold strong references to it, it won't get GCed.
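The circumstances listed above can be checked with weak references. This is a minimal Python 3 sketch; the names local_scope, Owner and the ref_* variables are made up for the example. Each weak reference is called only to test whether its generator is still alive.

```python
import gc
import weakref

def countdown(n):
    while n:
        yield n
        n -= 1

# 1. A local name that falls out of scope.
def local_scope():
    gen = countdown(3)
    next(gen)                  # partly consumed, not exhausted
    return weakref.ref(gen)    # only a weak reference escapes

ref_local = local_scope()

# 2. A name removed with del.
gen = countdown(3)
ref_del = weakref.ref(gen)
del gen

# 3. An owner object that goes away.
class Owner:
    def __init__(self):
        self.gen = countdown(3)

owner = Owner()
ref_owned = weakref.ref(owner.gen)
del owner

# 4. Counter-example: a live name still holds a strong reference.
held = countdown(3)
ref_held = weakref.ref(held)

gc.collect()  # not needed on CPython, harmless elsewhere

print(ref_local(), ref_del(), ref_owned())  # None None None
print(ref_held() is held)                   # True
```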



Answer 3:

The Python garbage collector isn't quite that smart. Even though you don't refer to cd any more after that line, the name cd is still bound in the current namespace, so the object can't be collected. (In fact, it's possible that some code you're using might dig around in your variables and resurrect it. Unlikely, but possible. So Python can't make any assumptions.)

If you want to make the garbage collector actually do something here, try adding:

del cd

This removes the name cd. Once no other strong reference to the generator remains, the object can be collected.



Answer 4:

The other answers have explained that gc.collect() won't garbage collect anything that still has references to it. There is still a live reference cd to the generator, so it will not be gc'ed until cd is deleted.

In addition, however, the OP is creating a SECOND strong reference to the object with this line, which calls the weak reference object:

cdw = weakref.ref(cd)()

So if one were to do del cd and call gc.collect(), the generator would still not be gc'ed because cdw is also a reference.

To obtain an actual weak reference, do not call the weakref.ref object. Simply do this:

cdw = weakref.ref(cd)

Now when cd is deleted, the reference count drops to zero, the generator is freed, and calling the weak reference returns None, as expected.
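The difference between the two forms can be seen in a short Python 3 sketch. The names strong and probe are made up for the example: strong reproduces the called weak reference from the question, while probe is an uncalled weak reference used only to observe whether the generator is still alive.

```python
import gc
import weakref

def countdown(n):
    while n:
        yield n
        n -= 1

cd = countdown(10)
strong = weakref.ref(cd)()   # calling the weakref returns the generator itself
probe = weakref.ref(cd)      # a real weak reference, left uncalled

same_object = strong is cd   # True: `strong` is just another name for cd

del cd
gc.collect()
alive_after_del_cd = probe() is not None       # True: `strong` keeps it alive

del strong
gc.collect()
alive_after_del_strong = probe() is not None   # False: no strong references left

print(same_object, alive_after_del_cd, alive_after_del_strong)
# True True False
```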