Suppose you have references A -> B -> C -> D
. When you delete the reference to B
from A
, you're left with an orphaned chain of Objects B -> C -> D
.
Will C
and D
be garbage collected even though there's no way to get to them (since there's no reference to B
)?
I imagine the GC is smart about this and will resolve any such dependencies.
However, I took a look into the source code for the LinkedList
class and found something contrary to this belief. I noticed that when a list is clear()
ed, all of the references to each link are explicitly set to null
, thus making it an O(n)
operation. Is there any reason/benefit for doing so?
That does look a bit peculiar. Maybe the reason that it is explicitly dismantling the list is so that the list is cleared for existing iterators and sublists as well as the parent list.
It is certainly NOT done to make the garbage collection faster. A garbage collector doesn't traverse the references in an unreachable object, so nulling them won't make any difference.
UPDATE
A more recent version of the method has these comments:
// Clearing all of the links between nodes is "unnecessary", but:
// - helps a generational GC if the discarded nodes inhabit
// more than one generation
// - is sure to free memory even if there is a reachable Iterator
So, it appears that there is an benefit for the GC, at least in some cases.
Suppose that a Node
in an older generation contains a reference to an object (e.g. a Node
or an element) in a younger generation. That reference becomes a "root" when collecting of the younger generation, causing the young generation object to be retained, even if the old generation Node
is unreachable. This state continues until the older generation is collected. Old generations are collected infrequently.
If you traverse the list and dismantle it, the variable containing the old -> new reference is assigned a null
. The write-barrier for that assignment causes (immediately or at GC time) the original reference to no longer be a "root". Thus, the object in the younger generation can now be collected, and it doesn't end up "tenured" to an older generation (which brings forward the time when that generation needs to be collected).
Presumably, the GC benefits outweigh the cost of unpicking the list ... either on average, or in cases where the costs are disastrous.
For more information, refer to "Garbage Collection algorithms for dynamic memory management" by Jones and Lins. It is in chapter 7.5 in my (first edition) copy.
Generally speaking, it is better to throw a Collection
object away and start again than it is to clear it for reuse.
Yes, C and D will be garbage collected, assuming that B is the only thing that referenced them. This is because they are not reachable from the graph to the root objects of the application object graph.
I imagine the reason for marking each link to null
in the LinkedList
implementation is to prevent a memory leak. It is possible for something outside of the LinkedList
to grab a hold of the head node. If this were to happen, it would keep all the other nodes alive even after the LinkedList
has been cleared.