What are the roots in garbage collection?
I have read the definition of a root as "any reference that your program can access" and the definition of live as "an object that is being used", which can be a local variable or a static variable.
I'm a little confused about the difference between root and live objects.
What is a path to root? How do roots and live objects work?
Can someone elaborate?
The GC (Garbage Collector) roots are objects that are special to the garbage collector: they are the starting points from which it determines reachability. The garbage collector collects those objects that are not GC roots and are not reachable through references from GC roots.
There are several kinds of GC roots. One object can belong to more than one kind of root. The root kinds are:
(credit to YourKit's website)
Not mentioned by YourKit is the fact that objects awaiting finalization will be retained as roots until the GC runs the finalize() method. That can cause transient retention of large graphs somewhat unexpectedly. The general rule of thumb is not to use finalizers (but that's a different question).
In Java I would say threads are the root objects. Every live object can be traced back to a live thread. For example, a static object is referenced by a class, which is referenced by a class loader, which is referenced by another class, which is referenced by an instance of that class, ... which is referenced by a Runnable, which is referenced by a live thread. (Note: classes can be GC'ed, so they can't be roots.)
We can also consider a "real" root for all threads; however, that is outside the realm of standard Java. We can't say what it is, or how it references all the threads.
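To make the "referenced by a Runnable, which is referenced by a live thread" chain concrete, here is a minimal sketch. The class and method names (ThreadRootDemo, Task, payloadAliveWhileThreadRuns) are made up for illustration, and System.gc() is only a hint to the JVM. The code drops its own reference to the task, so every remaining path to the payload runs through the worker thread and its Runnable; a WeakReference is used as a probe to see whether the payload has been cleared.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.CountDownLatch;

class ThreadRootDemo {
    /** Hypothetical task that owns the only strong reference to its payload. */
    static final class Task implements Runnable {
        final Object payload = new byte[1024];
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        @Override
        public void run() {
            started.countDown();
            try {
                release.await(); // keep the thread alive until the demo releases it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Returns whether the payload was still reachable while the worker thread was alive. */
    static boolean payloadAliveWhileThreadRuns() {
        Task task = new Task();
        WeakReference<Object> probe = new WeakReference<>(task.payload);
        CountDownLatch started = task.started;
        CountDownLatch release = task.release;
        Thread worker = new Thread(task);
        task = null; // from here on, the payload is reached only via the thread's Runnable
        try {
            worker.start();
            started.await();
            System.gc(); // a request, not a guarantee
            boolean alive = probe.get() != null;
            release.countDown();
            worker.join();
            return alive;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(payloadAliveWhileThreadRuns());
    }
}
```

While the worker is alive the probe should never be cleared, because a weak reference is only cleared once its referent stops being strongly reachable. The sketch deliberately does not check what happens after the thread exits, since when (or whether) the payload is then collected is up to the JVM.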
The IBM web site lists the following as GC roots.
Note that some of these are artificial constructs done by a memory analyzer, but still important to be aware of if you are looking at a heap dump.
System class
A class that was loaded by the bootstrap loader, or the system class loader. For example, this category includes all classes in the rt.jar file (part of the Java runtime environment), such as those in the java.util.* package.
JNI local
A local variable in native code, for example user-defined JNI code or JVM internal code.
JNI global
A global variable in native code, for example user-defined JNI code or JVM internal code.
Thread block
An object that was referenced from an active thread block.
Thread
A running thread.
Busy monitor
Everything that called the wait() or notify() methods, or that is synchronized, for example by calling the synchronized(Object) method or by entering a synchronized method. If the method was static, the root is a class, otherwise it is an object.
Java local
A local variable. For example, input parameters, or locally created objects of methods that are still in the stack of a thread.
Native stack
Input or output parameters in native code, for example user-defined JNI code or JVM internal code. Many methods have native parts, and the objects that are handled as method parameters become garbage collection roots. For example, parameters used for file, network, I/O, or reflection operations.
Finalizer
An object that is in a queue, waiting for a finalizer to run.
Unfinalized
An object that has a finalize method, but was not finalized, and is not yet on the finalizer queue.
Unreachable
An object that is unreachable from any other root, but was marked as a root by Memory Analyzer so that the object can be included in an analysis.
Unreachable objects are often the result of optimizations in the garbage collection algorithm. For example, an object might be a candidate for garbage collection, but be so small that the garbage collection process would be too expensive. In this case, the object might not be garbage collected, and might remain as an unreachable object.
By default, unreachable objects are excluded when Memory Analyzer parses the heap dump. These objects are therefore not shown in the histogram, dominator tree, or query results. You can change this behavior by clicking File > Preferences... > IBM Diagnostic Tools for Java - Memory Analyzer, then selecting the Keep unreachable objects check box.
Java stack frame
A Java stack frame, which holds local variables. This type of garbage collection root is only generated if you set the Preferences to treat Java stack frames as objects. For more information, see Java Basics: Threads and thread stack queries.
Unknown
An object of unknown root type. Some dumps, such as IBM Portable Heap Dump (.phd) files, do not have root information. In this case, the Memory Analyzer parser marks objects that have no inbound references, or are unreachable from any other root, as unknown. This action ensures that Memory Analyzer retains all the objects in the dump.
Roots, or garbage collection roots, are the objects that are always reachable. If an object is always reachable, then it is not eligible for garbage collection; roots are therefore always ineligible for collection. They form the initial set of objects from which the reachability of all other objects on the heap is determined.
Other objects on the heap reachable from the garbage collection roots are considered to be live objects, and ineligible for collection; the objects that are unreachable can be marked for reclamation.
I know Java better than the .NET platform, so I'll speak only for Java. On the Java platform, the GC roots are actually implementation dependent. In most runtimes, however, the GC roots tend to be the operands on the stack (for they are currently in use by threads) and the class (static) members of classes. Reachability is calculated from these objects in most JVMs. There are other cases where local parameters and operands used by JNI calls are considered part of the root set and are also used to calculate reachability.
I hope this clears any lingering doubts over what is a root (set) and what is a live object.
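As a rough illustration of the two most common kinds of root mentioned in this answer, static members and values in use on a thread's stack, here is a small sketch. The names (RootKindsDemo, heldByStatic and the two methods) are invented for the example, System.gc() is only a hint, and Reference.reachabilityFence assumes Java 9 or later.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

class RootKindsDemo {
    // Referenced from a static field: reachable for as long as this class stays loaded.
    static Object heldByStatic = new Object();

    /** Probes an object whose strong reference lives in a static field. */
    static boolean staticFieldKeepsObjectAlive() {
        WeakReference<Object> probe = new WeakReference<>(heldByStatic);
        System.gc(); // a request, not a guarantee
        return probe.get() != null;
    }

    /** Probes an object whose only strong reference is a local variable on this thread's stack. */
    static boolean localVariableKeepsObjectAlive() {
        Object local = new Object();
        WeakReference<Object> probe = new WeakReference<>(local);
        System.gc();
        boolean alive = probe.get() != null;
        // Keeps `local` strongly reachable up to this point; without it, a JIT
        // compiler is allowed to treat a local that is never read again as dead.
        Reference.reachabilityFence(local);
        return alive;
    }

    public static void main(String[] args) {
        System.out.println(staticFieldKeepsObjectAlive());
        System.out.println(localVariableKeepsObjectAlive());
    }
}
```

Both methods should report true: in each case the object is still strongly reachable from a root when the probe is read, so the collector is not allowed to clear the weak reference.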
If you think of the objects in memory as a tree, the "roots" would be the root nodes - every object immediately accessible by your program.
There are four objects: a person, a red car, its engine and horn. Draw the reference graph:
And you'll end up with Person at the "root" of the tree. It's live because it's referenced by a local variable, p, which the program might use at any time to refer to the Person object. This also goes for the other objects, through p.car, p.car.engine, etc.
Since Person and all other objects recursively connected to it are live, there would be trouble if the GC collected them.
Consider, however, if the following is run after a while:
And redraw the graph:
Now the Person is accessible through p and the blue car through p.car, but there is no way the red car or its parts can ever be accessed again - they are not connected to a live root. They can be safely collected.
So it's really a matter of taking every starting point (every local variable, globals, statics, everything in other threads and stack frames), that is, every root, and recursively following all the references to make up a list of all the "live" objects: objects which are in use and unsuitable for deletion. Everything else is garbage, waiting to be collected.
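The "follow all the references from every root" step can be sketched as a small simulation. This is not how a real JVM represents its heap: Node, liveLabels and buildPerson are hypothetical names, each Node stands in for a heap object, and the single root stands in for the local variable p. Swapping the "car" field models the step where p.car is pointed at a blue car.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MarkSketch {
    /** Hypothetical stand-in for a heap object: a label plus named reference fields. */
    static final class Node {
        final String label;
        final Map<String, Node> fields = new LinkedHashMap<>();
        Node(String label) { this.label = label; }
    }

    /** Follows references from the roots and returns the labels of every object reached. */
    static Set<String> liveLabels(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (marked.add(node)) { // first visit: mark it, then queue what it references
                pending.addAll(node.fields.values());
            }
        }
        Set<String> labels = new TreeSet<>(); // sorted only to make the output stable
        for (Node node : marked) {
            labels.add(node.label);
        }
        return labels;
    }

    /** Builds the person / red car graph; the returned node is what the local variable p refers to. */
    static Node buildPerson() {
        Node person = new Node("person");
        Node redCar = new Node("red car");
        redCar.fields.put("engine", new Node("engine"));
        redCar.fields.put("horn", new Node("horn"));
        person.fields.put("car", redCar);
        return person;
    }

    public static void main(String[] args) {
        Node p = buildPerson();
        System.out.println(liveLabels(List.of(p))); // [engine, horn, person, red car]
        p.fields.put("car", new Node("blue car"));  // models reassigning p.car
        System.out.println(liveLabels(List.of(p))); // [blue car, person]
    }
}
```

After the reassignment the red car, its engine and its horn no longer appear in the live set, which is exactly what makes them eligible for collection. The visited set is identity-based, so cycles in the graph would not cause the traversal to loop.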