How do garbage collectors know about references on

Posted 2019-03-13 05:37

What techniques do modern garbage collectors (as in CLR, JVM) use to tell which heap objects are referenced from the stack?

Specifically how can a VM work back from knowing where the stack starts to interpreting all local references to heap objects?

3 answers
Emotional °昔
#2 · 2019-03-13 06:13

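In Java (and likely in the CLR, although I know its internals less well), the bytecode is typed: it distinguishes object references from primitives. As a result, the runtime has data structures that describe which slots in each stack frame hold references and which hold primitives. When the GC needs to scan the root set, it uses these stack maps to tell references from non-references. (The class-file StackMapTable attribute records this kind of per-frame type information for the bytecode verifier; as far as I know, HotSpot's GC relies on analogous maps, called OopMaps, that the interpreter and JIT compilers produce.)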
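To make the idea concrete, here is a minimal sketch of exact root scanning driven by a per-frame map. It is a toy model, not how any real VM lays out its frames: `ExactStackScan`, `Frame`, `referenceSlots`, and `findRoots` are hypothetical names, each slot is modeled as a `long`, and `0` stands in for null.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy model of exact root scanning; every name here is hypothetical. */
final class ExactStackScan {

    /** One stack frame: raw slots plus a map saying which slots hold references. */
    static final class Frame {
        final long[] slots;
        final BitSet referenceSlots;

        Frame(long[] slots, BitSet referenceSlots) {
            this.slots = slots;
            this.referenceSlots = referenceSlots;
        }
    }

    /** Collects the value of every non-null slot that the frame's map marks as a reference. */
    static List<Long> findRoots(List<Frame> stack) {
        List<Long> roots = new ArrayList<>();
        for (Frame frame : stack) {
            BitSet map = frame.referenceSlots;
            // Visit only the slots the map flags; primitives are never inspected.
            for (int i = map.nextSetBit(0); i >= 0; i = map.nextSetBit(i + 1)) {
                long ref = frame.slots[i];
                if (ref != 0) {
                    roots.add(ref);
                }
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        BitSet map = new BitSet();
        map.set(1); // only slot 1 holds a reference
        Frame frame = new Frame(new long[] {0x1000L, 0x2000L, 42L}, map);
        System.out.println(findRoots(List.of(frame)));
    }
}
```

Slot 0 holds a value that looks like an address, but the map says it is a primitive, so an exact scan ignores it and reports only the value in slot 1.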

CLR and Java have to have some mechanism like this because they are exact collectors. There are also conservative collectors, such as the Boehm collector, that treat every word on the stack as a possible pointer. They check whether the value, interpreted as a pointer, falls inside the heap, and if so, they treat the object it points to as live.
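A conservative scan can be sketched in a few lines. This is a simplified illustration rather than the Boehm collector's actual code: `ConservativeScan`, `findPossibleRoots`, and the single contiguous heap range `[heapStart, heapEnd)` are assumptions made for the example, and a real collector would typically also check that the address corresponds to an allocated object.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of conservative stack scanning; names and heap layout are hypothetical. */
final class ConservativeScan {

    /**
     * Returns every stack word whose value falls inside [heapStart, heapEnd).
     * Such a word may be a real pointer or an integer that merely looks like one.
     */
    static List<Long> findPossibleRoots(long[] stackWords, long heapStart, long heapEnd) {
        List<Long> candidates = new ArrayList<>();
        for (long word : stackWords) {
            if (word >= heapStart && word < heapEnd) {
                candidates.add(word);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        long[] stack = {0x1000L, 0x2000L, 42L};
        System.out.println(findPossibleRoots(stack, 0x1000L, 0x3000L));
    }
}
```

Without type information, both in-range words are kept as candidates even if one of them is really an integer. That is why a conservative collector may retain garbage and cannot safely move objects that are referenced only from such ambiguous words.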

兄弟一词,经得起流年.
#3 · 2019-03-13 06:13

Interesting documentation on this topic, posted by the .NET team shortly after they made CoreCLR open source: Stack Walking

劫难
#4 · 2019-03-13 06:14

Take a look at this Artima article from August 1996, Java's Garbage-Collected Heap; especially page 2.

Any garbage collection algorithm must do two basic things. First, it must detect garbage objects. Second, it must reclaim the heap space used by the garbage objects and make it available to the program. Garbage detection is ordinarily accomplished by defining a set of roots and determining reachability from the roots. An object is reachable if there is some path of references from the roots by which the executing program can access the object. The roots are always accessible to the program. Any objects that are reachable from the roots are considered live. Objects that are not reachable are considered garbage, because they can no longer affect the future course of program execution.

In a JVM the root set is implementation dependent but would always include any object references in the local variables. In the JVM, all objects reside on the heap. The local variables reside on the Java stack, and each thread of execution has its own stack. Each local variable is either an object reference or a primitive type, such as int, char, or float. Therefore the roots of any JVM garbage-collected heap will include every object reference on every thread's stack. Another source of roots are any object references, such as strings, in the constant pool of loaded classes. The constant pool of a loaded class may refer to strings stored on the heap, such as the class name, superclass name, superinterface names, field names, field signatures, method names, and method signatures.

Any object referred to by a root is reachable and is therefore a live object. Additionally, any objects referred to by a live object are also reachable. The program is able to access any reachable objects, so these objects must remain on the heap. Any objects that are not reachable can be garbage collected because there is no way for the program to access them.
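The reachability rule quoted above amounts to a graph traversal from the roots. The following is a minimal sketch of that marking step, not code from the article or from any JVM: `MarkFromRoots`, `Node`, and `reachable` are hypothetical names, and a heap object is reduced to a list of outgoing references.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy model of the mark phase of a tracing collector; every name is hypothetical. */
final class MarkFromRoots {

    /** A heap object reduced to its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns every object reachable from the roots; anything else is garbage. */
    static Set<Node> reachable(List<Node> roots) {
        // Identity-based set: objects are tracked by identity, not by equals().
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> worklist = new ArrayDeque<>(roots);
        while (!worklist.isEmpty()) {
            Node node = worklist.pop();
            if (marked.add(node)) {          // first visit: follow its references
                worklist.addAll(node.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        a.refs.add(b);   // a -> b
        b.refs.add(a);   // b -> a (a cycle is handled by the marked set)
        c.refs.add(a);   // c -> a, but nothing refers to c

        Set<Node> live = reachable(List.of(a));
        System.out.println(live.contains(b) + " " + live.contains(c));
    }
}
```

Here `c` holds a reference to a live object but is itself unreachable, so it is not marked. Only references leading from the roots count, not references leading toward them.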

The article continues to explore different garbage collection strategies, including reference counting collectors, tracing collectors, compacting collectors and copying collectors.


Though this article is old, it still applies today; not much has really changed. There have been performance improvements to the different collection strategies, but no new major advancements.

The Oracle HotSpot JVM, for example, has a new Garbage-First Garbage Collector which is a copying collector with performance tweaks for multi-core processors and large heap sizes (see this answer for more on the G1 Garbage Collector).
