Modern systems supporting object-oriented programming typically allocate objects from a region of memory called the memory heap, which may also be referred to as the heap. Heap-allocated objects may comprise references to other heap-allocated objects. The resulting interconnected set of objects forms an object graph. A heap-allocated object is said to be alive if it is reachable from a root set of references, which typically comprises references stored in global variables, registers, or the call stacks of the threads of the program. When an object of the heap becomes unreachable because all paths from the root set to the object have been removed, the object can no longer be used. However, the object continues to occupy memory space in the memory heap.
Garbage collection is a process that reclaims this unused memory space and makes it available to accommodate new objects. Generally, garbage collectors trace references of the object graph, starting from the root set, to automatically obtain global knowledge of unused memory space in a methodical way. The part of the program that does useful work, as distinguished from the part that performs garbage collection, is referred to as the “mutator.” From the garbage collector's point of view, the mutator mutates the live part of the object graph, i.e., the part of the object graph that is reachable from the root set.
Most modern trace-based garbage collectors implement some variant of the tri-color marking abstraction, which works as follows. Every heap object is colored black, grey, or white, thus dividing the heap into three sets. The white set comprises all the objects that have not been visited and are candidates for having their memory recycled. The black set comprises all the objects that have been traced and that have no references to objects in the white set; in many implementations the black set starts off empty. The grey set comprises all the objects that are immediately reachable from live references (i.e., references from the root set or from black objects), but whose references to other objects have not yet been inspected by the tracing algorithm. In particular, grey objects may reference white objects. The tracing algorithm works by moving objects from the white set to the grey set to the black set, but never in the other direction, as follows.
Tracing the object graph begins with initializing the grey set with all objects referenced directly from references in the root set; all other objects are initially placed in the white set, and the black set is empty. Following this initial step, the second step is to pick an object from the grey set and blacken this object (i.e., move it to the black set) by greying all the white objects it references directly (i.e., moving them to the grey set). This step confirms that this object cannot be garbage collected, and also that any objects it references cannot be garbage collected. The second step is repeated until the grey set is empty. When the grey set is empty, all live objects have been identified and are colored black, all the objects remaining in the white set have been demonstrated not to be reachable, and the storage occupied by them can be reclaimed.
Using the tri-color marking abstraction, every object of the object graph belongs to precisely one set. The tri-color marking algorithm preserves an important invariant, which states that no black object points directly to a white object. This invariant ensures that the white objects can be safely destroyed once the grey set is empty.
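The two marking steps described above can be illustrated with a minimal, non-authoritative Java sketch. The names TriColorSketch, Node, Color, mark, and shade are hypothetical and chosen only for illustration; the sketch assumes a stop-the-world setting in which the mutator does not run during marking, and it models the root set as a collection of directly referenced objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TriColorSketch {
    enum Color { WHITE, GREY, BLACK }

    /** Hypothetical heap object: a name plus its outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Color color = Color.WHITE;
        Node(String name) { this.name = name; }
    }

    /** Marks every object reachable from the roots and returns the objects left white. */
    static Set<Node> mark(Collection<Node> heap, Collection<Node> roots) {
        for (Node n : heap) {
            n.color = Color.WHITE;               // all objects start in the white set
        }
        Deque<Node> grey = new ArrayDeque<>();
        for (Node r : roots) {
            shade(r, grey);                      // first step: grey the objects the roots reference
        }
        while (!grey.isEmpty()) {                // second step, repeated until the grey set is empty
            Node n = grey.pop();
            for (Node child : n.refs) {
                shade(child, grey);              // grey every white object n references directly
            }
            n.color = Color.BLACK;               // n no longer references any white object
        }
        Set<Node> unreachable = new HashSet<>();
        for (Node n : heap) {
            if (n.color == Color.WHITE) {
                unreachable.add(n);              // still white: its storage can be reclaimed
            }
        }
        return unreachable;
    }

    /** Moves a white object to the grey set; grey and black objects are left alone. */
    private static void shade(Node n, Deque<Node> grey) {
        if (n.color == Color.WHITE) {
            n.color = Color.GREY;
            grey.push(n);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b); b.refs.add(a);            // a cycle reachable from the root set
        c.refs.add(d); d.refs.add(c);            // a cycle with no path from the root set
        c.refs.add(a);                           // an unreachable object may reference a live one
        for (Node n : mark(List.of(a, b, c, d), List.of(a))) {
            System.out.println("reclaimable: " + n.name);
        }
    }
}
```

Because an object is blackened only after all of its direct references have been greyed, no black object points to a white object when the loop ends, which is why the objects left white can be reclaimed.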
The Java™ platform offers a rich set of features, such as programmable class loading, dynamic linking, reflection, and execution from an architecture-neutral binary form. These features require Java™ Virtual Machine (JVM) implementations to maintain sophisticated data structures describing classes in memory during program execution. These data structures, called class metadata, mirror information encoded in class files as well as additional runtime information needed by various components of a JVM, and in particular, the garbage collector. Class metadata for a single class type comprises several objects that may reference class metadata describing other class types defined by the same or by different class loaders. For example, the virtual table embedded in the class descriptor of one class type may comprise references to metadata describing methods of other class types. Similarly, the constant pool of a class type may include references to metadata describing fields and methods of other classes.
Garbage collectors require intimate knowledge of class metadata, both for collecting the memory heap and for class unloading. Unloading a class comprises freeing memory resources allocated to class metadata that describes the class being unloaded. Class metadata provides the garbage collector with precise locations of references in class instances. Class metadata may also hold references to memory heap objects via static variables, or via direct references to the reflection objects that represent the corresponding classes (e.g., instances of java.lang.Class). In some cases, class metadata may hold the only path to certain memory heap objects. Class metadata may itself be reachable only from other class metadata, or only from heap-allocated objects. Hence, the garbage collector needs to trace class metadata.
For these reasons, it is common for JVM implementations to lay out class metadata in the same way as Java memory heap objects in order to unify their processing during garbage collection. In some cases, as in JVMs implemented in Java, class metadata consists of Java objects allocated directly in the memory heap. Since class metadata is typically long-lived, JVMs equipped with generational garbage collection often pre-tenure class metadata or store it in a special memory heap area such as the permanent generation of the Java HotSpot™ Virtual Machine (HotSpot VM).
Class metadata consumes a large amount of memory. This consumption issue has prompted several efforts to share the class metadata across applications in multi-tasking environments. A multi-tasking virtual machine (MVM) is an implementation of the JVM capable of running multiple application programs in isolation in a single address space. An MVM may share class metadata across specific class loaders of different tasks while providing tasks with isolated heaps. Specifically, tasks allocate objects in a task-private heap that may be garbage collected independently of other tasks' private heaps.
Transparent sharing of class metadata across tasks in an MVM exploits knowledge of the linkset that class loaders can produce. A class loader resolves a symbolic reference (i.e., a class name) to a class type, either by defining this class type itself, or by delegating the resolution of the symbolic reference to another class loader.
Class types are defined from class files, which are binary representations of classes. Each class type is described using class metadata constructed by the JVM from a class file provided by the defining class loader.
The set of all the class files used to define all the possible class types that a class loader can resolve from symbolic references is called the linkset of the class loader. Thus, the linkset produced by a class loader includes both the class files for the class types that the class loader defines and the class files for the class types defined by other class loaders to which the class loader delegates link resolution. It is possible for two class loaders to produce the same linkset. For example, two class loaders that do not delegate to any other class loaders and that define class types from the same jar file produce the same linkset.
Another example is when two class loaders that define class types from the same jar file also delegate to the same third class loader. An MVM may transparently and automatically share the class metadata for class types defined by two class loaders if these produce the same linkset.
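The linkset comparison described above can be sketched in Java as follows. This is only an illustrative model under stated assumptions, not how any particular MVM implements sharing: the names LinksetSketch, Loader, linkset, and canShareMetadata are hypothetical, and each class file is assumed to be identified by an opaque string (for example, a content digest).

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LinksetSketch {
    /** Hypothetical loader model: the class files it defines from, plus the loaders it delegates to. */
    static final class Loader {
        final Set<String> definedClassFiles;     // assumed: one opaque identifier per class file
        final List<Loader> delegates;
        Loader(Set<String> definedClassFiles, List<Loader> delegates) {
            this.definedClassFiles = definedClassFiles;
            this.delegates = delegates;
        }
    }

    /** Linkset: class files the loader defines plus those of every loader it delegates to. */
    static Set<String> linkset(Loader loader) {
        Set<String> result = new HashSet<>();
        collect(loader, result, new HashSet<>());
        return result;
    }

    private static void collect(Loader loader, Set<String> out, Set<Loader> visited) {
        if (!visited.add(loader)) {
            return;                              // loader already visited via another delegation path
        }
        out.addAll(loader.definedClassFiles);
        for (Loader delegate : loader.delegates) {
            collect(delegate, out, visited);
        }
    }

    /** Two loaders are candidates for sharing class metadata when their linksets are equal. */
    static boolean canShareMetadata(Loader x, Loader y) {
        return linkset(x).equals(linkset(y));
    }

    public static void main(String[] args) {
        Loader shared = new Loader(Set.of("lib.jar/Base.class"), List.of());
        Loader task1 = new Loader(Set.of("app.jar/Main.class"), List.of(shared));
        Loader task2 = new Loader(Set.of("app.jar/Main.class"), List.of(shared));
        Loader task3 = new Loader(Set.of("other.jar/Main.class"), List.of(shared));
        System.out.println(canShareMetadata(task1, task2));
        System.out.println(canShareMetadata(task1, task3));
    }
}
```

In this model, task1 and task2 define class types from the same class files and delegate to the same third loader, so their linksets are equal; task3 defines from a different class file, so its linkset differs.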
Memory management for class metadata contributes substantially to garbage collection costs. In particular, full collections must determine which classes are alive in order to determine the liveness of objects referenced only from class metadata and to decide which classes may be unloaded. This determination requires garbage collectors to trace all cross-metadata references, which is often more expensive than tracing the application heap. Running multiple applications in an MVM capable of transparently sharing class metadata across tasks aggravates these costs and makes it difficult to reclaim the memory of classes unloaded by one task independently of others.