The present invention relates to computer programming and execution of computer programs, and, more specifically, to a concurrent mark-sweep garbage collector for use with programming languages such as Java.
Interpreted programming languages such as Java allow software developers to write application code in a platform-independent manner. This is typically achieved by having the application program run on a virtual machine (VM), which hides platform differences and provides a set of common application program interfaces (APIs) for interacting with the native layer. The application program itself compiles down to a series of “bytecodes”, which are platform independent and can be translated by the host VM. The VM typically also contains a just-in-time (JIT) compiler, which converts the bytecodes to a dynamically compiled native representation, removing much of the interpretation overhead from the VM.
The VM also typically contains a garbage collector. Garbage collection (GC) is a well-known technique for automated memory management, and is found in interpreted or scripted programming languages such as Java, Ruby and Lisp. Some languages, such as Java, require a garbage collector, as there is no explicit language syntax in Java for managing memory in the object lifecycle. The GC automatically reclaims garbage, that is, memory used by objects that will never be accessed or mutated again by the application program. This automatic freeing of memory spares the programmer the burden of having to manage memory manually. However, a drawback of many GC implementations is the unpredictable occurrence and length of the pauses in the application that are needed to complete the GC process.
Real-time garbage collection (RTGC) focuses on providing short, deterministic pause times with certain levels of guarantees on the rate of interruption and interference of the GC on a running application program. A traditional approach is to break the garbage collection process up into a number of individual pauses in the application program, during which the work to complete a GC cycle is performed. These incremental pauses are metered out according to quality of service guarantees between the collector and the application. In an even more advanced system, not all application threads have to be stopped at once; instead, each thread individually decides when to pause and does the metering itself. There may also be additional (background) GC threads that run fully concurrently with the application threads, so that interleaving with the application threads is done at the operating system level. This concurrent and/or incremental approach is in contrast to a more standard GC approach, which interrupts the application once for the entire GC cycle, within which GC memory reclamation work is started and completed. The technique of stopping the application for the entire GC cycle is typically known as a stop-the-world (STW) approach, which halts the entire Java application while GC work continues and is completed.
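By way of illustration only, the difference between a single stop-the-world pause and a series of metered incremental pauses can be sketched as follows. The class name `MeteredCollector`, the abstract "work units", and the fixed per-pause budget are assumptions made for this sketch; an actual collector may meter by time or by allocation rather than by a fixed amount of work.

```java
/** Illustrative sketch only: GC work measured in abstract units (an assumption). */
final class MeteredCollector {
    private final int workPerPause;   // hypothetical quality-of-service budget per pause
    private int remainingWork;        // work left in the current GC cycle
    private int pauseCount;           // number of application pauses taken so far
    private int longestPause;         // largest amount of work done in any one pause

    MeteredCollector(int totalWork, int workPerPause) {
        if (totalWork < 0 || workPerPause <= 0) {
            throw new IllegalArgumentException("work must be >= 0 and budget must be > 0");
        }
        this.remainingWork = totalWork;
        this.workPerPause = workPerPause;
    }

    /** One incremental pause: do at most the budgeted work; true when the cycle is complete. */
    boolean runIncrement() {
        recordPause(Math.min(workPerPause, remainingWork));
        return remainingWork == 0;
    }

    /** Stop-the-world: all remaining work is done in one pause. */
    void runStopTheWorld() {
        recordPause(remainingWork);
    }

    private void recordPause(int work) {
        if (work == 0) return;        // nothing to do, so the application is not paused
        remainingWork -= work;
        pauseCount++;
        longestPause = Math.max(longestPause, work);
    }

    int pauseCount() { return pauseCount; }
    int longestPause() { return longestPause; }
}
```

In this sketch, the same total amount of work yields either one long pause or several short pauses whose length is bounded by the budget; the application would run between calls to `runIncrement()`.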
Real-time garbage collection typically occurs in small increments, and allows the Java application to run between the increments or concurrently with the increments. Since the Java application is allowed to run, the shape of the heap (i.e., the graph of object references to other objects) will change as the application program runs. This requires a tracking/communication mechanism between the Java VM and the GC so that the collector does not miss any objects during its tracing and collecting operations, for example, by treating as dead an object that is in fact still live, which would leave dangling pointers. For performance reasons and for determinism in the amount of tracing work, RTGC often uses a Yuasa-style snapshot-at-the-beginning (SATB) barrier on stores and loads of object references in order to track these changes. The SATB approach treats the beginning of the collection cycle as a “picture” or “snapshot” of the heap; that is, everything that is live at that precise moment will survive the collection, and everything that is dead will die. This is similar to a standard STW collector. The SATB approach uses a write barrier, which tracks all references written into objects; specifically, it records the referenced objects that are overwritten in order to preserve their “live” status as of the start of the GC cycle, and has the GC process them at some point during the cycle.
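By way of illustration only, a snapshot-at-the-beginning write barrier of the kind described above might be sketched as follows. The names `Node`, `SatbCollector`, `writeRef` and `markStep` are hypothetical, each object is assumed to hold a single reference field for brevity, and the sketch is single-threaded; it is not intended to represent any particular collector.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical heap object with one reference field and a mark bit. */
final class Node {
    final String name;
    Node ref;          // the only outgoing reference, for brevity
    boolean marked;    // set once the collector has seen this object
    Node(String name) { this.name = name; }
}

/** Sketch of a Yuasa-style (deletion) write barrier; all names are illustrative. */
final class SatbCollector {
    private final Deque<Node> markQueue = new ArrayDeque<>();
    private boolean marking;

    /** Begin a cycle: the objects reachable from these roots form the snapshot. */
    void startCycle(Node... roots) {
        marking = true;
        for (Node r : roots) shade(r);
    }

    /** Write barrier: record the overwritten reference before performing the store. */
    void writeRef(Node holder, Node newValue) {
        if (marking) shade(holder.ref);   // old target may have been live in the snapshot
        holder.ref = newValue;
    }

    /** Trace at most `budget` objects; returns true once marking has finished. */
    boolean markStep(int budget) {
        while (budget-- > 0 && !markQueue.isEmpty()) {
            Node n = markQueue.poll();
            shade(n.ref);
        }
        if (markQueue.isEmpty()) marking = false;
        return !marking;
    }

    /** Mark an object and queue it for tracing if it has not been seen yet. */
    private void shade(Node n) {
        if (n != null && !n.marked) {
            n.marked = true;
            markQueue.add(n);
        }
    }
}
```

In this sketch, if the application overwrites the only reference to an object before the collector has traced it, the barrier has already queued the old target, so that object and everything reachable from it are still marked as live for the current cycle.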
Along with having the barrier to keep the SATB view in place, RTGC also keeps track of newly created objects. Since these are created after the snapshot, they are viewed as “live” for the duration of that GC cycle, regardless of when they actually die. This means the collector must also have a system for keeping these newly created objects alive that meshes with its standard tracing system. There is another concurrent style of collector that is not based on an SATB approach. Such a collector incrementally updates its view of the live object tree, again executing a write barrier, but of a different style. Such a collector need not keep track of newly allocated objects, since they are implicitly covered by the incremental update of the live set. This style of collector is, however, not suitable for an RTGC environment, since it is not as deterministic as the SATB style of collector.
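By way of illustration only, one simple way of keeping newly created objects alive during a cycle is to mark them at allocation time, as sketched below. The names `Cell` and `AllocateBlackHeap` are hypothetical, and tracing is omitted entirely, so objects allocated before the cycle starts are simply treated as unreachable in this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical heap object carrying only a mark bit. */
final class Cell {
    boolean marked;
}

/** Sketch: objects allocated while a cycle is active are born marked (assumed scheme). */
final class AllocateBlackHeap {
    private final List<Cell> objects = new ArrayList<>();
    private boolean cycleActive;

    void startCycle() { cycleActive = true; }

    Cell allocate() {
        Cell c = new Cell();
        c.marked = cycleActive;   // created after the snapshot: treated as live this cycle
        objects.add(c);
        return c;
    }

    /** Sweep unmarked objects and reset marks; returns the number of objects reclaimed. */
    int finishCycle() {
        int before = objects.size();
        objects.removeIf(c -> !c.marked);
        for (Cell c : objects) c.marked = false;
        cycleActive = false;
        return before - objects.size();
    }

    int liveCount() { return objects.size(); }
}
```

An object allocated during a cycle survives that cycle even if the application drops it immediately; in this sketch it can only be reclaimed by a later cycle.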
In order to keep pause times low in RTGC, certain concessions are made to the functionality. Compaction of the heap often has the biggest impact on the size and determinism of GC pause times. The purpose of compaction is to reduce or eliminate fragmentation within the object heap, allowing objects of varying sizes to be allocated and preventing out-of-memory conditions. Although some RTGC implementations are built to support compaction, there is a throughput cost associated with being able to perform compaction in an incremental manner. As well, it may not be possible to perform the compaction: native references to memory within the heap may prevent an object from being moved, or the increase in the length of a GC cycle due to the effort of compaction may result in the heap being completely consumed before the GC process completes. To avoid compaction while limiting the effects of fragmentation, an RTGC can make use of a segregated heap that partitions the heap into size classes for various types of objects. In this case, small holes due to fragmentation are always of a known size (depending on the area of the heap) and so are relatively easily reused for allocating objects that fit.
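By way of illustration only, a segregated heap with per-size-class free lists might be sketched as follows. The class name `SegregatedHeap`, the particular size classes (16, 32, 64 and 128 bytes), and the use of integer offsets in place of real addresses are all assumptions made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch of a heap partitioned into size classes; sizes and layout are assumed. */
final class SegregatedHeap {
    private static final int[] CLASS_SIZES = {16, 32, 64, 128};   // hypothetical classes
    private final List<Deque<Integer>> freeLists = new ArrayList<>();

    /** Carve one contiguous region per size class, each holding `slotsPerClass` slots. */
    SegregatedHeap(int slotsPerClass) {
        int base = 0;
        for (int size : CLASS_SIZES) {
            Deque<Integer> free = new ArrayDeque<>();
            for (int i = 0; i < slotsPerClass; i++) free.add(base + i * size);
            freeLists.add(free);
            base += slotsPerClass * size;
        }
    }

    /** Index of the smallest class that fits `bytes`, or -1 if the request is too large. */
    static int classIndexFor(int bytes) {
        if (bytes <= 0) throw new IllegalArgumentException("size must be positive");
        for (int i = 0; i < CLASS_SIZES.length; i++) {
            if (bytes <= CLASS_SIZES[i]) return i;
        }
        return -1;
    }

    /** Returns the offset of a free slot, or -1 if none is available in the class. */
    int allocate(int bytes) {
        int c = classIndexFor(bytes);
        if (c < 0) return -1;
        Integer slot = freeLists.get(c).pollFirst();
        return slot == null ? -1 : slot;
    }

    /** Return a slot to the free list of its class; the hole has a known, fixed size. */
    void free(int offset, int bytes) {
        int c = classIndexFor(bytes);
        if (c < 0) throw new IllegalArgumentException("size exceeds largest class");
        freeLists.get(c).addFirst(offset);
    }
}
```

Because every hole in a given region has exactly the size of that region's class, a freed slot can be reused by any later request that maps to the same class, without moving any objects.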
Further, allocation performance is critical to an effective and successful RTGC, because the garbage collector subsystem determines the speed and method with which objects are procured from the heap. Allocation speed can often be the primary bottleneck of a GC implementation, the typical problem being lock contention on the free list, though there may be other issues, including path length. The standard solution is to provide a per-thread local cache of available heap memory so that a thread can allocate an object in a small number of instructions without contending with other threads.
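By way of illustration only, a per-thread allocation cache that is refilled in batches from a shared, lock-protected free list might be sketched as follows. The names `SharedFreeList` and `ThreadLocalAllocator`, the batch size, and the use of integer slot numbers in place of heap memory are assumptions made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Shared free list guarded by a lock; counts how often the lock is taken for refills. */
final class SharedFreeList {
    private final Deque<Integer> slots = new ArrayDeque<>();
    private int refillCount;

    SharedFreeList(int slotCount) {
        for (int i = 0; i < slotCount; i++) slots.add(i);
    }

    /** Move up to `batchSize` slots into the caller's cache; returns the number moved. */
    synchronized int takeBatch(Deque<Integer> into, int batchSize) {
        refillCount++;
        int moved = 0;
        while (moved < batchSize && !slots.isEmpty()) {
            into.add(slots.poll());
            moved++;
        }
        return moved;
    }

    synchronized int refillCount() { return refillCount; }
}

/** Sketch of allocation through a thread-local cache; names are illustrative. */
final class ThreadLocalAllocator {
    private final SharedFreeList shared;
    private final int batchSize;
    private final ThreadLocal<Deque<Integer>> cache =
            ThreadLocal.withInitial(ArrayDeque::new);

    ThreadLocalAllocator(SharedFreeList shared, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batch size must be > 0");
        this.shared = shared;
        this.batchSize = batchSize;
    }

    /** Fast path: take from the local cache without locking; refill only when empty. */
    int allocate() {
        Deque<Integer> local = cache.get();
        if (local.isEmpty() && shared.takeBatch(local, batchSize) == 0) {
            return -1;   // shared free list exhausted
        }
        return local.poll();
    }
}
```

In this sketch the shared lock is taken once per batch rather than once per object, so most allocations touch only thread-private state.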