This invention relates generally to garbage collection in a virtual machine in a data processing system and more specifically to snapshot-at-the-beginning write barrier elision during program execution in a virtual machine.
Interpreted languages such as Java allow software developers to write application code in a platform-neutral fashion. (Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.) A platform-neutral implementation is achieved by running the application code in a virtual machine (VM), which hides platform differences and provides a set of common application programming interfaces for interacting with the native machine layer. The application itself compiles into a series of byte codes, which are platform independent and can be translated by the host virtual machine. The virtual machine typically also contains a just-in-time (JIT) compiler, which converts the byte codes to a dynamically compiled native representation, removing much of the interpretation overhead from the virtual machine. The virtual machine also typically contains a garbage collector. Garbage collection (GC) is a well-known storage management technique used for automated memory management, found in interpreted programming languages such as Java. Some languages, such as Java, require that a garbage collector be present, because there is no explicit language syntax for managing memory.
A typical approach to garbage collection is a Stop-The-World (STW) Mark-And-Sweep collector. The garbage collector completely halts execution of the program and traces all the live objects (a mark phase), starting from the root set (consisting mostly of threads' stack-local objects) and recursively finding the objects pointed to by the root set. After the mark phase is finished, the garbage collector sweeps the heap, for example, by visiting each object. If an object was not marked during the mark phase, the garbage collector returns the associated memory to a free memory pool.
During the mark phase, a live object passes through three states, represented by white (not yet visited), grey (marked, but the objects it points to are not yet visited; simply referred to as a marked object), and black (a marked object that has been scanned, that is, its referents have been marked; simply referred to as a scanned object). An efficient implementation typically uses a data structure referred to as a mark map to maintain information about objects being marked during a garbage collection cycle. The mark map is a highly condensed data structure in which one bit is dedicated to representing each object on the heap. Scanned objects are typically not explicitly tracked. The scanned state is an implicit state that an object passes through during the tracing process.
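The mark and sweep phases and the mark map described above can be illustrated with a minimal sketch. It assumes a hypothetical toy heap in which objects are identified by integer index and `refs[i]` lists the objects that object `i` points to; the class and method names are illustrative only and do not correspond to any particular virtual machine.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

/** Hypothetical toy heap: objects are indices, refs[i] holds the referents of object i. */
final class MarkMapSketch {

    /** Mark phase: traces from the roots and returns the mark map (one bit per object). */
    static BitSet mark(int[][] refs, int[] roots) {
        BitSet markMap = new BitSet(refs.length);  // a clear bit means white (not visited)
        Deque<Integer> grey = new ArrayDeque<>();  // marked, referents not yet visited
        for (int root : roots) {
            if (!markMap.get(root)) {
                markMap.set(root);
                grey.push(root);
            }
        }
        while (!grey.isEmpty()) {
            int obj = grey.pop();
            for (int child : refs[obj]) {
                if (!markMap.get(child)) {
                    markMap.set(child);            // white -> grey
                    grey.push(child);
                }
            }
            // obj is now black (scanned); this state is implicit and not recorded.
        }
        return markMap;
    }

    /** Sweep phase: returns the unmarked objects, whose memory would go to the free pool. */
    static BitSet sweep(BitSet markMap, int heapSize) {
        BitSet free = new BitSet(heapSize);
        free.set(0, heapSize);
        free.andNot(markMap);
        return free;
    }
}
```

In this sketch only the grey set is held in an explicit work stack; a black object is simply one whose bit is set and which is no longer on the stack, which mirrors the observation that scanned objects are not explicitly tracked.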
A stop-the-world style garbage collector has to run a garbage collection cycle to completion before allowing the application to resume, which may introduce undesirably long pauses in program execution. Stop-the-world garbage collection can be modified to operate in a concurrent or an incremental fashion. The garbage collector proactively starts execution before free memory is exhausted. The mark phase (and possibly the sweep phase as well) is performed concurrently with, or in short increments interleaved with, the application execution. Because the live set is changing while the garbage collector is performing the mark phase (for example, the object reference graph is changing), additional techniques are required to ensure that all live objects are discovered. There are essentially two techniques, both based on performing extra checks and operations on each object reference write, often referred to as a write barrier (WB). One technique is referred to as the incremental-update technique and the other as the snapshot-at-the-beginning (SATB) technique.
The snapshot-at-the-beginning technique encompasses two conditions. The two conditions ensure all objects that are live at the beginning of the garbage collection cycle and all objects allocated since the beginning of the garbage collection cycle are preserved as a part of the live set at the end of the garbage collection cycle.
The first condition is met by executing a Yuasa-style write barrier, in which any overwritten object reference that points to an unmarked object is remembered for eventual scanning before the end of the garbage collection cycle. The second condition is met by marking newly allocated objects. The snapshot-at-the-beginning style of concurrent collector is typically less throughput efficient than an incremental-update collector. A snapshot-at-the-beginning garbage collector creates more floating garbage, and its write barrier is more complex and therefore more costly to execute, but it has a bounded workload, which makes it suitable for real-time garbage collectors.
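The two conditions can be sketched as follows. This is a simplified, single-threaded illustration that models one garbage collection cycle already in progress; the `Obj` type, its `marked` flag (standing in for the object's bit in the mark map), and the `remembered` queue are assumptions introduced for the example. Marking the old referent at the moment it is remembered is also a choice made here, so that the same object is not queued twice.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of one in-progress snapshot-at-the-beginning cycle. */
final class SatbCycleSketch {

    static final class Obj {
        boolean marked;  // stands in for the object's bit in the mark map
        Obj field;       // a single reference slot
    }

    /** Overwritten referents that still have to be scanned before the cycle ends. */
    final Deque<Obj> remembered = new ArrayDeque<>();

    /** Condition 1: Yuasa-style barrier, executed before the slot is overwritten. */
    void writeField(Obj holder, Obj newValue) {
        Obj old = holder.field;
        if (old != null && !old.marked) {
            old.marked = true;      // assumption: mark now so it is remembered only once
            remembered.add(old);
        }
        holder.field = newValue;
    }

    /** Condition 2: objects allocated during the cycle are created already marked. */
    Obj allocate() {
        Obj fresh = new Obj();
        fresh.marked = true;
        return fresh;
    }
}
```

Note that the barrier inspects the value being overwritten rather than the value being stored, which is what preserves the snapshot of the object graph as it existed at the beginning of the cycle.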
Implementing the snapshot-at-the-beginning technique incurs the expense of the write barrier. In the presence of a mark map, checking whether the referent is marked involves several instructions. The operation is somewhat expensive because the program has to visit the mark map during execution, which may cause a cache miss. Further, the length of the operation, for example in terms of the number of low-level processor instructions, makes it more difficult for a just-in-time compiler to inline the code; therefore, a subroutine may be preferred, which introduces an extra pair of jump and return instructions for every field write.
When a program deals with objects containing a large number of references, for example arrays, certain relatively lightweight operations can be severely impacted by a snapshot-at-the-beginning write barrier. An example is array copying, in which the contents of one array are copied to another array, a common activity within a Java program. Typically, in a system without a write barrier, the array copy amounts to a simple memory copy. However, in a system with a snapshot-at-the-beginning write barrier, the destination slot of each copied element must be checked for an overwritten reference before the copy can take place. The checking causes a slowdown by several orders of magnitude, as every slot is read, analyzed and processed, which in turn can have a tremendous impact on performance.
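The difference between the two array copy paths can be sketched as follows. The `Obj` type and its `marked` flag are hypothetical stand-ins for heap objects and their mark map bits, and the example illustrates only the extra per-slot work, not its actual cost on any particular virtual machine.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical comparison of an array copy with and without a SATB barrier. */
final class ArrayCopyBarrierSketch {

    static final class Obj {
        boolean marked;  // stands in for the object's bit in the mark map
    }

    /** Without a write barrier the copy reduces to a bulk memory copy. */
    static void copyNoBarrier(Obj[] src, Obj[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }

    /**
     * With a SATB barrier every destination slot is read and checked before it
     * is overwritten. Returns the overwritten referents that were remembered.
     */
    static List<Obj> copyWithBarrier(Obj[] src, Obj[] dst) {
        List<Obj> remembered = new ArrayList<>();
        for (int i = 0; i < src.length; i++) {
            Obj old = dst[i];                  // extra read of the destination slot
            if (old != null && !old.marked) {  // extra mark check per element
                old.marked = true;
                remembered.add(old);
            }
            dst[i] = src[i];
        }
        return remembered;
    }
}
```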
The general rule of executing a full write barrier check on every reference write may be relaxed in different ways. Several techniques successfully speed up, reduce, or eliminate write barrier checks, typically by addressing special use cases.
In one example, a write barrier is executed on each reference slot overwrite only if garbage collection is active. In another example, a write barrier is executed on each reference slot overwrite, but special cases, for example a null overwrite, are optimized before a write barrier helper is invoked. In another example, in an array copy operation where the destination and source are the same array (effectively a shift operation, or any array rearrangement including permutations, sorting and the like), a write barrier is executed only on non-overlapping slots. This technique is safe only if no other mutators are concurrently writing to the array, which is most often the case but is typically difficult to determine programmatically at run time. In another example, when newly allocated objects have all reference slots initialized to null, as determined by static analysis, skipping the write barrier is safe. Sometimes the use case is trivial, such as in constructors, but occasionally a just-in-time compiler needs non-trivial data flow analysis to determine when a reference slot is initialized to null. In another example, a short-lived reference is covered by a long-lived reference to the same object; therefore, the write barriers for writes to the short-lived reference can be eliminated. In another example, write barriers are deferred and combined such that redundant writes to identical locations are eliminated. Because each of these techniques addresses only a special use case, improvements are still required for concurrent collection style write barrier implementations.
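The first two relaxations, checking whether garbage collection is active and filtering null overwrites before the helper is invoked, might look like the following sketch. The `gcActive` flag, the `barrierHelper` routine, and the `slowPathCalls` counter are hypothetical names used only to make the effect of the filtering observable.

```java
/** Hypothetical write barrier with inexpensive checks ahead of the out-of-line helper. */
final class BarrierFastPathSketch {

    static final class Obj {
        boolean marked;  // stands in for the object's bit in the mark map
        Obj field;       // a single reference slot
    }

    boolean gcActive;    // true only while a collection cycle is running
    int slowPathCalls;   // counts how often the helper is actually invoked

    void writeField(Obj holder, Obj newValue) {
        if (gcActive) {                // relaxation 1: no barrier work outside a cycle
            Obj old = holder.field;
            if (old != null) {         // relaxation 2: a null overwrite needs no helper
                barrierHelper(old);
            }
        }
        holder.field = newValue;
    }

    /** Stands in for the out-of-line subroutine that consults the mark map. */
    private void barrierHelper(Obj old) {
        slowPathCalls++;
        if (!old.marked) {
            old.marked = true;         // a fuller version would also remember old for scanning
        }
    }
}
```

Both checks are short enough to be inlined at the store site, so the cost of the jump and return pair is paid only when the helper is genuinely needed.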