Garbage Collection (“GC”) in computer science refers to the automated management of storage resources (typically dynamically allocated memory) by distinguishing allocations that are in use from allocations that are no longer referenced, which can be reclaimed for reuse. The purpose of any garbage collection system is to simplify software design by freeing software developers from explicitly managing object lifetimes, doing so in a way that impacts runtime performance as little as possible. Garbage collectors allow software developers to focus their efforts on the intricacies of their software's purpose rather than be burdened with the bookkeeping associated with dynamic object allocation and release.
Although GC designs vary, there are generally two design approaches: automatic reference counting (“ARC”) and object graph analysis (“OGA”). In general, an ARC implementation is straightforward but has inherent limitations, while OGA offers a complete solution at the cost of additional runtime overhead. ARC offers most of the conveniences of a fully garbage collected environment, but it cannot reclaim objects that form a reference cycle (e.g., object A references B, which references C, which references A), because the reference count of each object in the cycle never reaches zero. So while ARC can be effective and ease software development, it shifts the burden of memory management from explicit, low-level idioms to higher-level idioms that still require developers to track how and where object references are stored and to ensure that reference cycles are not formed.
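The reference cycle limitation can be illustrated with a toy reference counter. This is a minimal sketch, not the mechanism of any particular ARC runtime; the names Obj, retain, release, store_ref, and build_chain are hypothetical and introduced only for illustration.

```python
class Obj:
    """Hypothetical reference-counted object (illustration only)."""

    def __init__(self, name, freed_log):
        self.name = name
        self.refcount = 0
        self.refs = []              # outgoing references held by this object
        self.freed_log = freed_log  # records the order in which objects are freed


def retain(obj):
    obj.refcount += 1


def release(obj):
    obj.refcount -= 1
    if obj.refcount == 0:
        # The object is freed, so the references it held are released too.
        obj.freed_log.append(obj.name)
        children, obj.refs = obj.refs, []
        for child in children:
            release(child)


def store_ref(src, dst):
    """Store a reference to dst inside src, retaining dst."""
    src.refs.append(dst)
    retain(dst)


def build_chain(cyclic):
    """Build A -> B -> C (optionally C -> A), then drop the program's reference to A."""
    freed = []
    a, b, c = (Obj(name, freed) for name in "ABC")
    retain(a)              # the program's own reference to A
    store_ref(a, b)
    store_ref(b, c)
    if cyclic:
        store_ref(c, a)    # closes the cycle A -> B -> C -> A
    release(a)             # the program drops its only reference
    return freed


acyclic_freed = build_chain(cyclic=False)
cyclic_freed = build_chain(cyclic=True)
print(acyclic_freed)  # all three objects are freed
print(cyclic_freed)   # nothing is freed: every count in the cycle stays above zero
```

In the cyclic case the program no longer holds any reference to A, B, or C, yet none of them is reclaimed. This is the situation developers must avoid by hand under ARC.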
Alternatively, OGA determines which objects are no longer in use by determining which objects are no longer reachable from a set of “root” objects that are known to be anchor points for all objects in use at any given time. The GC analyzes the object graph automatically: it traverses the “live” object graph and repeats the traversal in successive cycles while user threads continue to execute. When the GC determines that an object is no longer reachable from any root object, it infers that the object's allocation is no longer in use and can be reclaimed. Importantly, OGA does not suffer from ARC's reference cycle blind spot, but it can degrade runtime performance as the live object graph, which must be repeatedly traversed, grows.
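Reachability from roots can be sketched as a plain graph traversal. The following is an illustrative, single-threaded sketch under simplifying assumptions: the object graph is modeled as a dictionary of names, and the function name reachable and the sample graph are hypothetical.

```python
def reachable(roots, edges):
    """Return the set of objects reachable from roots by following edges."""
    live = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in live:
            continue                      # already visited
        live.add(obj)
        stack.extend(edges.get(obj, ()))  # follow outgoing references
    return live


# X -> Y -> Z -> X is a reference cycle that no root can reach.
edges = {
    "root": ["A"],
    "A": ["B"],
    "B": [],
    "X": ["Y"],
    "Y": ["Z"],
    "Z": ["X"],
}

live = reachable(["root"], edges)
garbage = set(edges) - live
print(sorted(live))
print(sorted(garbage))
```

Because the cycle X, Y, Z is judged by reachability rather than by reference counts, it is identified as garbage even though each of its members is still referenced.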
OGA is regarded as “true” garbage collection since it completely frees developers from managing object lifetimes and from tracking how and where object references are stored. However, the OGA load must be amortized evenly so that active user threads remain responsive and do not experience uneven spurts of latency. OGAs that efficiently and robustly support a multithreaded runtime environment are of significant academic interest because of the inherent complexity and design tradeoffs involved, especially when one considers the operational requirements of the commercial and scientific computing scenarios that rely so heavily upon them.
“Tri-color mark and sweep,” frequently outlined in academic material, is an OGA algorithm that marks objects with “colors” (i.e., meta states) that indicate whether each object is known to be:
(a) in use (i.e., referenced directly by a GC root or indirectly by other objects in use),
(b) in use but contains object references requiring sub-traversal, or
(c) no longer in use (i.e., not referenced by any objects currently in use).
The “mark phase” of each GC cycle begins by marking the GC roots as (b) and ends when no objects remain that require further analysis or traversal. Once the mark phase is complete, any object not explicitly marked as “in use” is unreachable, is implicitly (c), and can be reclaimed. Although tri-color mark and sweep has well-acknowledged success in single-threaded environments, it does not translate well to a multithreaded environment, where user threads alter the object graph while the GC performs OGA. To see the problem, consider the GC traversing an area of the object graph that a concurrently executing user thread is simultaneously altering. For example, a user thread may store a reference to a not-yet-visited object into an object the GC has already finished traversing and then remove the only other reference to it. Unless such alterations are coordinated with the GC, an object will eventually “escape” GC traversal and be mistakenly regarded as unreachable (and thus eligible for collection). Once the garbage collector reclaims an escaped object, a halt, crash, or data corruption will occur when the “dangling” object reference is later followed to the now-reclaimed memory region. Note that this issue arises because the GC observes the object graph in a transitional state caused by concurrent user thread activity, not because the principles of live object reachability are flawed.
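The mark phase and the escape hazard can be sketched together. The sketch below uses the conventional color names, assuming the mapping black for (a), gray for (b), and white for (c). It is single-threaded: the hypothetical mutate_after hook stands in for a user thread that alters the graph partway through marking, so that the interleaving is deterministic. The names mark, make_graph, and move_c_from_b_to_a are likewise illustrative.

```python
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"  # assumed mapping: (c), (b), (a)


def mark(roots, edges, mutate_after=None):
    """Tri-color mark phase over a dict-based object graph.

    mutate_after is a hypothetical hook (n, fn): fn(edges) runs once n objects
    have been scanned, simulating a concurrent user thread with no synchronization.
    """
    color = {obj: WHITE for obj in edges}
    work = deque()
    for root in roots:
        color[root] = GRAY
        work.append(root)
    scanned = 0
    while work:
        obj = work.popleft()
        for child in edges[obj]:
            if color[child] == WHITE:
                color[child] = GRAY       # in use, still needs traversal
                work.append(child)
        color[obj] = BLACK                # in use, fully traversed
        scanned += 1
        if mutate_after and scanned == mutate_after[0]:
            mutate_after[1](edges)
    return color


def make_graph():
    return {"root": ["A", "B"], "A": [], "B": ["C"], "C": []}


def move_c_from_b_to_a(edges):
    """User-thread activity: move the only reference to C from B to A."""
    edges["A"].append("C")
    edges["B"].remove("C")


quiet = mark(["root"], make_graph())
raced = mark(["root"], make_graph(), mutate_after=(2, move_c_from_b_to_a))
print(quiet["C"])  # C is marked when the graph is not altered during marking
print(raced["C"])  # C stays white although A still references it
```

In the second run, A has already been fully traversed when it receives the reference to C, and B no longer holds that reference by the time B is scanned. C is therefore never visited and would be reclaimed while still reachable.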
To address the multithreaded hazards of tri-color mark and sweep, developers add synchronization mechanisms that prevent user threads from altering the object graph in ways that could cause objects to escape and be mistakenly reclaimed. However, frequent and repeated synchronization adds runtime overhead that degrades overall user/client code performance. Examples of synchronization mechanisms include OS-level synchronization objects, busy-waiting, object transactional analysis, and memory barriers/tripwires that signal when object graph re-marking is required. Some concurrent mark and sweep approaches use more than three colors/states but still require, and are characterized by, frequent and repeated synchronization in order to ensure GC traversal correctness. More specifically, in conventional concurrent mark and sweep GCs, the total synchronization overhead in a GC traversal cycle is proportional to object graph alteration activity (i.e., “Order N” running time). In contrast, a GC is more attractive if the total synchronization overhead for each traversal cycle is fixed and does not depend on user thread activity (i.e., “Order 1” running time). In short, conventional concurrent OGA implementations gate object graph alterations with synchronization mechanisms whose cost grows with user thread activity.
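The proportional cost of gating graph alterations can be sketched with a simple insertion-style write barrier: every reference store takes a lock and, if the target has not yet been marked, queues it for traversal. This is only one of several possible barrier designs and is not claimed to be the one any particular collector uses. The class Heap, its methods, the counter mutator_sync_ops, and the function run are hypothetical, and the interleaving is simulated on a single thread.

```python
import threading
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"


class Heap:
    """Toy heap whose reference writes are gated by a lock-based barrier."""

    def __init__(self, edges):
        self.edges = edges
        self.color = {obj: WHITE for obj in edges}
        self.work = deque()
        self.lock = threading.Lock()
        self.mutator_sync_ops = 0   # synchronized operations paid by user code

    def shade(self, roots):
        for root in roots:
            self.color[root] = GRAY
            self.work.append(root)

    def write_ref(self, src, dst):
        """User-thread store of a reference; shades dst if it is still white."""
        with self.lock:
            self.mutator_sync_ops += 1
            self.edges[src].append(dst)
            if self.color[dst] == WHITE:
                self.color[dst] = GRAY
                self.work.append(dst)

    def delete_ref(self, src, dst):
        """User-thread removal of a reference, also synchronized."""
        with self.lock:
            self.mutator_sync_ops += 1
            self.edges[src].remove(dst)

    def mark_step(self):
        """GC scans one gray object; returns False when none remain."""
        with self.lock:
            if not self.work:
                return False
            obj = self.work.popleft()
            for child in self.edges[obj]:
                if self.color[child] == WHITE:
                    self.color[child] = GRAY
                    self.work.append(child)
            self.color[obj] = BLACK
            return True


def run(extra_writes):
    heap = Heap({"root": ["A", "B"], "A": [], "B": ["C"], "C": []})
    heap.shade(["root"])
    heap.mark_step()                 # GC scans root
    heap.mark_step()                 # GC scans A
    heap.write_ref("A", "C")         # user thread moves C under A ...
    heap.delete_ref("B", "C")        # ... and away from B
    for _ in range(extra_writes):
        heap.write_ref("B", "A")     # further stores, each paying for the barrier
    while heap.mark_step():
        pass
    return heap.color["C"], heap.mutator_sync_ops


few = run(0)
many = run(100)
print(few)
print(many)
```

With the barrier in place C is marked in both runs, but the number of synchronized operations paid by user code rises from 2 to 102 as the number of reference writes rises. This is the “Order N” behavior described above.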