Generational garbage collection is a garbage collection (GC) technique that divides the heap into two or more generations, each corresponding to a specific object “age”, where the age of an object is the time elapsed since its allocation. Objects are allocated in the youngest generation and move to an older generation when their age grows beyond a given threshold. This “promotion” to the old generation is typically performed during a collection of the young generation. Generational collectors rely on the generational hypothesis that most objects die young, so GC activity consists mostly of young generation collections. These are typically much faster than performing GC over the whole heap since they only require scanning the live objects of the young generation, which is typically a small fraction of the whole heap.
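By way of illustration only, the allocation and promotion behavior described above can be sketched as follows. The names `Heap`, `Obj`, and `PROMOTION_AGE` are hypothetical, and age is approximated here by the number of young generation collections an object has survived rather than by elapsed time. The `roots` argument stands in for whatever references into the young generation a real collector would gather (thread stacks and references from older generations, for example).

```python
from dataclasses import dataclass, field

PROMOTION_AGE = 2  # hypothetical threshold: young collections survived before promotion


@dataclass(eq=False)
class Obj:
    name: str
    refs: list = field(default_factory=list)  # outgoing references to other objects
    age: int = 0                               # young collections survived so far


class Heap:
    def __init__(self):
        self.young = []
        self.old = []

    def allocate(self, name):
        obj = Obj(name)
        self.young.append(obj)  # new objects start in the young generation
        return obj

    def collect_young(self, roots):
        """Keep young objects reachable from roots; promote those old enough."""
        young_ids = {id(o) for o in self.young}
        live, stack = set(), list(roots)
        while stack:
            obj = stack.pop()
            # Only young objects are traced; old objects are never visited.
            if id(obj) in young_ids and id(obj) not in live:
                live.add(id(obj))
                stack.extend(obj.refs)
        survivors = []
        for obj in self.young:
            if id(obj) not in live:
                continue  # dead object: its space is reclaimed
            obj.age += 1
            if obj.age >= PROMOTION_AGE:
                self.old.append(obj)  # promotion to the old generation
            else:
                survivors.append(obj)
        self.young = survivors
```

In this sketch the work done by `collect_young` depends only on the size of the young generation, which is the property that makes young generation collections inexpensive relative to a collection of the whole heap.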
In a multi-tasking environment, assigning a private young generation to each task makes it possible to account clearly for the cost of the young generation collections of a single task, while making the time needed for a young generation collection proportional only to the number of live young objects allocated by that task.
FIG. 1 is a high-level diagram of a prior art generational garbage collection scheme 100 in which a heap memory for a multi-tasking environment is divided into an old generation memory 110, which is shared by multiple tasks, and young generation memories 120, 122, and 124. Young generation memories 120, 122, and 124 belong respectively to tasks 150, 152, and 154 of the multi-tasking environment. Threads 130, 132, and 134 execute on behalf of separate, individual tasks and allocate objects in the young generation of their task. Old generation memory 110 contains multiple objects of multiple tasks and a free memory space 114 which does not contain any objects. Young generation memory 120 contains live object 140, which contains a reference to object 112 in old generation memory 110. Young generation memory 120 also contains dead object 142. When garbage collection is conducted on young generation memory 120, the memory allocated to the dead objects it contains is released, while any live objects within young generation memory 120 are preserved.
Typically, garbage collection occurs by “stopping the world”, in other words, by suspending all mutator threads at points where the locations of all object references are known. This approach, however, has a negative impact on throughput, especially in a multi-tasking environment, since the threads of all running tasks are prevented from making progress although only one of these tasks needs scavenging.
Because threads of tasks other than the one performing the young generation collection cannot mutate any of the objects of the scavenging task, it should be possible to allow threads of all tasks other than the collecting task to execute concurrently with the young generation collection. However, allowing collection of the young generation of one task concurrently with the execution of other running tasks raises several problems. One potential source of problems is that threads of other tasks may allocate objects directly into the old generation while the young generation GC determines the set of references from objects in the old generation to objects of the young generation being collected. Allocations directly in the old generation can happen for several reasons: because an object is too large to fit in the young generation; because the object is known by the VM to be long-lived; or because the object can be safely shared across multiple tasks (for instance, an immutable string) in order to optimize space usage. These concurrent allocations may be problematic if the implementation of generational GC uses an imprecise write-barrier implementation, such as those based on card tables. A write barrier is a runtime mechanism that performs (potentially conditionally) an action upon every write to an object. In the case of generational GCs, the write barrier performs some action to keep up to date a remembered set of the locations in the old generation that contain a reference into a young generation. The remembered set is used during a young generation GC as part of its set of roots.
One of the most popular implementations of remembered sets uses card tables. The heap space is logically divided into regions called cards, whose size is a power of two. An array of card states keeps track of which cards may contain a reference into a young generation. Each card state is typically a single byte holding one of two values: dirty or clean. Given these data structures, a write barrier consists simply of right-shifting the address of the memory location where a reference is to be written to obtain an index into the array of card states, and setting the state at that index to dirty. Thus, the write barrier costs between 2 and 4 instructions, depending on the implementation details of a particular virtual machine and processor architecture. The information recorded in card tables is imprecise with respect to the exact location of a reference. A young generation GC therefore determines its set of roots by scanning the array of card states to find dirty cards, and then scanning all the objects in the dirty cards for references into the young generation.
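A minimal sketch of such a card table is given below for illustration. The constants `CARD_SHIFT`, `HEAP_BASE`, and `HEAP_SIZE` and the function names are assumptions chosen for the example and do not reflect any particular virtual machine. The sketch subtracts the heap base before shifting so that the index falls within the array; an actual implementation may instead bias the table pointer to avoid that subtraction.

```python
CARD_SHIFT = 9            # hypothetical: 512-byte cards (2**9)
HEAP_BASE = 0x10000       # hypothetical start address of the heap
HEAP_SIZE = 64 * 1024     # hypothetical heap size in bytes
CLEAN, DIRTY = 0, 1

# One state byte per card, all initially clean.
card_table = bytearray(HEAP_SIZE >> CARD_SHIFT)


def card_index(address):
    """Map a heap address to the index of the card that covers it."""
    return (address - HEAP_BASE) >> CARD_SHIFT


def write_barrier(slot_address):
    """Unconditionally mark dirty the card covering the written slot."""
    card_table[card_index(slot_address)] = DIRTY


def dirty_card_ranges():
    """Yield the (start, end) address range of every dirty card."""
    for index, state in enumerate(card_table):
        if state == DIRTY:
            start = HEAP_BASE + (index << CARD_SHIFT)
            yield start, start + (1 << CARD_SHIFT)
```

Two writes that land in the same card leave a single dirty entry, which illustrates the imprecision noted above: the table records that some slot within the card was written, not which one, so the collector must scan every object in the card.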
Scanning dirty cards during a young generation GC performed on behalf of one task, while allowing other tasks to allocate in the old generation concurrently, is problematic because a card may contain one or more objects allocated by a concurrent task that are not yet fully initialized. Linear scanning of a portion of the old generation relies on precisely determining the size of the objects being scanned, which in turn requires the objects to be fully initialized (so that their type, and therefore their size, can be determined).
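The dependence of linear scanning on object initialization can be illustrated with the following sketch. The memory layout (a header word naming the type, followed by the object's fields), the `SIZE_BY_TYPE` table, and the `UninitializedObject` exception are all hypothetical and serve only to make the failure mode concrete.

```python
# Hypothetical layout: each object is a header word naming its type, followed
# by its fields. The scanner learns an object's size only from its type.
SIZE_BY_TYPE = {"Pair": 3, "Box": 2}  # size in words, header included


class UninitializedObject(Exception):
    """Raised when the scan reaches a header that does not name a known type."""


def scan_objects(memory, start, end):
    """Walk memory[start:end] object by object and return each object's offset."""
    offsets, pos = [], start
    while pos < end:
        header = memory[pos]
        if header not in SIZE_BY_TYPE:
            # Without a valid type the size is unknown, so the scan cannot
            # determine where the next object begins.
            raise UninitializedObject(f"cannot size object at word {pos}")
        offsets.append(pos)
        pos += SIZE_BY_TYPE[header]
    return offsets
```

For example, with `memory = ["Pair", 0, 0, "Box", 0, None, None, None]`, where the trailing `None` words represent space claimed by a concurrent allocation whose header has not yet been written, a scan bounded at word 5 visits the two initialized objects, whereas a scan bounded at word 8 fails on reaching the uninitialized header.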
What is needed is a method for determining a consistent end-of-scan position in the old generation memory such that a scavenging task is guaranteed to scan only fully initialized objects.