This invention relates to automatic reclamation of allocated, but unused memory, or garbage, in a computer system that uses a space-incremental garbage collector to process an object space concurrently with the operation of non-collection threads. Memory reclamation may be carried out by a special-purpose garbage collection algorithm that locates and reclaims memory that is unused, but has not been explicitly de-allocated. There are many known garbage collection algorithms, including reference counting, mark-sweep, mark-compact and generational garbage collection algorithms. These, and other garbage collection techniques, are described in detail in a book entitled “Garbage Collection, Algorithms for Automatic Dynamic Memory Management” by Richard Jones and Raphael Lins, John Wiley & Sons, 1996.
However, many of the aforementioned garbage collection techniques lead to long and unpredictable delays, because normal processing must be suspended during the garbage collection process (called “stop the world” or STW processing) and these collectors at least occasionally scan the entire heap. The garbage collection process is performed by collection threads, which do collection work while all other threads are stopped; non-collection threads perform tasks for the application. Such collectors are therefore generally not suitable in situations, such as real-time or interactive systems, where non-disruptive behavior is of greatest importance. Conventional generational collection techniques alleviate these delays somewhat by concentrating collection efforts on small memory areas, called “young” generations, in which most of the activity occurs. This concentration reduces the need for collecting the remaining large memory area, called the “old” or “mature” generation, and thus reduces the time consumed during garbage collection, but does not eliminate it.
When the mature generation is eventually collected, many generational techniques lead to pauses in normal operation that, while less frequent, are still highly disruptive. One approach to eliminating these long pauses is to apply a space-incremental technique to regions in the heap containing older objects. Space-incremental collection techniques allow a subset of objects in the heap to be collected and evacuated independently of the rest of the heap. A given subset consists of one or more possibly noncontiguous regions and forms the collection set. Examples of such techniques include the Train algorithm as described in “Incremental Collection of Mature Objects”, R. L. Hudson, J. E. B. Moss, Proceedings of the International Workshop on Memory Management, volume 637 of Lecture Notes in Computer Science, St. Malo, France, pp 388-403, 1992, Springer-Verlag, London, Great Britain; the Garbage-First algorithm as described in “Garbage-First Garbage Collection”, D. Detlefs, C. Flood, S. Heller, A. Printezis, Proceedings of the 4th International Symposium on Memory Management, pp 37-48, 2004; and other techniques allowing partial compaction of the heap as described in “An Algorithm for Parallel Incremental Compaction”, O. Ben-Yitzhak, I. Goft, E. K. Kolodner, K. Kuiper, V. Leikehman, Proceedings of the 3rd International Symposium on Memory Management, pp 100-105, 2002.
As an example, the Train algorithm divides the generation's memory into a number of fixed-sized regions, or car sections, and orders the car sections. During at least some of the collection pauses, one or more of the cars lowest in the overall order are collected; these form the collection set. Using this ordering and careful placement policies, the algorithm allows the size of collection sets to be bounded to achieve acceptable pause times even as it guarantees that unreachable data structures too large to fit into a collection set will be isolated in single trains, and once there, reclaimed as a group. During the operation of the algorithm, objects in a car may be evacuated, or relocated, to other cars. When an object is relocated, references to that object located outside of the collection region must be changed to point to the new object location.
To facilitate the step of finding the references to objects in a region, many space-incremental collectors use “remembered sets.” In particular, a remembered set is associated with each region and tracks memory locations containing references that point to objects in that region. Memory locations are used instead of the references themselves for two reasons. First, if the referent moves, the referring pointers must be located so that they can be updated to point to the new location. Second, since remembered sets are built over time, the recorded references may be stale by the time they are used (that is, they may no longer point to objects in the region associated with that remembered set). Remembered sets may be implemented with various data structures that can be used to represent a set, such as bitmaps, hash tables and bags. In addition, different collection techniques may have different kinds of remembered sets. For example, generational collectors typically keep track of old generation to young generation references, but do not track young generation to old generation references. They either collect the smaller young generation as part of collecting the old generation, or scan all of its objects searching for such pointers. The Train algorithm associates a remembered set with each region, identifying references into the region from locations outside the region.
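The per-region remembered set described above can be illustrated with a minimal sketch. It is not taken from any particular collector: the `Heap`, `Obj`, `write` and `live_entries` names are hypothetical, objects are identified by small integers rather than real addresses, and a "slot" (holder object plus field name) stands in for a memory location. The sketch shows why locations rather than pointer values are recorded: a slot can be re-read later, which both exposes stale entries and gives the collector a place to write an updated pointer.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# A slot is a memory location that holds a reference: (holder object id, field name).
Slot = Tuple[int, str]


@dataclass
class Obj:
    oid: int
    region: int
    fields: Dict[str, Optional[int]] = field(default_factory=dict)  # field name -> referent id


class Heap:
    """Toy heap (hypothetical model) with one remembered set per region."""

    def __init__(self) -> None:
        self.objects: Dict[int, Obj] = {}
        self.remembered: Dict[int, Set[Slot]] = {}  # region -> slots that may point into it

    def new(self, oid: int, region: int) -> Obj:
        obj = Obj(oid, region)
        self.objects[oid] = obj
        self.remembered.setdefault(region, set())
        return obj

    def write(self, holder: int, name: str, target: Optional[int]) -> None:
        """Store a reference; if it crosses regions, remember the slot (a write barrier)."""
        src = self.objects[holder]
        src.fields[name] = target
        if target is None:
            return
        dst = self.objects[target]
        if dst.region != src.region:
            self.remembered[dst.region].add((holder, name))

    def live_entries(self, region: int) -> Set[Slot]:
        """Re-read every recorded slot and keep only those still pointing into the region."""
        result: Set[Slot] = set()
        for holder, name in self.remembered[region]:
            target = self.objects[holder].fields.get(name)
            if target is not None and self.objects[target].region == region:
                result.add((holder, name))
        return result
```

In this sketch, overwriting a field leaves the old entry in the previous target region's remembered set; the entry is only discovered to be stale when `live_entries` re-reads the slot, which mirrors the second reason given above for recording locations.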
When live objects are evacuated from regions in the collection set, the live objects are identified by finding references from outside the collection set into it. These references may come from root locations outside the heap, or from objects in regions outside the collection set. The remembered sets of the collection set regions enable the identification of pointers from elsewhere in the heap, so that live objects are identified and pointers to them are updated. In addition, the evacuated objects are scanned in their new locations and their references to objects in other regions or generations are duly recorded in the appropriate remembered sets or data structures.
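The evacuation steps just described can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of any specific collector: the `evacuate` function, the `Obj` class, the integer "addresses", and the dictionary-based heap are all hypothetical, and the whole collection set is copied into a single destination region `to_region` that is assumed not to be part of the collection set.

```python
from typing import Dict, List, Optional, Set, Tuple

Slot = Tuple[int, int]  # (holder address, field index): a location that holds a reference


class Obj:
    def __init__(self, region: int, fields: List[Optional[int]]) -> None:
        self.region = region
        self.fields = fields  # each entry is a referent address or None


def evacuate(heap: Dict[int, Obj], roots: Dict[str, Optional[int]],
             remsets: Dict[int, Set[Slot]], cset: Set[int], to_region: int) -> None:
    """Copy live objects out of the regions in cset, then reclaim those regions."""
    forwarding: Dict[int, int] = {}   # old address -> new address
    worklist: List[int] = []          # evacuated copies still to be scanned
    next_addr = max(heap, default=0) + 1

    def in_cset(addr: Optional[int]) -> bool:
        return addr is not None and heap[addr].region in cset

    def copy(addr: int) -> int:
        nonlocal next_addr
        if addr not in forwarding:
            heap[next_addr] = Obj(to_region, list(heap[addr].fields))
            forwarding[addr] = next_addr
            worklist.append(next_addr)
            next_addr += 1
        return forwarding[addr]

    # 1. Root locations outside the heap that point into the collection set.
    for name, addr in list(roots.items()):
        if in_cset(addr):
            roots[name] = copy(addr)

    # 2. Remembered-set slots held by objects outside the collection set.
    for region in cset:
        for holder, idx in remsets.get(region, set()):
            if holder not in heap or heap[holder].region in cset:
                continue  # holder is gone, or will be reached by tracing if it is live
            ref = heap[holder].fields[idx]
            if in_cset(ref):  # skip stale entries
                heap[holder].fields[idx] = copy(ref)
                remsets.setdefault(to_region, set()).add((holder, idx))

    # 3. Scan evacuated objects in their new locations.
    while worklist:
        addr = worklist.pop()
        obj = heap[addr]
        for idx, ref in enumerate(obj.fields):
            if ref is None:
                continue
            if in_cset(ref):
                ref = copy(ref)
                obj.fields[idx] = ref
            if heap[ref].region != to_region:
                # Record the new slot in the remembered set of the referent's region.
                remsets.setdefault(heap[ref].region, set()).add((addr, idx))

    # 4. Reclaim the collection set and drop slots that lived in reclaimed objects.
    for addr in [a for a, o in heap.items() if o.region in cset]:
        del heap[addr]
    for region in cset:
        remsets.pop(region, None)
    for slots in remsets.values():
        slots.difference_update({s for s in slots if s[0] not in heap})
```

Objects in the collection set that are reachable only from other collection-set objects that are themselves unreachable are never copied, so they are reclaimed in step 4 without being visited.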
Because non-collection threads are generally suspended while a space-incremental collector performs a collection, such collectors can still introduce considerable delays. One of the largest sources of these delays is the time taken to scan the remembered sets, because the sets can be quite large and every recorded reference must be scanned.
In some algorithms, remembered set size can be reduced because the collection sets are collected in a fixed order. For example, for reasons related to achieving collection completeness (the property that all garbage is eventually collected), the Train algorithm maintains a total order on regions, and always collects the “oldest” region before younger regions. This order allows a remembered set for a region to only record references from regions younger in the collection order because regions that are older in the collection order will be collected before (or at the same time as) that region.
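The effect of a fixed collection order on remembered set size can be illustrated with a small sketch. The `OrderedRememberedSets` class, its `record` method, and the convention that a lower order number means "collected earlier" are assumptions made for illustration only; they are not drawn from the Train algorithm's published description.

```python
from typing import Dict, Set, Tuple

Slot = Tuple[str, str]  # (holder object name, field name); hypothetical slot encoding


class OrderedRememberedSets:
    """Remembered sets that exploit a total collection order over regions."""

    def __init__(self, collection_order: Dict[str, int]) -> None:
        # Assumption: a lower number means the region is collected earlier ("older").
        self.order = collection_order
        self.sets: Dict[str, Set[Slot]] = {region: set() for region in collection_order}
        self.skipped = 0  # cross-region references that did not need an entry

    def record(self, src_region: str, slot: Slot, dst_region: str) -> bool:
        """Record slot in dst_region's set only if src_region is younger than dst_region."""
        if src_region == dst_region:
            return False  # intra-region references are never remembered
        if self.order[src_region] < self.order[dst_region]:
            # The source is collected before the target, so by the time the target
            # is collected this slot has already been processed; no entry is needed.
            self.skipped += 1
            return False
        self.sets[dst_region].add(slot)
        return True
```

With three regions ordered `car0`, `car1`, `car2`, a reference from `car2` into `car0` is recorded, while a reference from `car0` into `car2` is skipped. How many entries are saved therefore depends entirely on the direction of the program's cross-region references relative to the order.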
However, the region collection order may have little or nothing to do with the pattern of references from one region to another. For some programs, region collection ordering may greatly decrease the number of remembered set entries; for other programs, the number of entries may not change much at all; and for still other programs, region collection ordering may eliminate only some intermediate fraction of the remembered set entries. Further, some collectors do not inherently impose a region collection order on the regions. An example of such a collector is the Garbage-First collection algorithm discussed above. In a collector of this type, there is no region collection ordering already present for another reason that could be exploited to reduce remembered set size.