1. Field of Invention
The invention relates generally to the management of dynamically allocated memory in a computer system. More particularly, the invention relates to tracking references between separate sections of memory associated with a computer system such that automatic storage-reclamation may be performed locally.
2. Description of the Relevant Art
The amount of memory associated with a computer system is typically limited. As such, memory must generally be conserved and recycled. Many computer programming languages enable software developers to dynamically allocate memory within a computer system. Some programming languages require explicit manual deallocation of previously allocated memory, which may be complicated and prone to error. Languages which require explicit manual memory management include the C and C++ programming languages. Other programming languages utilize automatic storage-reclamation to reclaim memory that is no longer necessary to ensure the proper operation of computer programs which allocate memory from the reclamation system. Such automatic storage-reclamation systems reclaim memory without explicit instructions or calls from computer programs which were previously utilizing the memory.
In object-oriented or object-based systems, the typical unit of memory allocation is commonly referred to as an object or a memory object, as will be appreciated by those skilled in the art. Objects which are in use are generally referred to as "live" objects, whereas objects which are no longer needed to correctly execute computer programs are typically referred to as "garbage" objects. The act of reclaiming garbage objects is commonly referred to as garbage collection, while an automatic storage-reclamation system is often referred to as a garbage collector. Computer programs which use automatic storage-reclamation systems are known as mutators due to the fact that such computer programs can change live memory objects during execution. Computer programs written in languages such as the Java™ programming language (developed by Sun Microsystems, Inc. of Palo Alto, Calif.) and the Smalltalk programming language use garbage collection to automatically manage memory.
Objects typically contain references to other objects. As such, an area of computer memory which is managed by a garbage collector will generally contain a set of objects which reference one another. FIG. 1 is a diagrammatic representation of an area of computer memory which contains objects. A managed area of memory 10, which is typically a heap associated with a computer system, includes objects 20. In general, an object 20 may be referenced by other objects 20. By way of example, object 20a has a pointer 24a to object 20b. Object 20c also has a pointer 24b to object 20b. As such, object 20b is referenced by both objects 20a and 20c.
A number of external references into memory 10 may also exist. As shown, a fixed root 30 which is external to memory 10 includes pointers 34 to objects, e.g., objects 20a and 20c, located in memory 10. All objects, as for example objects 20a-d, which may be reachable by following references from fixed root 30 are considered to be live objects. By contrast, object 20e, which is not reachable by following references from fixed root 30, is characterized as a garbage object.
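By way of illustration only, the reachability rule described above may be sketched in Python as follows. The dictionary model of the heap is hypothetical. Only pointers 24a and 24b and the root pointers to objects 20a and 20c are taken from the description; the edge from object 20b to object 20d is an assumption made so that object 20d is reachable, as the description states.

```python
# Hypothetical model of FIG. 1: each object maps to the objects it points to.
heap = {
    "20a": ["20b"],   # pointer 24a
    "20b": ["20d"],   # assumed edge, not described in the text
    "20c": ["20b"],   # pointer 24b
    "20d": [],
    "20e": [],        # nothing points to 20e
}
fixed_root = ["20a", "20c"]  # pointers 34 from fixed root 30


def reachable(heap, roots):
    """Return the set of objects reachable by following references from roots."""
    live = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in live:
            continue  # already visited; avoids looping on cycles
        live.add(obj)
        worklist.extend(heap[obj])
    return live


live = reachable(heap, fixed_root)
garbage = set(heap) - live
print(sorted(live), sorted(garbage))
```

Under these assumptions, objects 20a-d are found to be live and object 20e is identified as garbage.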
Garbage collectors are typically implemented to identify garbage objects such as object 20e. In general, garbage collectors may operate using a number of different algorithms. Conventional garbage collection algorithms include reference counting collectors, mark sweep collectors, and copying collectors, as described in Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins (John Wiley & Sons Ltd., 1996), which is incorporated herein by reference in its entirety. As will be appreciated by those skilled in the art, during garbage collection, when objects 20 are moved, references to objects 20 must be adjusted accordingly.
It is often beneficial to separate a managed memory area into smaller sections to enable garbage collection to be performed locally in one area at a time. One memory partitioning scheme is generational garbage collection, in which objects are separated based upon their lifetimes as measured from the time the objects were created. Generational garbage collection is described in more detail in above-referenced Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins (John Wiley & Sons Ltd., 1996). "Younger" objects have been observed as being more likely to become garbage than "older" objects. As such, generational garbage collection may be used to increase the overall efficiency of memory reclamation.
FIG. 2 is a diagrammatic representation of an interface between a root and memory which is partitioned into a new generation and an old generation. A memory 110, which is typically a heap that is associated with a computer system, includes a new generation 110a and an old generation 110b. A fixed root 114, or a global object which references objects within either or both new generation 110a and old generation 110b, includes pointers 116 to objects 120 in new generation 110a, as shown. Root 114 may be located on a stack, as will be appreciated by those skilled in the art.
Some objects 120 within new generation 110a, as for example object 120a, may also be considered as roots, since object 120a is assumed to be live and includes a pointer 122 to another object 120d. When new generation object 126 is live and points to old generation object 128, garbage collection performed in old generation 110b does not generally "collect" object 128. However, if new generation object 126 is dead, a garbage collection performed in new generation 110a will result in old generation object 128 becoming unreachable, since old generation object 128 will not be pointed to by any other object. If old generation object 128 is unreachable, then garbage collection performed in old generation 110b will result in the collection of old generation object 128. It should be appreciated that pointer 130, which points between new generation object 126 and old generation object 128, is considered to be an inter-generational pointer, since pointer 130 spans both new generation 110a and old generation 110b. When pointer 130 points from new generation object 126 to old generation object 128, old generation object 128 is considered to be tenured garbage, as old generation object 128 is not collectable using new generation garbage collection.
Another memory partitioning scheme involves separating memory into smaller areas in order to reduce the amount of time required for a single garbage collection to be performed. Pauses caused by garbage collection often tend to disrupt an associated mutator and are, therefore, undesirable. In some systems, the garbage collector may be able to provide a guaranteed small maximum pause duration. Such garbage collectors are known as real-time garbage collectors. In other systems, the garbage collector may attempt to keep pause times small, but may fail to do so in some situations. Garbage collectors which attempt to keep pause times small are known as non-disruptive or incremental garbage collectors.
In order to operate on an individual memory area, a garbage collector must have knowledge of all references into that area. References into an area are referred to as roots for that area. It should be appreciated that roots may include both external references, e.g., fixed roots, and references from other areas of computer memory. Accordingly, garbage collectors generally provide mechanisms for finding and tracking roots, or references.
One method of locating references into a memory area involves scanning through all objects in memory. For most systems, scanning through all objects in memory is prohibitively time-consuming. As such, a more elaborate tracking scheme is often required.
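By way of illustration only, the full-scan approach may be sketched in Python as follows. The object names, the area labels, and the function `roots_by_full_scan` are hypothetical. The sketch visits every object in memory, which is why the cost grows with the size of the whole memory rather than with the size of the area being collected.

```python
def roots_by_full_scan(references, area_of, target_area):
    """Return (holder, referent) pairs that point into target_area from outside it."""
    roots = []
    for holder, referents in references.items():  # visits every object in memory
        if area_of[holder] == target_area:
            continue  # references held inside the area are not roots for it
        for referent in referents:
            if area_of[referent] == target_area:
                roots.append((holder, referent))
    return roots


# Hypothetical memory with two areas.
references = {"a": ["b"], "b": [], "c": ["b", "d"], "d": []}
area_of = {"a": "area1", "b": "area2", "c": "area1", "d": "area1"}

roots = roots_by_full_scan(references, area_of, "area2")
print(roots)
```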
Whenever a mutator stores a reference into an object, additional processing may be implemented in order to track references for garbage collection purposes. This additional processing is known as a write barrier or store check. In order for efficiency of the mutator to be maintained at an acceptable level, the costs associated with the write barrier must be kept as low as possible.
One way of tracking references into a memory area involves maintaining a set which holds all roots for the area. Such a set is generally known as a remembered set for the area, as is described in above-referenced Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins (John Wiley & Sons Ltd., 1996). The associated write barrier will detect when a reference into an area is stored. Accordingly, the write barrier will insert the location of the reference into the remembered set associated with the area.
FIG. 3 is a diagrammatic representation of pointers between objects in a new generation and objects in an old generation which are tracked using a remembered set. A memory 302 is divided into a new generation 302a and an old generation 302b. A remembered set 304 is used to track pointers 314 which point from old generation objects 310 to new generation objects 312. The address of old generation object 310a is stored in remembered set 304, as old generation object 310a includes pointer 314a to new generation object 312b. Similarly, the address of old generation object 310b, which includes pointers 314b and 314c to new generation objects 312b and 312a, respectively, is also stored in remembered set 304.
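By way of illustration only, a write barrier which maintains a remembered set for the arrangement of FIG. 3 may be sketched in Python as follows. The class `Heap`, its methods, and the field names `f0` and `f1` are hypothetical. As in FIG. 3, the sketch records the old generation object which holds the pointer; an implementation might instead record the exact location of the stored reference.

```python
class Heap:
    """Hypothetical two-generation heap with a remembered set for the new generation."""

    def __init__(self):
        self.generation = {}          # object name -> "old" or "new"
        self.fields = {}              # object name -> {field name: referent}
        self.remembered_set = set()   # old objects holding pointers into the new generation

    def allocate(self, name, generation):
        self.generation[name] = generation
        self.fields[name] = {}

    def store(self, holder, field, referent):
        """Store a reference, running the write barrier as part of the store."""
        if self.generation[holder] == "old" and self.generation[referent] == "new":
            self.remembered_set.add(holder)  # write barrier: remember the holder
        self.fields[holder][field] = referent


heap = Heap()
for name in ("310a", "310b"):
    heap.allocate(name, "old")
for name in ("312a", "312b"):
    heap.allocate(name, "new")

heap.store("310a", "f0", "312b")  # pointer 314a
heap.store("310b", "f0", "312b")  # pointer 314b
heap.store("310b", "f1", "312a")  # pointer 314c
print(sorted(heap.remembered_set))
```

In this sketch a Python set is used, so a holder that is stored into repeatedly is recorded only once, at the cost of a hash lookup on each qualifying store. The sketch makes no attempt to remove an entry when the recorded pointer is later overwritten.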
Locating roots for garbage collection is straightforward when a remembered set is used, since the remembered set will contain all of the roots. However, the use of a write barrier is often expensive, as excess memory may be required. Also, when a new location is to be inserted into a remembered set, it is possible that the particular location may already be present within the remembered set. Checking the remembered set for duplicate locations prior to inserting a location is expensive. On the other hand, eliminating a check for duplicate locations may cause the remembered set to grow unnecessarily large. As will be appreciated by those skilled in the art, there is generally no upper bound on the number of duplicates which a remembered set may hold. When a reference to a location is to be stored, the location often already contains a previous reference. As a result, a remembered set entry that is associated with the location prior to a storage operation will generally become invalid following the storage operation. Removing an old entry may prove to be a costly operation in some situations, whereas leaving an old entry in place may result in the occupation of excess memory by a remembered set.
Another scheme which is often used for tracking references into a memory area is known as card marking. Card marking is described in above-referenced Garbage Collection: Algorithms for Automatic Dynamic Memory Management by Richard Jones and Rafael Lins (John Wiley & Sons Ltd., 1996). In general, card marking involves conceptually dividing memory into relatively small parts called cards. A garbage collector will then allocate an array of bits with one entry per card. When a reference is stored, the write barrier will compute the corresponding card array entry and set the associated bit. This process is known as "dirtying" the card. It should be appreciated that for efficiency reasons, a byte array or a word array is often used in lieu of a bit array.
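By way of illustration only, a card marking write barrier may be sketched in Python as follows. The card size of 512 bytes, the heap size, and the dictionary model of memory are assumptions chosen for the example. A byte array with one entry per card is used in place of a bit array.

```python
CARD_SIZE = 512          # bytes per card (assumed value)
HEAP_SIZE = 64 * 1024    # heap size in bytes (assumed value)

card_table = bytearray(HEAP_SIZE // CARD_SIZE)  # one byte per card, 0 means clean
memory = {}                                     # hypothetical heap: address -> reference


def write_barrier(address):
    """Dirty the card which covers the address that received a reference."""
    card_table[address // CARD_SIZE] = 1


def store_reference(address, reference):
    """Store a reference at an address and run the write barrier."""
    memory[address] = reference
    write_barrier(address)


for address in (0, 511, 512, 5000):
    store_reference(address, "some object")

dirty_cards = [card for card, dirty in enumerate(card_table) if dirty]
print(dirty_cards)
```

The barrier amounts to one integer division and one unconditional byte store, and the card table has a fixed size regardless of how many references are stored.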
One advantage of using card marking is that the write barrier is relatively cheap, as described in A Fast Write Barrier for Generational Garbage Collectors, by Urs Holzle (OOPSLA/ECOOP '93 Workshop on Garbage Collection in Object-Oriented Systems, October 1993), which is incorporated herein by reference in its entirety. Another advantage of using card marking is that the amount of memory required for the card array is fixed. However, the processing required for locating roots at garbage collection time is often significant, e.g., more processing is required than would be for a system which utilizes a remembered set. The garbage collector only knows approximately where reference stores have occurred. That is, the garbage collector only knows where the card marking array has dirty entries. Once a dirty entry is identified, the corresponding cards must be scanned for roots. Although scanning a card for roots is generally much cheaper than scanning an entire memory, scanning the card is still often costly. As will be understood by those skilled in the art, when a garbage collector locates a root, the corresponding card array entry must be kept dirty in order for the root to be located the next time garbage collection is invoked.
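By way of illustration only, the scanning of dirty cards at garbage collection time may be sketched in Python as follows. The card size, the dictionary model of memory, the predicate `in_target_area`, and the choice to clean a scanned card in which no root is found are all assumptions made for the example.

```python
CARD_SIZE = 512  # bytes per card (assumed value)


def scan_dirty_cards(card_table, memory, in_target_area):
    """Return the addresses of roots found in dirty cards.

    A card which still holds a root is kept dirty so that the root can be
    located again at the next collection; other scanned cards are cleaned.
    """
    roots = []
    for card, dirty in enumerate(card_table):
        if not dirty:
            continue  # clean cards are skipped without being scanned
        start = card * CARD_SIZE
        found = [addr for addr in range(start, start + CARD_SIZE)
                 if addr in memory and in_target_area(memory[addr])]
        roots.extend(found)
        card_table[card] = 1 if found else 0
    return roots


# Hypothetical layout: addresses 0-4095 are tracked by the card table, and
# the area being collected occupies addresses 4096 and above.
card_table = bytearray(8)
memory = {16: 5000, 600: 100, 1030: 4200}   # address -> referenced address
for card in (0, 1, 2, 3):
    card_table[card] = 1                     # dirtied earlier by the write barrier

roots = scan_dirty_cards(card_table, memory, lambda ref: ref >= 4096)
print(roots, list(card_table))
```

Every address within each dirty card is examined, including cards which turn out to hold no roots, which illustrates why root location may be costly under card marking.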
A combined scheme which uses remembered sets and card marking is described in Remembered Sets Can Also Play Cards, by Antony L. Hosking and Richard L. Hudson (OOPSLA/ECOOP '93 Workshop on Garbage Collection in Object-Oriented Systems, October 1993), which is incorporated herein by reference in its entirety. A write barrier will perform card marking as described above. However, at garbage collection time, the garbage collector will construct remembered sets for each card. When garbage collection is completed, the card marking array may then be cleared. Using a combined remembered set and card marking scheme reduces the amount of scanning required for subsequent garbage collections, thereby increasing the overall efficiency of garbage collection processes. However, the implementation of such schemes is often complicated. Therefore, what is desired is a method and an apparatus for efficiently implementing a combined remembered set and card marking scheme.
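By way of illustration only, the combined scheme may be loosely sketched in Python as follows. The sketch is not taken from the referenced paper. The card size, the dictionary `remembered` which maps a card index to the set of root addresses found in that card, and the predicate `in_target_area` are assumptions made for the example.

```python
CARD_SIZE = 512  # bytes per card (assumed value)


def collect_roots(card_table, remembered, memory, in_target_area):
    """Rebuild remembered sets for dirty cards, clear those cards, and return all roots."""
    for card, dirty in enumerate(card_table):
        if not dirty:
            continue  # the existing remembered set for a clean card is still valid
        start = card * CARD_SIZE
        entries = {addr for addr in range(start, start + CARD_SIZE)
                   if addr in memory and in_target_area(memory[addr])}
        if entries:
            remembered[card] = entries
        else:
            remembered.pop(card, None)
        card_table[card] = 0  # the card marking array is cleared after scanning
    return sorted(addr for entries in remembered.values() for addr in entries)


# Hypothetical layout: the area being collected occupies addresses 4096 and above.
card_table = bytearray(8)
remembered = {}
memory = {16: 5000, 1030: 4200}          # address -> referenced address
card_table[0] = card_table[2] = 1        # dirtied earlier by the write barrier

first = collect_roots(card_table, remembered, memory, lambda ref: ref >= 4096)

memory[16] = 100                         # mutator overwrites a reference
card_table[0] = 1                        # and the write barrier dirties card 0
second = collect_roots(card_table, remembered, memory, lambda ref: ref >= 4096)
print(first, second)
```

In the second collection only card 0 is rescanned; the root at address 1030 is supplied by the remembered set built for card 2 during the first collection.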