Computer systems may manage computer memory dynamically. Dynamic memory management refers to the process by which blocks of memory are allocated temporarily for a specific purpose and then deallocated when no longer needed for that purpose. Deallocated blocks are available for reallocation for another purpose. The process that dynamically manages the memory is referred to as the memory manager. The memory that the memory manager manages is referred to as a "heap." When a program needs a block of memory to store data, the program sends a request to the memory manager. The memory manager allocates a block of memory in the heap to satisfy the request and sends a pointer to the block of memory to the program. The program can then access the block of memory through the pointer.
In the case of programs written in certain languages, such as C++, blocks of memory (a block of memory is often referred to as an object) can be allocated automatically or dynamically. Automatic objects are automatically allocated when a procedure is entered and automatically deallocated when the procedure is exited. Conversely, dynamic objects are allocated by an explicit call to the memory manager and deallocated by either an explicit call to the memory manager or automatically through a technique known as garbage collection. Typically, automatic objects are stored in a stack and dynamic objects are stored in a heap.
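By way of illustration only, the distinction between automatic and dynamic objects can be sketched in C++ as follows. The type `Widget`, its `live` counter, and the function names are hypothetical and serve only to make object lifetimes observable; garbage collection is not shown.

```cpp
#include <cassert>

// Hypothetical type used only for illustration; it counts live instances.
struct Widget {
    static int live;   // number of Widget objects currently allocated
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Returns the number of live Widgets observed inside the procedure.
int automaticExample() {
    Widget w;              // automatic object: allocated for this procedure
    return Widget::live;   // w is deallocated automatically on exit
}

// Allocates a dynamic object by an explicit request to the memory manager.
Widget* dynamicAllocate() {
    return new Widget;     // dynamic object: stored in the heap
}

// Deallocates a dynamic object by an explicit request to the memory manager.
void dynamicRelease(Widget* p) {
    delete p;
}

// Usage sketch: the automatic object disappears on procedure exit, while
// the dynamic object persists until it is explicitly released.
void lifetimeDemo() {
    assert(automaticExample() == 1);
    assert(Widget::live == 0);
    Widget* p = dynamicAllocate();
    assert(Widget::live == 1);
    dynamicRelease(p);
    assert(Widget::live == 0);
}
```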
A program can access a dynamic object only through a pointer. A pointer is a memory location that contains the address of an object. If the program overwrites the pointer, then the corresponding object becomes "inaccessible" to the program. An object may be pointed to by several pointers. Only when all the pointers are overwritten, or are part of another inaccessible object, does the object become inaccessible. An inaccessible object is one from which the program cannot retrieve data and to which the program cannot write data. Garbage collection is the process of dynamic memory management that detects and recovers objects that are no longer accessible by the program and thus are candidates for deallocation and subsequent reallocation.
There are two basic techniques for determining whether an object is accessible. The reference-counting technique tracks whether an object is accessible by incrementing a pointer count in the object every time a new pointer is set to point to the object and decrementing the pointer count every time a pointer that points to the object is overwritten. An object whose pointer count reaches 0 is inaccessible. The reference-counting technique is expensive. In some implementations, it may take an extra 50 bytes of code for every pointer assignment to increment and decrement the pointer count. Also, it may be difficult to identify inaccessible objects in certain situations. For example, if object A contains a pointer to object B, object B contains a pointer to object A, and no other pointers to either object exist, then the pointer count for object A is 1 and for object B is 1. Objects A and B are effectively inaccessible, even though these pointer counts are not 0.
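The cyclic case described above can be sketched as follows. This is a minimal illustration, not any particular implementation: the `Node` layout, the `assign` helper, and `cycleExample` are hypothetical names, and deallocation on a zero count is omitted.

```cpp
#include <cassert>

// Hypothetical object layout: a pointer count plus one pointer field.
struct Node {
    int   count = 0;        // number of pointers currently set to this object
    Node* next  = nullptr;  // pointer field that may refer to another object
};

// Sets `slot` to `target` and adjusts both pointer counts. Under reference
// counting, every pointer assignment in the program carries this overhead.
void assign(Node*& slot, Node* target) {
    if (target) ++target->count;   // a new pointer now points to target
    if (slot) {
        assert(slot->count > 0);
        --slot->count;             // the old pointer value is overwritten
    }
    slot = target;
}

struct CycleCounts { int a; int b; };

// Builds the A <-> B cycle and returns the counts that remain after the
// program overwrites its own pointers to A and B.
CycleCounts cycleExample() {
    Node a, b;                 // storage for objects A and B (automatic
                               // here only so that the sketch does not leak)
    Node* rootA = nullptr;     // the program's own pointers
    Node* rootB = nullptr;
    assign(rootA, &a);
    assign(rootB, &b);
    assign(a.next, &b);        // A contains a pointer to B
    assign(b.next, &a);        // B contains a pointer to A
    assign(rootA, nullptr);    // the program overwrites its pointers
    assign(rootB, nullptr);
    return {a.count, b.count};
}
```

After the last two assignments the program has no way to reach either object, yet each count is still 1, so a collector that reclaims only objects with a count of 0 would never reclaim them.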
The other garbage collection technique for determining which objects are accessible does not track pointers to an object. Rather, when more heap memory is needed for allocation, the garbage collector identifies objects which are accessible by checking all the pointers in the program and marking each object to which a pointer points as accessible. When this marking is complete, all objects that are not marked as accessible are inaccessible and available for deallocation. Typically, the garbage collector checks every pointer in the memory (heap and stack). However, it may be very difficult to identify whether a certain memory location contains a pointer or some other value. For example, the following C++ declaration indicates that an object A may be a pointer to an object some of the time and may be an integer value at other times.
    union A {
        long value;
        object* reference;
    };
A garbage collector would not know whether object A contains a pointer or an integer unless the program set an indicator with each assignment to this object. This type of tracking is very expensive. Without this type of tracking, if object A contained an integer value that happened to contain the same bit pattern as a pointer to an object, then the garbage collector could not be certain whether it is really an integer or a pointer. A memory location that could be a pointer sometimes and not a pointer other times is referred to as a "may-be-pointer." A "conservative" approach to garbage collection does not track each assignment of a may-be-pointer. Rather, it assumes that object A is a pointer and treats the object to which it may point as being accessible. With this approach, the garbage collector may mark an object as accessible when it is really inaccessible.
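The tracking alternative mentioned above, in which the program sets an indicator with each assignment, might look like the following sketch. The names `Object`, `TrackedA`, `holdsPointer`, and `pointerOrNull` are hypothetical. The extra store on every assignment is the cost that a conservative collector avoids.

```cpp
#include <cassert>

struct Object { int payload = 0; };   // hypothetical heap object

// The union from the text, wrapped together with an indicator that records
// which member was assigned most recently.
struct TrackedA {
    union {
        long    value;
        Object* reference;
    };
    bool holdsPointer = false;

    // Each assignment must also update the indicator.
    void setValue(long v)          { value = v;       holdsPointer = false; }
    void setReference(Object* obj) { reference = obj; holdsPointer = true;  }
};

// What a collector with exact information could ask: the pointer currently
// held, or nullptr when the location holds an integer.
Object* pointerOrNull(const TrackedA& a) {
    return a.holdsPointer ? a.reference : nullptr;
}

// Usage sketch.
void trackedDemo() {
    Object target;
    TrackedA a;
    a.setValue(12);                       // integer, whatever its bit pattern
    assert(pointerOrNull(a) == nullptr);
    a.setReference(&target);              // now a genuine pointer
    assert(pointerOrNull(a) == &target);
}
```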
Conservative garbage collectors have problems compacting memory. During memory compaction, all accessible objects are typically moved to one end of the heap (the allocated objects) with free space (the deallocated objects) occupying the other end. When the garbage collector moves an object, it must update all pointers to that object to reflect its new location. However, the conservative garbage collector cannot change the value of a may-be-pointer because if the may-be-pointer really contained a non-pointer value, then the garbage collector would be introducing an error into the program. Consequently, conservative garbage collectors typically do not compact memory.
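A minimal sketch of compaction, under a simplified heap model assumed here for illustration, is given below. Objects are described by an address, a size, and an accessibility mark, and the heap is listed in ascending address order. The names `Obj`, `Forwarding`, `compact`, and `updatePointers` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical heap model: each object has an address, a size, and a mark.
struct Obj {
    std::size_t addr;
    std::size_t size;
    bool accessible;
};

using Forwarding = std::map<std::size_t, std::size_t>;  // old addr -> new addr

// Slides accessible objects toward address 0 and drops inaccessible ones.
// Assumes `heap` is sorted by address. Returns the forwarding table.
Forwarding compact(std::vector<Obj>& heap) {
    Forwarding forward;
    std::vector<Obj> kept;
    std::size_t next = 0;              // next free address at the low end
    for (const Obj& o : heap) {
        assert(o.addr >= next);        // objects only ever slide downward
        if (!o.accessible) continue;   // reclaimed: its space becomes free
        forward[o.addr] = next;
        kept.push_back(Obj{next, o.size, true});
        next += o.size;
    }
    heap = kept;
    return forward;
}

// Rewrites locations that are known to hold pointers. A may-be-pointer must
// not be passed here: if it really held an integer, the integer would be
// silently changed.
void updatePointers(std::vector<std::size_t>& pointers,
                    const Forwarding& forward) {
    for (std::size_t& p : pointers) {
        auto it = forward.find(p);
        if (it != forward.end()) p = it->second;
    }
}
```

For example, if an object moves from address 12 to address 4, a known pointer holding 12 is rewritten to 4. An integer that merely happens to equal 12 would also be rewritten if it were treated as a pointer, which is the error described above.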
Because garbage collection can be time-consuming, some garbage collectors use a technique called generational collecting. A generational garbage collector divides the objects in the heap into two categories: new objects (recently created) and old objects (not recently created). This technique uses the assumption that recently created objects generally become inaccessible quickly, while older objects typically stay accessible for a while. Thus, a generational garbage collector deallocates inaccessible new objects, but not inaccessible old objects. A generational garbage collector trades off completeness of the garbage collection (some inaccessible old objects are not reclaimed) for a faster garbage collection process.
FIGS. 1A and 1B show a sample stack and heap before and after garbage collection using a conservative approach. FIG. 1A shows stack 110 and heap 120 before garbage collection. Stack 110 contains five stack entries 111-115 with values of 4, 23, 12, 64, and 16. Heap 120 contains 8 objects 121-128 at locations 0, 4, 8, 12, 16, 20, 24, and 28 and free space 129 at location 32. Stack entry 111 is defined to be a pointer only; stack entries 112-114 may be a pointer or integer; and stack entry 115 is an integer. Stack entry 111 contains a pointer 118 to object 122. Stack entry 112 does not currently contain a pointer because its value, 23, does not correspond to the address of any object. Stack entry 113 may contain a pointer 116 (indicated by the dashed line) since there is an object at location 12. Stack entry 114 does not currently contain a pointer because its value, 64, would point into free space rather than to an object. Stack entry 115 is not a pointer, even though its value, 16, corresponds to the location of an object. Object 122 in heap 120 contains a field 119 that may be a pointer to object 123 or an integer. Objects 121, 125, 126, 127, and 128 are inaccessible because there are no pointers to these objects. Object 122 is accessible because pointer 118 points to it. Objects 123 and 124 may be accessible because stack entry 113 and field 119 may point to these objects.
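The marking that produces this result can be sketched as follows, using the stack values and object locations of FIG. 1A. The data model (`Kind`, `Slot`, `HeapObject`) and the function names are hypothetical. The value 8 for the may-be-pointer field in object 122 is an assumption inferred from the statement that the field may point to object 123, which is at location 8.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

enum class Kind { Pointer, MaybePointer, Integer };
struct Slot { Kind kind; long value; };           // a stack entry or a field
struct HeapObject { std::vector<Slot> fields; };  // keyed by address below

// Returns the addresses of all objects treated as accessible. A
// may-be-pointer whose value is the address of an object is assumed to be
// a pointer, which is the conservative choice.
std::set<long> conservativeMark(const std::vector<Slot>& stack,
                                const std::map<long, HeapObject>& heap) {
    std::set<long> marked;
    std::vector<long> work;
    auto visit = [&](const Slot& s) {
        if (s.kind == Kind::Integer) return;    // known not to be a pointer
        if (heap.count(s.value) == 0) return;   // not the address of an object
        if (marked.insert(s.value).second) work.push_back(s.value);
    };
    for (const Slot& s : stack) visit(s);
    while (!work.empty()) {                     // follow fields of marked objects
        long addr = work.back();
        work.pop_back();
        for (const Slot& f : heap.at(addr).fields) visit(f);
    }
    return marked;
}

// Builds the state of FIG. 1A and returns the marked addresses.
std::set<long> markFigure1A() {
    std::map<long, HeapObject> heap;
    for (long addr = 0; addr < 32; addr += 4) heap[addr] = HeapObject{};
    heap[4].fields.push_back({Kind::MaybePointer, 8});  // assumed field value
    std::vector<Slot> stack = {
        {Kind::Pointer, 4},        // stack entry 111
        {Kind::MaybePointer, 23},  // stack entry 112
        {Kind::MaybePointer, 12},  // stack entry 113
        {Kind::MaybePointer, 64},  // stack entry 114
        {Kind::Integer, 16}};      // stack entry 115
    return conservativeMark(stack, heap);
}

// Usage sketch: locations 4, 8, and 12 hold objects 122, 123, and 124.
void checkFigure1A() {
    std::set<long> expected = {4, 8, 12};
    assert(markFigure1A() == expected);
}
```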
FIG. 1B shows the stack 110 and heap 120 after garbage collection using a conservative approach. In this example, the conservative garbage collector knows whether a memory location contains a pointer, does not contain a pointer, or contains a may-be-pointer. However, it does not track whether a may-be-pointer currently contains a pointer. Heap 120 contains accessible object 122 and the objects that may be accessible, 123 and 124. The inaccessible objects 121 and 125-128 have been reclaimed (deallocated) into free spaces 130 and 131. Object 122 has been moved and its corresponding pointer, stack entry 111, has been modified to point to the new location. Objects 123 and 124 have not been moved because the garbage collection process could not be sure whether stack entry 113 or field 119 was an integer or a pointer.
FIGS. 2A and 2B show a sample stack and heap before and after garbage collection using a generational approach. FIG. 2A shows stack 210 and heap 220 before garbage collection using a generational approach. Stack 210 contains three stack entries 211, 213, 214 that are pointers. Heap 220 contains 8 objects 221-228 and free space 229. Pointer 240 is maintained by the garbage collector and points to the start of the new objects. Stack entry 211 points to object 222; stack entry 213 points to object 227; and stack entry 214 points to object 223. Object 223 contains a pointer 242 to object 224, and object 227 contains a pointer 244 to object 228. Objects 221 through 225 are old objects, and objects 226 through 228 are new objects. Objects 222, 223, 224, 227, and 228 are accessible, and objects 221, 225, and 226 are inaccessible.
FIG. 2B shows stack 210 and heap 220 after garbage collection using a generational approach. Since a generational approach was used, the inaccessible old objects 221 and 225 were not reclaimed (deallocated). Only object 226, which is new and inaccessible, was reclaimed. Stack entry 213 was updated to reflect the new location of object 227, and pointer 244 was updated to reflect the new location of object 228. Free space 230 reflects the reclamation of object 226. Pointer 240 was updated to point to the start of the new objects.
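The outcome of FIGS. 2A and 2B can be reproduced with the following sketch, which identifies objects by their reference numerals. The names `GenHeap`, `firstNew`, `reachable`, and `collectNewGeneration` are hypothetical. For simplicity the sketch traces the whole heap to find accessible objects and does not move objects or update pointers; a practical generational collector would avoid examining every old object.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical heap model keyed by the reference numerals of FIG. 2A.
struct GenHeap {
    std::map<int, std::vector<int>> objects;  // object -> objects it points to
    int firstNew = 0;                         // objects >= firstNew are new
};

// Marks every object accessible from the given root pointers.
std::set<int> reachable(const GenHeap& h, const std::vector<int>& roots) {
    std::set<int> live;
    std::vector<int> work(roots.begin(), roots.end());
    while (!work.empty()) {
        int id = work.back();
        work.pop_back();
        auto it = h.objects.find(id);
        if (it == h.objects.end()) continue;     // not an object
        if (!live.insert(id).second) continue;   // already marked
        for (int target : it->second) work.push_back(target);
    }
    return live;
}

// Reclaims only inaccessible NEW objects; inaccessible old objects stay.
// Returns the objects that were reclaimed.
std::vector<int> collectNewGeneration(GenHeap& h,
                                      const std::vector<int>& roots) {
    std::set<int> live = reachable(h, roots);
    std::vector<int> reclaimed;
    for (auto it = h.objects.lower_bound(h.firstNew); it != h.objects.end();) {
        if (live.count(it->first) == 0) {
            reclaimed.push_back(it->first);
            it = h.objects.erase(it);
        } else {
            ++it;
        }
    }
    return reclaimed;
}

// Builds the heap of FIG. 2A: objects 221-225 are old, 226-228 are new.
GenHeap figure2A() {
    GenHeap h;
    for (int id = 221; id <= 228; ++id) h.objects[id] = std::vector<int>{};
    h.objects[223].push_back(224);   // pointer 242
    h.objects[227].push_back(228);   // pointer 244
    h.firstNew = 226;                // corresponds to pointer 240
    return h;
}

// Usage sketch: only object 226 is reclaimed; 221 and 225 remain.
void checkFigure2() {
    GenHeap h = figure2A();
    std::vector<int> gone = collectNewGeneration(h, {222, 227, 223});
    assert(gone.size() == 1 && gone[0] == 226);
    assert(h.objects.count(221) == 1 && h.objects.count(225) == 1);
}
```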