The present invention relates to methods and systems for the automatic reclamation of allocated memory, normally referred to as xe2x80x9cgarbage collection.xe2x80x9d More particularly, the present invention relates to garbage collection that utilizes a technique known as xe2x80x9cmarkingxe2x80x9d of memory portions or objects.
During execution of a computer program or application, memory is allocated to the program to enable various operations. Once allocated, other executing processes cannot use the memory allocated to the program, i.e., memory use is exclusive. When the program has finished using a portion or object of allocated memory, it may explicitly deallocate or release that memory. For example, the program may xe2x80x9cdispose,xe2x80x9d xe2x80x9ckill,xe2x80x9d or xe2x80x9cfreexe2x80x9d the memory via a specific programming code function. Once freed, other applications or processes can then use the memory.
Often however, and for various reasons, the program to which the memory was initially allocated does not release objects of the memory, even after the program is finished using that memory. Moreover, when a program is done using a memory block or object, references to that object no longer exist. Without such a reference, that memory object cannot be reached by the application. Consequently, once the program has finished using the memory, that memory cannot be reused until it is freed by the program or reclaimed by another memory management program and then reallocated to either the initial application or process or to another application or process. Thus, allocated yet unreachable objects are unusable, and are considered to be xe2x80x9cdeadxe2x80x9d or xe2x80x9cgarbage.xe2x80x9d
Typically, the processes of allocating memory objects to applications and reclaiming dead memory objects, i.e., garbage collection, are performed by a memory manager. The memory manager tracks which memory objects are currently allocated to the computer program and which objects are currently available to be allocated through a xe2x80x9cfree listxe2x80x9d that keeps track of available objects. Thus, memory objects that are not on the free list are currently allocated to a computer program or application. Other memory managers do not use a free list, and instead allocate memory using an increasing memory address scheme. It is important to not only reclaim dead memory objects, but also compact the live objects to prevent memory fragmentation.
Consequently, the memory manager determines, based on predetermined thresholds, when to perform garbage collection. A common predetermined threshold is based on the number of allocated objects or objects of memory. That is, the memory manager maintains a running total of the number of allocated objects and once the number of objects exceeds a predetermined threshold number, the memory manager initiates garbage collection. Following the garbage collection, the running total of allocated objects is reduced by the number of reclaimed objects of memory or reset to zero, depending on the particular algorithm used. Once the running total exceeds the predetermined threshold again, the memory manager simply reinitiates the garbage collection process.
There are several common techniques for garbage collection. One technique for garbage collection relates to logically dividing the memory in half and using only one half of the memory at a time. During garbage collection, the memory manager copies the reachable objects to the other half of memory. Following the copying of all marked objects, the memory manager frees the entire first half of memory. Consequently, the previously used memory, which contained both the reachable objects and the garbage objects, becomes free memory and therefore contains only objects that are available to be allocated.
Another common garbage collection technique relates to the mark-and-sweep method, which marks all reachable objects of memory and then sweeps the entire heap to reclaim all unmarked objects of memory by adding the unmarked objects to the free list. When the sweep phase is complete, all garbage objects are now on the free list and available to be reallocated to the computer program.
In some situations, the mark-and-sweep may be preferred over the copying system because it is faster than copying garbage collection since the copying of objects and the updating references simply takes more time than adding a garbage block to a free list. Moreover, the mark-and-sweep technique is often more efficient since it uses the whole memory, as opposed to only half, as used in the copying technique. However, in other situations, the copying technique may be preferred due to the lack of fragmentation. In yet another technique, the mark and sweep technique is improved by adding a defragmentation or compaction step at the end of the sweep or reclamation process.
Each of these techniques, however, suffers a particular drawback based on the time it takes to mark each reachable block. During this marking phase, the system must stop or pause the executing applications processes, and such pauses significantly decrease the performance of the applications. That is, since any executing processes may modify the allocation of memory objects, the memory manager might miss reachable objects if the garbage collection process is running concurrently with other processes. Therefore, the application and other process are stopped for a considerable amount of time while the system checks all the allocated objects, and marks those objects as live.
The actual time it takes to mark the allocated objects varies and is dependent on the number of objects that are reachable. Since many applications may be running at one time and since many applications use a substantial amount of memory, typically the number of memory blocks or objects that are reachable is quite extensive. Consequently, the marking phase consumes a significant amount of time. Unfortunately, during this time period, the applications remain halted, preventing any further processing or interaction with the user.
One solution to this problem has been to perform the marking process as a separate thread running concurrently with the execution of programs and processes while implementing what is known as a xe2x80x9cwrite barrier.xe2x80x9d Using write barriers involved an additional process step performed by any running application or process that modified a memory object of marking each object as modified. In essence, the application was required to mark each object changed by the application as the modification occurred. Thus the garbage collector, during its marking phase, could go back and determine which objects to revisit based on the fact that any modified objects were marked as such by the application. Unfortunately, the additional marking steps performed by the application significantly decreased the performance of the program.
It is with respect to these considerations and others that the present invention has been made.
The present invention relates to a system and method of garbage collection that marks reachable objects concurrently with the processing of an application and operates in combination with a write-watch system that accumulates or logs which memory objects are modified during the marking phase. Using the accumulated modifications, the present invention revisits and marks modified objects while pausing the application for only a brief amount of time. The brief time that the application is stopped or paused is less than the amount of time it would have been paused, had the application been paused for the entire marking phase, yet the execution performance of the application does not suffer since it is not responsible for marking modified objects.
In accordance with certain aspects, the present invention relates to a system and method of marking memory objects for collecting garbage in a computer system by getting program root information, wherein the program root information has object references to memory objects. Next, the invention accumulates modification information using a write watch module wherein the modification information relates to memory objects that were modified following the getting of program root information. Meanwhile, the system and method marks reachable memory objects using the program root information. Following the marking operation, the system and method revisits modified memory objects identified by the accumulated modification information and marks the revisited memory object.
In accordance with other aspects, the system and method repeats the act of revisiting modified memory objects using updated information from the write watch module a predetermined number of times. In another embodiment the information received from the write watch module is tested against a predetermined threshold to determine whether the revisiting process is repeated. The predetermined threshold may be related to the number of allocated objects or, more specifically, a percentage based on the total number of allocated objects of memory.
In accordance with yet other aspects, the system and method uses a tricolor marking system where unvisited memory objects are indicated using a first color, such as white, visited objects are indicated using a second color, such as black, and visited but modified objects are indicated using a third color, such as gray. The modified visited objects are objects that have been modified after being visited a first time. In other embodiments, other marking schemes are used.
In an embodiment, the system and method of the present invention relates to collecting garbage for an executing application by first pausing the executing application. Once paused, a snapshot of a root stack is taken, i.e., by copying object references from relevant stacks to a temporary storage or by setting up markers on the stacks so the stacks can be scanned while active. Next, a write watch module to keep track of modifications in the memory is initiated and then the application is restarted. While the application is running, a first marking act of marking all reachable memory objects using snapshot information is performed. The application is paused after the marking act is complete. Next, a second marking act of marking all reachable memory objects is performed wherein the reachable objects marked by the second marking are identified by the write watch module. Following the second marking act, all unmarked garbage objects are collected and then the application is restarted.
The collecting act, in an embodiment relates to sweeping the memory and adding references to the unmarked memory to a free list. In another embodiment, the collecting act relates to copying all marked objects to a different memory portion. In yet another embodiment the collection act relates to planning the relocation of the marked memory objects, updating object references located within other memory objects, compacting the memory objects to relocate the memory objects, and updating a reference pointer related to an end of the compacted memory.
The invention may be implemented as a computer process, a computing system or as an article of manufacture such as a computer program product. The computer program product may be a computer storage medium readable by a computer system and encoding a computer program of instructions for executing a computer process. The computer program product may also be a propagated signal on a carrier readable by a computing system and encoding a computer program of instructions for executing a computer process.
A more complete appreciation of the present invention and its improvements can be obtained by reference to the accompanying drawings, which are briefly summarized below, to the following detail description of presently preferred embodiments of the invention, and to the appended claims.