Computer systems may manage computer memory dynamically. Dynamic memory management refers to the process by which blocks of memory are allocated temporarily for a specific purpose and then deallocated when no longer needed for that purpose. Deallocated blocks are available for reallocation for another purpose. The process that dynamically manages the memory is referred to as the memory manager. The memory that the memory manager manages is referred to as a "heap." When an application program needs a block of memory to store data, the program sends a request to the memory manager. The memory manager allocates a block of memory in the heap to satisfy the request and sends a pointer to the block of memory to the program. The program can then access the block of memory through the pointer.
In the case of programs written in certain languages, such as C++, blocks of memory can be allocated automatically or dynamically. Automatic memory blocks are automatically allocated when a procedure is entered and automatically deallocated when the procedure is exited. Conversely, dynamic memory blocks are allocated by an explicit call to the memory manager and deallocated by either an explicit call to the memory manager or automatically through a memory reclamation technique known as garbage collection. Typically, automatically allocated memory blocks are stored in a stack and dynamically allocated memory blocks are stored in a heap.
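The distinction between automatic and dynamic memory blocks can be illustrated with a minimal C++ sketch. The `Record` type and the function names below are hypothetical and chosen only for illustration. The sketch assumes the standard `new` and `delete` operators act as the interface to the memory manager.

```cpp
#include <cassert>

// Hypothetical record type used only for illustration.
struct Record {
    int id;
    double value;
};

// Automatic allocation: `local` is created on the stack when the
// procedure is entered and released when the procedure is exited.
int use_automatic_block() {
    Record local{1, 2.5};
    return local.id;
}

// Dynamic allocation: the block is requested from the memory manager
// with `new` and must be handed back with an explicit `delete`.
int use_dynamic_block() {
    Record* p = new Record{2, 3.5};  // the memory manager returns a pointer
    int id = p->id;                  // the block is accessed through the pointer
    delete p;                        // explicit deallocation
    return id;
}

// Exercises both allocation styles.
void run_allocation_examples() {
    assert(use_automatic_block() == 1);
    assert(use_dynamic_block() == 2);
}
```

If the `delete` in `use_dynamic_block` were omitted, the block would remain allocated after the only pointer to it went out of scope.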
A program can only access a dynamic memory block through a pointer. A pointer is a memory location that contains the address of an allocated (or used) heap segment. If the program overwrites a pointer, then the corresponding heap segment becomes "inaccessible" to the program. An allocated heap segment may be pointed to by several pointers, located on the stack or in another allocated heap segment. Only when all the pointers are overwritten, or are part of another inaccessible heap segment, does the heap segment become inaccessible. A program cannot retrieve data from or write data to an inaccessible heap segment. These inaccessible allocated heap segments are known as memory leaks.
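The way a heap segment becomes inaccessible can be shown with a short C++ sketch. The `Segment` type and its `live` counter are hypothetical instrumentation added only to make the leak observable; an ordinary program would have no such counter. The sketch leaks one segment on purpose.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently allocated.
struct Segment {
    static int live;
    int payload;
    explicit Segment(int v) : payload(v) { ++live; }
    ~Segment() { --live; }
};
int Segment::live = 0;

// Returns the number of segments still allocated when the function ends.
int leak_one_segment() {
    Segment* a = new Segment(7);
    Segment* b = a;                   // a second pointer to the same segment
    a = new Segment(8);               // overwrite `a`; the first segment is
                                      // still reachable through `b`
    int still_readable = b->payload;  // reads 7 through the remaining pointer
    assert(still_readable == 7);
    b = a;                            // overwrite the last pointer: the first
                                      // segment is now inaccessible (leaked)
    delete a;                         // frees only the second segment
    return Segment::live;             // the leaked segment is still allocated
}
```

On the first call the function returns 1: the first segment remains allocated, yet no pointer to it survives, so the program can neither use it nor free it.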
For efficient memory usage, inaccessible heap segments must be "reclaimed" so that they can be reallocated. The identification and reclaiming of inaccessible heap segments is known as garbage collection. A system or method for performing garbage collection is referred to as a garbage collector. The first garbage collector developed was the reference counting algorithm. It is based on counting the number of active references to dynamically allocated objects to determine which objects are inaccessible. The mark/sweep algorithm was developed at about the same time. Mark/sweep is a tracing algorithm and relies on global traversal of all allocated objects to determine the inaccessible objects.
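The mark/sweep approach described above can be sketched on a toy heap. This is a simplified model under stated assumptions, not the implementation of any particular collector: objects are entries in a vector, references are vector indices, and the names `Object`, `mark`, `mark_sweep`, and `demo_mark_sweep` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap object: `refs` holds the indices of the objects it points to.
struct Object {
    std::vector<std::size_t> refs;
    bool allocated = true;
    bool marked = false;
};

// Mark phase: flag every allocated object reachable from `start`.
void mark(std::vector<Object>& heap, std::size_t start) {
    std::vector<std::size_t> work{start};
    while (!work.empty()) {
        std::size_t i = work.back();
        work.pop_back();
        if (!heap[i].allocated || heap[i].marked) continue;
        heap[i].marked = true;
        for (std::size_t r : heap[i].refs) work.push_back(r);
    }
}

// Sweep phase: reclaim every allocated object left unmarked and clear
// the marks for the next cycle. Returns the number of objects reclaimed.
std::size_t mark_sweep(std::vector<Object>& heap,
                       const std::vector<std::size_t>& roots) {
    for (std::size_t r : roots) mark(heap, r);
    std::size_t reclaimed = 0;
    for (Object& o : heap) {
        if (o.allocated && !o.marked) {
            o.allocated = false;
            o.refs.clear();
            ++reclaimed;
        }
        o.marked = false;
    }
    return reclaimed;
}

// Object 0 points to object 1; objects 2 and 3 point to each other but
// are not reachable from the single root (object 0).
std::size_t demo_mark_sweep() {
    std::vector<Object> heap(4);
    heap[0].refs.push_back(1);
    heap[2].refs.push_back(3);
    heap[3].refs.push_back(2);
    const std::vector<std::size_t> roots(1, 0);  // one root: index 0
    std::size_t reclaimed = mark_sweep(heap, roots);
    assert(heap[0].allocated && heap[1].allocated);
    return reclaimed;
}
```

In this example objects 2 and 3 are reclaimed. Because they reference each other, each would retain a nonzero count under plain reference counting, which is one reason a tracing traversal from the roots is used.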
Another garbage collector is the copying algorithm. This collector divides the heap equally into two parts. One part contains current data and the other old data. The algorithm traverses all active objects in the current data part and copies them to the other part, leaving the inactive objects uncopied. After all active objects have been traced, the roles of the two parts are swapped. Generational and incremental garbage collectors are designed to improve the performance of garbage collection, making it feasible in real-time applications. The idea behind generational garbage collection is that most objects die young. The algorithm segregates objects by age into two or more generations of heap, and concentrates effort on reclaiming those objects most likely to be garbage, i.e., young objects. Incremental algorithms decrease the length of garbage collection pauses by interleaving small amounts of collection with the real-time application program's execution. Many different embodiments of these collectors can be found in the prior art. However, implementing these collectors requires recompiling or relinking the application. Thus, there is a need for a system and method for dynamic memory reclamation without recompiling or relinking the application program.
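The copying scheme described above can be sketched as follows. This is a toy model under stated assumptions: the two parts of the heap are represented by two vectors rather than two equal halves of one address range, each cell holds at most one reference, and the names `Cell`, `copy_cell`, `collect`, and `demo_copying` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t kNone = static_cast<std::size_t>(-1);

struct Cell {
    int value = 0;
    std::size_t next = kNone;     // index of the referenced cell, or kNone
    std::size_t forward = kNone;  // new index once the cell has been copied
};

// Copies cell `i` into the other part unless it was already copied,
// and returns its index there.
std::size_t copy_cell(std::vector<Cell>& from, std::vector<Cell>& to,
                      std::size_t i) {
    if (i == kNone) return kNone;
    if (from[i].forward != kNone) return from[i].forward;
    Cell c;
    c.value = from[i].value;
    c.next = from[i].next;  // still an index into the old part; fixed later
    std::size_t j = to.size();
    to.push_back(c);
    from[i].forward = j;
    return j;
}

// Copies every cell reachable from `roots`, updates the roots, and swaps
// the roles of the two parts. Unreached cells are simply left behind.
void collect(std::vector<Cell>& space, std::vector<std::size_t>& roots) {
    std::vector<Cell> to;
    for (std::size_t& r : roots) r = copy_cell(space, to, r);
    // Scan the copied cells, copying what they reference and rewriting
    // each reference to point into the new part.
    for (std::size_t scan = 0; scan < to.size(); ++scan) {
        std::size_t fixed = copy_cell(space, to, to[scan].next);
        to[scan].next = fixed;
    }
    space.swap(to);
}

// Cell 0 references cell 2; cells 1 and 3 are unreachable from the root.
std::size_t demo_copying() {
    std::vector<Cell> space(4);
    space[0].value = 10; space[0].next = 2;
    space[1].value = 20;
    space[2].value = 30;
    space[3].value = 40; space[3].next = 1;
    std::vector<std::size_t> roots(1, 0);
    collect(space, roots);
    assert(space[roots[0]].value == 10);
    assert(space[space[roots[0]].next].value == 30);
    return space.size();  // number of surviving cells
}
```

Only the two reachable cells survive the collection. The cost of this sketch is proportional to the number of active cells, since inactive cells are never visited.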
Servers in large-scale information retrieval systems cannot tolerate even small memory leaks for long because of their massive input/output demands. The effect of memory leaks is greatly magnified because information retrieval systems require extensive caching of large volumes of data. A small memory leak per transaction could exhaust the available memory in a matter of days. Many existing information retrieval systems are implemented in C or C++. These languages do not automatically reclaim dynamically allocated memory, and some level of memory leakage inevitably goes undetected despite high-quality, meticulous programming. Thus, there is a need for a system and method for reclaiming memory leaks without programmer assistance.
Many tools for performing memory reclamation are available in the prior art. These tools require at least one of the following: compiler assistance, source code instrumentation, add-on utilities that must be linked with the application, and replacement of elements in the language memory manager. Compiler assistance and source code instrumentation can be very effective in garbage collectors; however, the source code and/or a build environment is often not available for a particular application. Relinking applications to include add-on utilities may change the execution of an application, masking previously manifest defects or introducing new ones. Thus, there is a need for a system and method for reclaiming memory leaks without recompiling or relinking an application program.