Memory available for task execution is one of the most important resources in a computer system. Therefore, much time and energy have been directed to the efficient utilization and management of memory. An important aspect of memory management is the manner in which memory is allocated to a task, deallocated, and then reclaimed for use by other tasks. The process that dynamically manages the memory is referred to as the memory manager. The memory that the memory manager manages is referred to as the heap. When a program needs a block of memory to store data, the program sends a request to the memory manager for memory. The memory manager then allocates a block of memory in the heap to satisfy the request and sends a reference (e.g., a pointer) to the block of memory to the program. The program can then access the block of memory through the reference.
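The request, allocate, and reference cycle described above can be illustrated with a minimal sketch. It assumes a fixed-size heap, a first-fit placement policy, and references represented as byte offsets; the class name MemoryManager and its methods are hypothetical and are not taken from any particular system.

```python
class MemoryManager:
    """Toy first-fit memory manager over a fixed-size heap (illustrative only)."""

    def __init__(self, heap_size):
        self.heap = bytearray(heap_size)    # the memory being managed
        self.free_list = [(0, heap_size)]   # (offset, size) of each free region
        self.allocated = {}                 # offset -> size of each live block

    def allocate(self, size):
        """Return a reference (a heap offset) to a block of `size` bytes."""
        for i, (offset, free_size) in enumerate(self.free_list):
            if free_size >= size:
                # Carve the block from the front of the first region that fits.
                if free_size == size:
                    del self.free_list[i]
                else:
                    self.free_list[i] = (offset + size, free_size - size)
                self.allocated[offset] = size
                return offset
        raise MemoryError("no free region is large enough")

    def free(self, ref):
        """Deallocate a block so that later requests can reuse it."""
        size = self.allocated.pop(ref)
        self.free_list.append((ref, size))
        self.free_list.sort()  # keep regions in address order (no coalescing)

    def write(self, ref, data):
        """Store `data` in the block identified by `ref`."""
        if len(data) > self.allocated[ref]:
            raise ValueError("data does not fit in the block")
        self.heap[ref:ref + len(data)] = data

    def read(self, ref, length):
        """Return `length` bytes from the block identified by `ref`."""
        if length > self.allocated[ref]:
            raise ValueError("read extends past the end of the block")
        return bytes(self.heap[ref:ref + length])
```

A program would call allocate to obtain a reference, use write and read to access the block through that reference, and call free when the block is no longer needed. Adjacent free regions are not merged in this sketch, which a practical memory manager would normally do.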
Memory allocation and deallocation techniques have become very important in structured programming and object-oriented programming languages. Memory allocated from a heap can be used to store information. Often this information is an instantiated object within an object-oriented paradigm. Conventionally, many programming languages have placed the responsibility for dynamic allocation and deallocation of memory on the programmer. These programming language types are referred to as unmanaged or unsafe programming languages, because pointers can be employed anywhere in an object or routine. In the C, C++ and Pascal programming languages, memory is allocated from the heap by a call to a procedure, which passes a pointer to the allocated memory back to the calling procedure. A call to free the memory is then available to deallocate the memory. However, if a program overwrites a pointer, then the corresponding heap segment may become inaccessible to the program. An allocated heap segment may be pointed to by several pointers, located on the stack or in another allocated heap segment. When all of these pointers have been overwritten, the heap segment becomes inaccessible. A program cannot retrieve data from or write data to an inaccessible heap segment. These inaccessible heap segments are known as memory leaks.
Furthermore, dynamically allocated storage may become unreachable if no reference, or pointer, to the storage remains in the set of root reference locations for a given computation. The “root set” is a set of node references such that the referenced node must be retained regardless of the state of the heap. A node is a memory segment allocated from the heap. Nodes are accessed through pointers. A node is reachable if the node is in the root set or is referenced by a reachable node. Conversely, storage associated with a memory object can be deallocated while it is still referenced. In this case, a dangling reference has been created. In most programming languages, heap allocations are required for data structures that survive the procedure that created them. If these data structures are passed to further procedures or functions, it may be difficult or impossible for the programmer or compiler to determine the point at which it is safe to deallocate them. Memory objects that are no longer reachable, but have not been freed, are called garbage.
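The definitions of reachability, garbage, and dangling references given above can be expressed as a short sketch. It assumes the heap is modeled as a dictionary mapping each node identifier to the list of node identifiers it points to; the function names reachable_nodes, garbage_nodes, and dangling_references are hypothetical.

```python
def reachable_nodes(heap, roots):
    """Return the nodes that are in the root set or referenced by a reachable node.

    `heap` maps each allocated node id to the list of node ids it points to;
    `roots` is the root set of node ids.
    """
    marked = set()
    worklist = list(roots)
    while worklist:
        node = worklist.pop()
        if node in marked or node not in heap:
            continue  # already visited, or a reference to deallocated storage
        marked.add(node)
        worklist.extend(heap[node])
    return marked


def garbage_nodes(heap, roots):
    """Return the nodes that are still allocated but no longer reachable."""
    return set(heap) - reachable_nodes(heap, roots)


def dangling_references(heap):
    """Return (source, target) pairs whose target has already been deallocated."""
    return {(src, dst) for src, refs in heap.items() for dst in refs if dst not in heap}
```

For example, with heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]} and a root set containing only "a", nodes "c" and "d" are garbage even though they reference each other. Overwriting the pointer held by "a" (heap["a"] = []) makes "b" garbage as well, and deleting "d" from the heap while "c" still points to it creates a dangling reference.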
Due to the above difficulties with reclamation of heap-allocated storage, automatic reclamation is an attractive alternative for dynamic memory management. The automatic identification and reclaiming of inaccessible heap segments is known as garbage collection. Garbage collection methodologies determine when a memory segment is no longer reachable by an executing program, either directly or through a chain of pointers. When a memory segment is no longer reachable, the memory segment can be reclaimed and reused even if it has not been explicitly deallocated by the program. Garbage collection is particularly attractive for managed or functional languages (e.g., JAVA, Prolog, Lisp, Smalltalk, Scheme). In some circumstances, managed data structures need to be passed to unmanaged code (e.g., a file read API provided by the operating system). In these situations, the unmanaged code is unaware of the managed constraints. In particular, a garbage collector may relocate live objects, for example to compact the heap, which invalidates any raw address that the unmanaged code holds. Therefore, there must be some mechanism in place to ensure that the managed data structures are not moved by the garbage collector while unmanaged code is manipulating memory managed by the garbage collector.
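The hazard described above can be illustrated with a sketch of a collector that moves objects. It assumes a sliding compactor in which managed code reaches objects through a handle table that the collector updates, while unmanaged code holds only a raw address; the class CompactingHeap and its methods are hypothetical and greatly simplified.

```python
class CompactingHeap:
    """Toy heap whose collector slides live objects toward address 0."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.top = 0            # next free address (bump allocation)
        self.table = {}         # handle -> (address, length)
        self.next_handle = 0

    def allocate(self, data):
        """Copy `data` into the heap and return a handle to the new object."""
        if self.top + len(data) > len(self.memory):
            raise MemoryError("heap exhausted")
        address = self.top
        self.memory[address:address + len(data)] = data
        self.top += len(data)
        handle = self.next_handle
        self.next_handle += 1
        self.table[handle] = (address, len(data))
        return handle

    def address_of(self, handle):
        """Return the object's current raw address (what unmanaged code would hold)."""
        return self.table[handle][0]

    def read(self, handle):
        """Read an object through its handle, as managed code would."""
        address, length = self.table[handle]
        return bytes(self.memory[address:address + length])

    def collect(self, live_handles):
        """Reclaim every object not in `live_handles` and compact the survivors."""
        new_top = 0
        new_table = {}
        # Visit survivors in address order so a move never overwrites a survivor.
        for handle in sorted(live_handles, key=self.address_of):
            address, length = self.table[handle]
            self.memory[new_top:new_top + length] = self.memory[address:address + length]
            new_table[handle] = (new_top, length)
            new_top += length
        # Clear the reclaimed tail of the heap.
        self.memory[new_top:self.top] = bytes(self.top - new_top)
        self.table = new_table
        self.top = new_top
```

If unmanaged code records address_of(handle) before a collection, the same handle still resolves correctly afterward, but the recorded raw address may now refer to cleared or reused memory. This is the situation that a mechanism preventing movement is intended to avoid.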
One conventional mechanism consists of copying the managed data structure into unmanaged, unmovable memory. The call to unmanaged code is made, and then the unmanaged memory is copied back into the managed data structure. This mechanism is inefficient because the managed data structure must be copied back and forth on every call. Another mechanism is to prevent the garbage collector from running while the unmanaged call is in progress. This mechanism does not work well in multithreaded environments. Another conventional mechanism is to allocate unmovable managed objects with a special API. However, the creation of these objects is generally slower than the creation of other managed objects, and it is difficult for the developer to know in advance which objects will be passed to an unmanaged API.
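The first conventional mechanism, copying into unmovable memory and back, can be sketched as follows. The sketch assumes the managed data structure is a byte buffer and uses an ordinary function as a stand-in for the unmanaged API; the names call_with_copy and unmanaged_read are hypothetical.

```python
def unmanaged_read(buffer):
    """Stand-in for an unmanaged API (e.g., a file read) that fills a fixed buffer."""
    payload = b"data from file"
    count = min(len(payload), len(buffer))
    buffer[:count] = payload[:count]
    return count


def call_with_copy(managed, unmanaged_call):
    """Invoke `unmanaged_call` on a private copy of `managed`, then copy the result back."""
    # Copy 1: managed data into a buffer that the collector never moves.
    fixed = bytearray(managed)
    # The unmanaged code only ever sees the fixed buffer.
    result = unmanaged_call(fixed)
    # Copy 2: results back into the managed data structure.
    managed[:] = fixed
    return result
```

Each call in this sketch copies the full buffer twice, regardless of how many bytes the unmanaged code actually touches, which reflects the inefficiency noted above.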