Memory available for task execution is an important resource in computer systems. Typically, considerable time and energy are devoted to using such memory efficiently when running applications. An important aspect of such memory management is how memory is allocated to a task, and how it is deallocated and then reclaimed for use by other tasks. In general, the process that dynamically manages memory is referred to as the memory manager, and the memory that such memory manager manages is referred to as the heap. When a program needs a block of memory to store data, it sends a request to the memory manager for memory allocation. The memory manager then allocates a block of memory in its heap to satisfy such request, and sends a reference (e.g., a pointer) to the block of memory back to the program. The program can then access such block of memory via the reference.
As such, memory allocation/deallocation techniques have become critical in structured programming and object-oriented programming languages. Memory allocated from a heap can be employed to store information, wherein such information can be an instantiated object within an object-oriented paradigm. Many programming languages place responsibility for dynamic allocation and deallocation of memory on the programmer. Such programming languages are referred to as unmanaged or unsafe programming languages, because pointers can be employed anywhere in an object or routine. In the C, C++ and Pascal programming languages, memory is allocated from the heap by a procedure call, which passes a pointer to the allocated memory back to the caller. A call to free the memory is then available to deallocate the memory.
However, if a program overwrites the only pointer to a heap segment, then that heap segment becomes inaccessible to the program. An allocated heap segment can be pointed to by several pointers, located on the stack or in another allocated heap segment. When all of these pointers have been overwritten, the heap segment becomes inaccessible. A program cannot read data from or write data to an inaccessible heap segment. Such inaccessible heap segments are known as memory leaks.
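The leak scenario described above can be illustrated with a minimal sketch. The `ToyHeap` class, its `allocate` and `free` methods, and the use of integer addresses as pointers are all hypothetical stand-ins chosen for illustration; they model an explicit memory manager and are not the allocator of any particular language or system.

```python
class ToyHeap:
    """Hypothetical stand-in for a memory manager; not a real allocator."""

    def __init__(self):
        self._next_address = 0
        self.segments = {}  # address -> storage for each allocated segment

    def allocate(self, size):
        """Reserve a segment and return a reference to it (an integer address)."""
        address = self._next_address
        self._next_address += size
        self.segments[address] = bytearray(size)
        return address

    def free(self, address):
        """Explicitly return a segment to the heap."""
        del self.segments[address]


heap = ToyHeap()

p = heap.allocate(16)  # first segment, at address 0
q = p                  # a second pointer to the same segment

p = heap.allocate(32)  # p is overwritten; the first segment is still accessible via q
q = None               # the last pointer is overwritten; the first segment is now inaccessible

# The program holds no pointer to address 0, so it can never pass it to free().
program_pointers = {p, q}
leaked = [address for address in heap.segments if address not in program_pointers]
print(leaked)  # [0]
```

In this sketch the heap still records the first segment as allocated, but the program has no way to name it, which is the condition the text calls a memory leak.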
Furthermore, dynamically allocated storage may become unreachable if no reference, or pointer, to the storage remains in the set of root reference locations for a given computation. The “root set” is a set of node references such that the referenced nodes must be retained regardless of the state of the heap. A node is a memory segment allocated from the heap. Nodes are accessed through pointers. A node is reachable if the node is in the root set or is referenced by a reachable node. Conversely, storage associated with a memory object can be deallocated while still referenced. In this case, a dangling reference has been created. In most programming languages, heap allocation is required for data structures that survive the procedure that created them. If these data structures are passed to further procedures or functions, it may be difficult or impossible for the programmer or compiler to determine the point at which it is safe to deallocate them. Memory objects that are no longer reachable, but have not been freed, are called garbage.
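The definitions of reachability, garbage, and dangling references given above can be made concrete with a short sketch. The node names, the `references` table, and the `reachable_nodes` helper are assumptions introduced purely for illustration; the heap is modeled as a mapping from each node to the nodes it points to.

```python
def reachable_nodes(roots, references):
    """Return every node that is in the root set or referenced by a reachable node."""
    seen = set()
    pending = list(roots)
    while pending:
        node = pending.pop()
        if node in seen:
            continue
        seen.add(node)
        pending.extend(references.get(node, ()))
    return seen


# Hypothetical heap: node name -> names of the nodes it points to.
references = {
    "a": ["b"],
    "b": ["c"],
    "c": [],
    "d": ["c"],  # d points to c, but nothing points to d
    "e": ["e"],  # e is referenced only by itself
}
allocated = {"a", "b", "c", "d", "e"}
roots = {"a"}

live = reachable_nodes(roots, references)
garbage = allocated - live  # allocated, never freed, and no longer reachable
print(sorted(live))     # ['a', 'b', 'c']
print(sorted(garbage))  # ['d', 'e']

# Deallocating c while b still points to it leaves a dangling reference.
allocated.discard("c")
dangling = {
    (source, target)
    for source in live & allocated
    for target in references[source]
    if target not in allocated
}
print(sorted(dangling))  # [('b', 'c')]
```

Note that node e refers to itself and is still garbage: being referenced is not sufficient, since the referencing node must itself be reachable from the root set.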
The automatic identification and reclaiming of inaccessible heap segments is known as garbage collection. Garbage collection methodologies determine when a memory segment is no longer reachable by an executing program, either directly or through a chain of pointers. When a memory segment is no longer reachable, the memory segment can be reclaimed and reused even if it has not been explicitly deallocated by the program. Garbage collection is particularly attractive for managed or functional languages (e.g., Java, Prolog, Lisp, Smalltalk, Scheme). For example, the Java programming language has the characteristic that pointers can only reference objects (e.g., the head of an object). Thus, the garbage collection methodologies only need to identify object pointers during automatic reclamation of unreachable memory.
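One classic way to realize such reclamation is a mark-and-sweep collector, sketched below. This technique is named here only as a well-known example; the text above does not prescribe any particular collection algorithm. The `mark` and `sweep` functions and the dictionary-based heap are hypothetical and assume every pointer in a segment can be identified exactly.

```python
def mark(roots, heap):
    """Mark phase: follow pointers from the roots and record every segment reached."""
    marked = set()
    stack = list(roots)
    while stack:
        segment = stack.pop()
        if segment in heap and segment not in marked:
            marked.add(segment)
            stack.extend(heap[segment])  # pointers stored inside this segment
    return marked


def sweep(heap, marked):
    """Sweep phase: reclaim every allocated segment that was not marked."""
    reclaimed = [segment for segment in heap if segment not in marked]
    for segment in reclaimed:
        del heap[segment]
    return reclaimed


# Hypothetical heap: segment id -> ids of the segments it points to.
heap = {
    1: [2],
    2: [],
    3: [4],  # segments 3 and 4 point to each other,
    4: [3],  # but neither is reachable from the roots
}
roots = [1]

reclaimed = sweep(heap, mark(roots, heap))
print(reclaimed)     # [3, 4]
print(sorted(heap))  # [1, 2]
```

Segments 3 and 4 are reclaimed without the program ever deallocating them explicitly, and the cycle between them does not keep them alive, because liveness is decided by reachability from the roots rather than by whether any pointer to a segment exists.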
Put differently, garbage collectors automatically reclaim dynamically allocated objects that will not be accessed again by the program. As such, garbage collection is widely acknowledged for supporting fast development of reliable and secure software, and it has been incorporated into modern languages such as Java and C#. Nonetheless, garbage collectors are notoriously hard to verify, due to their low-level interaction with the underlying system and the general difficulty of reasoning about reachability in the associated object graphs. Such verification can reduce the trusted computing base of a system and increase the system's reliability. For example, the verification can be important for secure systems based on proof-carrying code (PCC) or typed assembly language (TAL); typical large-scale PCC/TAL systems can verify the safety of the mutator (the program), but not of the run-time system that manages memory and other resources on the mutator's behalf. This limitation prevents untrusted programs from customizing the run-time system. Furthermore, bugs in unverified run-time systems can result in security vulnerabilities that undermine the guarantees promised by PCC and TAL.