The present invention relates to the management of memory in computer systems, and more particularly to a system and method for automatic management of memory employing a garbage collector.
Memory available for task execution is one of the most important resources in a computer system. Therefore, much time and energy have been directed to efficient utilization and management of memory. An important aspect of memory management is the manner in which memory is allocated to a task, deallocated and then reclaimed for use by other tasks. The process that dynamically manages the memory is referred to as the memory manager. The memory that the memory manager manages is referred to as the heap. When a program needs a block of memory to store data, the program sends a request to the memory manager for memory. The memory manager then allocates a block of memory in the heap to satisfy the request and sends a reference (e.g., a pointer) to the block of memory to the program. The program can then access the block of memory through the reference.
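The request, allocation and reference hand-off described above can be illustrated with a minimal sketch. The class name `MemoryManager`, its first-fit policy and the use of an integer offset as the reference are assumptions made for illustration only and are not taken from the invention; the sketch also omits coalescing of adjacent free extents.

```python
class MemoryManager:
    """Toy first-fit memory manager over a fixed-size heap (illustrative only)."""

    def __init__(self, heap_size):
        self.heap = bytearray(heap_size)
        # Free extents as (start, size) pairs, kept sorted by start address.
        self.free = [(0, heap_size)]
        # Allocated blocks: start address -> size.
        self.allocated = {}

    def allocate(self, size):
        """Return the start address (the reference) of a block, or None if no extent fits."""
        for i, (start, extent) in enumerate(self.free):
            if extent >= size:
                if extent == size:
                    del self.free[i]
                else:
                    # Carve the block off the front of the free extent.
                    self.free[i] = (start + size, extent - size)
                self.allocated[start] = size
                return start
        return None

    def deallocate(self, ref):
        """Return the block at ref to the free list so that it can be reused."""
        size = self.allocated.pop(ref)
        self.free.append((ref, size))
        self.free.sort()


manager = MemoryManager(heap_size=64)
ref = manager.allocate(16)               # the program requests a block
manager.heap[ref:ref + 16] = b"x" * 16   # the program accesses it through the reference
manager.deallocate(ref)                  # the block is reclaimed for other use
```

In this sketch the program, not the memory manager, decides when to call `deallocate`, which is the arrangement the following paragraphs identify as a source of leaks and dangling references.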
Memory allocation and deallocation techniques have become very important in structured programming and object-oriented programming languages. Memory allocated from a heap can be used to store information. Often this information is an instantiated object within an object-oriented paradigm. Conventionally, many programming languages have placed the responsibility for dynamic allocation and deallocation of memory on the programmer. These programming language types are referred to as unmanaged or unsafe programming languages, because pointers can be employed anywhere in an object or routine. In the C, C++ and Pascal programming languages, memory is allocated from the heap by a call to a procedure, which passes a pointer to the allocated memory back to the calling procedure. A call to free the memory is then available to deallocate the memory. However, if a program overwrites a pointer, then the corresponding heap segment may become inaccessible to the program. An allocated heap segment may be pointed to by several pointers, located on the stack or in another allocated heap segment. When all of these pointers have been overwritten, the heap segment becomes inaccessible. A program cannot retrieve data from or write data to an inaccessible heap segment. These inaccessible heap segments are known as memory leaks.
Furthermore, dynamically allocated storage may become unreachable if no reference, or pointer, to the storage remains in the set of root reference locations for a given computation. The "root set" is a set of node references such that the referenced node must be retained regardless of the state of the heap. A node is a memory segment allocated from the heap. Nodes are accessed through pointers. A node is reachable if the node is in the root set or is referenced by a reachable node. Conversely, storage associated with a memory object can be deallocated while still referenced. In this case, a dangling reference has been created. In most programming languages, heap allocation is required for data structures that survive the procedure that created them. If these data structures are passed to further procedures or functions, it may be difficult or impossible for the programmer or compiler to determine the point at which it is safe to deallocate them. Memory objects that are no longer reachable but have not been freed are called garbage.
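The definition of reachability given above is recursive, and can be sketched as a simple worklist traversal. The function name `reachable_nodes`, the node labels and the dictionary representation of the heap are hypothetical and serve only to illustrate the definition.

```python
def reachable_nodes(root_set, references):
    """Return the set of nodes reachable from root_set.

    references maps each node to the list of nodes it points to.
    """
    reachable = set()
    worklist = list(root_set)
    while worklist:
        node = worklist.pop()
        if node in reachable:
            continue
        # A node is reachable if it is in the root set or is
        # referenced by a node already found to be reachable.
        reachable.add(node)
        worklist.extend(references.get(node, []))
    return reachable


# Hypothetical heap: A points to B; C and D point only to each other.
heap_nodes = {"A": ["B"], "B": [], "C": ["D"], "D": ["C"]}
roots = {"A"}

live = reachable_nodes(roots, heap_nodes)
garbage = set(heap_nodes) - live
```

In this example nodes C and D reference each other but are not reachable from the root set, so both are garbage even though each is still pointed to by another node.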
Due to the above difficulties with reclamation of heap-allocated storage, automatic reclamation is an attractive alternative for dynamic memory management. The automatic identification and reclaiming of inaccessible heap segments is known as garbage collection. Garbage collection methodologies determine when a memory segment is no longer reachable by an executing program either directly or through a chain of pointers. When a memory segment is no longer reachable, the memory segment can be reclaimed and reused even if it has not been explicitly deallocated by the program. Garbage collection is particularly attractive for managed or functional languages (e.g., JAVA, Prolog, Lisp, Smalltalk, Scheme). For example, the JAVA programming language has the characteristic that pointers can only be provided to reference objects (e.g., the head of an object); it is illegal to provide an interior pointer to reference a data member or field inside an object. Thus, the garbage collection methodologies only need to identify object pointers during automatic reclamation of unreachable memory. However, unmanaged languages such as C and C++ allow interior pointers during development and execution of code. An interior pointer is a pointer into the heap that does not point at the base of an object. The garbage collection methodologies need to find all data reachable from an object, which requires a description of the object's fields. Objects have these types of descriptions, which are reachable if the base of the object is known. However, when an interior pointer is received, the location of the base of the object containing the data referenced by the interior pointer is unknown. Finding the base of the object is a relatively cumbersome and expensive operation. Therefore, unmanaged languages do not provide for automatic reclamation of dynamically allocated memory, and some memory leaks inevitably go undetected despite high-quality programming.
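The difficulty with interior pointers can be made concrete with a small sketch. It assumes, purely for illustration, that a sorted table of (base, size) pairs is available for every object in the heap; the function name `find_object_base` and the table layout are hypothetical and are not described in this document. The extra lookup shown here is the step that a collector handling only object pointers can skip.

```python
import bisect


def find_object_base(object_table, address):
    """Return the base of the object containing address, or None.

    object_table is a list of (base, size) pairs sorted by base address.
    """
    bases = [base for base, _ in object_table]
    # Locate the last object whose base is at or below the address.
    i = bisect.bisect_right(bases, address) - 1
    if i < 0:
        return None
    base, size = object_table[i]
    # The address is an interior (or base) pointer only if it falls
    # inside the extent of that object.
    return base if address < base + size else None


# Hypothetical heap layout: three objects with a gap between 48 and 64.
objects = [(0, 16), (16, 32), (64, 8)]

base_pointer = find_object_base(objects, 16)      # points at an object base
interior_pointer = find_object_base(objects, 20)  # points inside the same object
stray_address = find_object_base(objects, 50)     # falls in the gap
```

Here both address 16 and address 20 resolve to the object based at 16, while address 50 resolves to no object at all.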
The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive overview of the invention. It is intended to neither identify key or critical elements of the invention nor delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
A system and method is provided for executing both managed and unmanaged code in a managed environment and managing memory employing a garbage collection system or service. Managed code is code that manipulates objects that were allocated in the heap. Managed code is required to have a mechanism for enumerating all the live garbage collector pointers currently in use. Unmanaged code is code that does not manipulate garbage collector pointers and does not need to have a garbage collector enumeration mechanism. The code may be precompiled, compiled in real-time or interpreted. The system and method identify roots including object references and interior references on a stack. The object references and interior references are then reported to a garbage collection system or service. The garbage collection system or service employs the object references and interior references when tracing the heap for objects and data members (e.g., integer numbers, floating values, character fields, other objects) within the objects. Memory that is inaccessible is then reclaimed for assignment to other objects. The garbage collection system or service may be invoked periodically by an operating system, a memory manager or some other service. Alternatively, the garbage collection system can be invoked in response to a request for memory by an executing program. The garbage collection system can also be invoked when the heap becomes full.
In one aspect of the invention, a system and method is provided for identifying interior references (e.g., pointers) during execution of code in a run-time environment. Code is executed by a compiler (e.g., Just-In-Time compiler) and object references and interior references are stored in a process stack. The code can include both managed and unmanaged code. The interior references (e.g., references within an object) are created and stored on the stack by calls within the code. For example, a call to modify a field or data member within an object may be made within the code. In response, the compiler creates an interior reference, which is stored in the process stack. Dynamic memory management is employed periodically to clean up dead or unused objects from the stack and/or a heap containing globally shared objects or the like. The dynamic memory management can also be invoked in response to a memory request. A code manager then scans the stack and passes both the object references and the interior references to a garbage collector. The garbage collector then employs both the object references and interior references to reclaim the storage allocated to objects that are no longer alive.
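The flow described above, in which a code manager scans the stack and reports both kinds of references to the garbage collector, can be sketched as follows. This is an illustrative model under stated assumptions and not the implementation of the invention: the names `Heap`, `scan_stack` and `collect`, the tagging of stack slots as "object", "interior" or "value", and the mark-style tracing are all hypothetical choices made for the sketch.

```python
import bisect


class Heap:
    """Toy heap: base address -> (size, list of base addresses referenced)."""

    def __init__(self):
        self.objects = {}

    def add(self, base, size, refs=()):
        self.objects[base] = (size, list(refs))

    def base_of(self, address):
        """Resolve an interior address to the base of its containing object."""
        bases = sorted(self.objects)
        i = bisect.bisect_right(bases, address) - 1
        if i < 0:
            return None
        base = bases[i]
        size, _ = self.objects[base]
        return base if address < base + size else None


def scan_stack(stack):
    """Code manager (assumed): separate object references from interior references."""
    object_refs = [addr for kind, addr in stack if kind == "object"]
    interior_refs = [addr for kind, addr in stack if kind == "interior"]
    return object_refs, interior_refs


def collect(heap, object_refs, interior_refs):
    """Collector (assumed): trace from both kinds of roots, reclaim the rest."""
    worklist = list(object_refs)
    for address in interior_refs:
        base = heap.base_of(address)
        if base is not None:
            worklist.append(base)
    marked = set()
    while worklist:
        base = worklist.pop()
        if base in marked or base not in heap.objects:
            continue
        marked.add(base)
        worklist.extend(heap.objects[base][1])
    reclaimed = sorted(set(heap.objects) - marked)
    for base in reclaimed:
        del heap.objects[base]
    return reclaimed


heap = Heap()
heap.add(0, 16, refs=[16])  # rooted by an object reference on the stack
heap.add(16, 16)            # kept alive through the object at 0
heap.add(32, 16)            # kept alive only by an interior reference
heap.add(48, 16)            # referenced by nothing: garbage

# The slot tagged "value" is a plain integer that happens to equal an address.
stack = [("object", 0), ("interior", 40), ("value", 48)]
object_refs, interior_refs = scan_stack(stack)
reclaimed = collect(heap, object_refs, interior_refs)
```

In this sketch the object based at 32 survives only because the interior reference to address 40 is reported and resolved to its base, whereas the object at 48 is reclaimed because the matching integer on the stack is not reported as a reference.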