1. Technical Field
The present invention relates in general to memory management within data processing systems and in particular to de-allocation and coalescing of free space memory within data processing systems. Still more particularly, the present invention relates to mark-sweep garbage collection memory management for object oriented programs within data processing systems.
2. Description of the Related Art
"Objects," when considered in the field of computer programming and operation, refer to a defined programming unit that is usually one of many building blocks of object oriented programs. An object consists of data and operations (termed methods) that can be performed on that data. One advantage of object-oriented programming over standard, procedural programming, is that objects allow programmers to create modules that do not need to be changed when a new type of object is added. Java.TM., a trademark of Sun Microsystems, Inc. of Palo Alto, Calif., is one example of an object-oriented programming language.
Objects literally occupy discrete regions of memory and are not permitted to overlap. In other words, an object has a defined area in memory that it does not share with any other object. There are different states of objects including "live" objects that are currently being utilized in an ongoing process. Live objects have lifetimes that may be accounted for in the program and die only when the last reference to the object disappears. "Dead" objects, or objects that are not live, are objects whose lifetimes have expired and will not be utilized further in the computational process. "Heap" refers to that portion of memory that is reserved for a program to utilize when storing data whose size or existence may not be determined ahead of time.
"Garbage" refers to data stored in memory that is no longer being utilized. "Garbage collection" is the automatic detection and freeing of memory that is no longer in use. Object oriented programming usually includes memory management functions that identify and free up memory within the heap that is not being utilized. Performing the function of garbage collection eliminates the need for programmers to explicitly de-allocate memory and automatically reduces memory fragmentation. Typically, a garbage collection procedure reclaims and provides storage dynamically rather than specifically allocating and de-allocating memory on a scheduled basis
"Mark/sweep" is a specific type of garbage collection procedure that reclaims space by first marking all memory occupied by live objects on a first of two passes through memory. The first pass begins with at least one live anchor object (required for program operation) or root. This procedure involves recursively following pointers from a set of base pointers and setting mark bits (a bit on an object is set to one) within a memory allocation table for each storage area identified by a pointer within an actively used storage area. At the beginning of the mark function, all mark bits in the memory allocation table are set to zero. An initial mark bit, corresponding to the beginning of the live object, is set to one and mark bits corresponding to succeeding groups of bytes in the live object remain at zero. Mark bits, including the initial mark bit, remain at zero if the associated object is not a live object. After marking live objects, a second pass examines each object ("sweeping") by enumerating each object. As long as at least one live reference exists for an object, that object cannot be garbage-collected and storage containing the object may not be de-allocated; otherwise, the memory formerly occupied by the object is then de-allocated and made available for future allocations. Adjacent dead objects that have become available are coalesced into larger a free areas.
Referring to FIG. 5A-5B, a high-level flow diagram of the mark/sweep method is illustrated. As the collection process is for determining live objects, the mark function begins at step 500, which depicts the beginning of a mark/sweep garbage collecting process wherein all mark bits are reset to zero, whether live or not. The process proceeds to step 502, which illustrates the process emptying the "pending" list (a list containing live objects that have been marked and are to be scanned later). The process then passes to step 504, which depicts the mark function setting a mark bit for each root object (an object known to be live). The process passes to step 506, which illustrates each root object being added to the pending list. The process proceeds to step 507, which depicts a determination of whether or not the pending list is empty. If so, the process passes to the sweep function as described in FIG. 5B. If the pending list is not empty, the process continues to step 508, which depicts the process extracting an object (live) from the pending list.
The process proceeds next to step 510, which illustrates a determination of whether or not the extracted object references any other object. If not, the process returns to step 507. If the object is determined to reference another object, the process instead passes from step 510 to step 512, which illustrates a determination of whether or not all referenced objects are marked. If so, the process proceeds to step 507 and repeats the cycle. If not, the process instead proceeds to step 514, which illustrates adding all referenced unmarked objects to the pending list. The process then continues to step 516, which illustrates marking all referenced unmarked objects. The process next passes to step 507, to check if the pending list is empty. The process proceeds to the sweep function of the mark/sweep garbage collection process if the list is empty. If the list is not empty, the process returns to step 508.
Referring now to FIG. 5B, the sweep phase of the mark/sweep garbage collection process is depicted. The process begins with step 518, which depicts the process finding the first object. The process then proceeds to step 520, which illustrates inspecting the object. The process then passes to step 522, which depicts a determination of whether or not the object is live by testing the mark bit. If the object is not live the process proceeds to step 524, which illustrates the sweep function de-allocating the object and reclaiming the space. The process continues to step 526, which depicts the de-allocated object being coalesced with any adjacent free space. The process next passes to step 528, which illustrates a determination of whether or not all objects in memory have been swept.
Returning to step 522, if the object is live, the sweep function ignores the object and the process then passes to step 528 where the determination is made whether all objects have been swept. If all objects have been swept, the process continues to step 532, which depicts the mark/sweep garbage collection procedure returning to an idle state. If not, the process instead passes to step 530 which illustrates the sweep function seeking another object.
Since the sweep algorithm examines each object on the heap sequentially, finding the "next" object must be extremely fast. Many mark/sweep algorithms choose to place a length field on each object, so that each object begins with a "length" field that describes the object's size. The next object (actually the length field of the next object) is found by adding the current object's length field to the address of the length field itself. For example:
L1 Length of A First word of A . . . Last word of A L2 Length of B First word of B Last word of B . . .
L1 and L2 represents the addresses of the objects A and B, respectively. Given L1 we know where A is and B is found by adding "Length of A" to L1 to get L2. However, the length fields (Length of A, Length of B, etc.) occupy space in the heap. Since most objects tend to be small, a considerable amount of heap space is occupied by the length fields. If the length field information could be shared across many objects, memory in the heap could be conserved for program use.
The mark/sweep garbage collection scheme causes the system to stop all other operations while it is making the two passes through memory. The scheme processes live objects over and over and causes a considerable burden on the processor. Garbage collection runs repetitively and although its main benefit is to automatically reclaim memory storage to reduce memory design errors, it does expend processor time and interrupt ongoing applications. Since garbage collection is automatic and beyond the programmer's control, the collection process may become active at inconvenient times during an active program application.
"Copying" type garbage collectors move all live objects from one area to another area. In the new area, the live objects are copied to adjacent spaces, thereby eliminating any free space regions that may have separated them in the old area. The space the live objects were moved from is then considered empty. All the free space is treated as a single aggregate and no attention is paid to any dead object. However, implementing a copying scheme requires that any individual live object be movable during garbage collection. A disadvantage with copying garbage collectors is that garbage collection takes place more frequently because only half of the available memory is used at any one time.
It would be desirable, therefore, to provide a method and apparatus that would tolerate situations where objects may not be moved. It would also be desirable to reduce the burden on the system processor(s) by eliminating the need to examine each object in memory. It would further be desirable to provide a method and apparatus that would reduce the time required to interrupt an ongoing program application.