1. Technical Field
The present invention relates in general to improved garbage collection and in particular to improved efficiency in handling large objects during garbage collection. Still more particularly, the method, system, and program of the present invention provide improved distribution of the memory heap among sections for efficient parallel bitwise sweep of larger objects during garbage collection.
2. Description of the Related Art
Software systems, such as the Java Virtual Machine (JVM), that employ garbage collection typically provide an explicit call for allocating objects, but no explicit call for freeing objects. Instead, in a system that employs garbage collection, when available storage on a heap is exhausted, all operations are suspended and garbage collection is invoked to replenish the free storage.
A common garbage collection algorithm is called “mark and sweep”. During the mark phase, all Java objects that are still accessible to a program are identified and marked. Next, during the sweep phase, unmarked objects of the heap are identified as free space. In particular, free space is typically identified as the space bounded by marked objects or by the beginning or end of the heap.
In particular, during the sweep phase, the free spaces that are of a sufficiently large size are considered free items that may be arranged into lists or structures that facilitate subsequent object allocation. Any free memory fragments that are not of a sufficiently large size are deemed unnecessary and are not included in the list.
In sweep phase implementation, it is common for a garbage collection algorithm to require that the size of objects currently within the heap be derivable from an examination of the object. Additionally, some garbage collection algorithms may require that free memory fragments have a derivable size.
One approach for facilitating object and fragment size derivation is to include a prefix to every object and fragment, where the prefix indicates a size field. Another approach for facilitating object and fragment size derivation is to implement a bitwise sweep algorithm. To implement a bitwise sweep algorithm, a dedicated bit array that is independent of the heap identifies marked objects, where each bit represents a fixed storage size (e.g., 8 bytes) and each object is aligned to this size. At the onset of the mark phase, the dedicated bit array is cleared. Then, referenced objects are marked by setting the bit that represents the starting location of the object. Next, during the sweep phase, the bitwise sweep algorithm is implemented by scanning the dedicated bit array, searching for runs of zero bits bounded on each side by marked bits. When a sufficiently large run of zeroes bounded by marked bits is located, the object at the beginning of the run is examined and its size fetched. The size of the initial marked object is then subtracted from the size represented by the run of zeroes and, if the resulting size is sufficiently large, the storage bounded by the marked objects is considered a free item and is saved in a way that allows it to be used for subsequent object allocation.
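The bitwise sweep described above may be illustrated by the following minimal Java sketch. It is not taken from any particular virtual machine. The class name BitwiseSweepSketch, the method sweep, and the objectUnits array (a stand-in for examining an object to fetch its size) are hypothetical names introduced only for this illustration. The sketch also assumes that sizes are expressed in allocation units, that is, the fixed storage size represented by one bit, and that the distance from one marked bit to the next is the quantity from which the object size is subtracted.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class BitwiseSweepSketch {
    /**
     * Scans the mark bits and returns free items as {startUnit, lengthUnits} pairs.
     * objectUnits[i] is a hypothetical stand-in for examining the object that
     * starts at unit i and fetching its size in units.
     */
    static List<int[]> sweep(BitSet marks, int heapUnits, int[] objectUnits, int minFreeUnits) {
        List<int[]> freeItems = new ArrayList<>();
        int mark = marks.nextSetBit(0);

        // Space bounded by the beginning of the heap and the first marked object.
        int firstMark = (mark < 0 || mark > heapUnits) ? heapUnits : mark;
        if (firstMark >= minFreeUnits) {
            freeItems.add(new int[] {0, firstMark});
        }

        while (mark >= 0 && mark < heapUnits) {
            // The run of zero bits ends at the next marked bit or at the end of the heap.
            int next = marks.nextSetBit(mark + 1);
            int boundary = (next < 0 || next > heapUnits) ? heapUnits : next;
            int zeroRun = boundary - mark - 1;

            // Only a sufficiently large run justifies fetching the object size.
            if (zeroRun >= minFreeUnits) {
                int size = objectUnits[mark];
                int gap = (boundary - mark) - size;
                if (gap >= minFreeUnits) {
                    freeItems.add(new int[] {mark + size, gap});
                }
            }
            mark = next;
        }
        return freeItems;
    }
}
```

Note that in this sketch every zero bit covering the body of a marked object is passed over by the search for the next marked bit before the size of that object is consulted.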
Garbage collection becomes more complex in a multiprocessor system. In particular, a parallel garbage collector may be implemented to handle garbage collection in a multiprocessor system. A parallel garbage collector may implement sufficient helper threads to use all the available processors during garbage collection. In one example of an implementation of a parallel sweep phase, the heap may be divided among multiple sections so that each helper thread can work on unique sections and not impede the other helper threads.
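One possible way to divide the heap among sections for helper threads is sketched below in Java. The sketch is an assumption-laden illustration rather than the implementation of any existing collector: the names ParallelSectionSweepSketch, sweepSections, and longestZeroRun are hypothetical, the shared counter is merely one way to hand out unique sections, and the per-section result (the longest run of zero bits inside the section) is a simplified placeholder for the section data that a real sweep would record.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class ParallelSectionSweepSketch {
    /** Longest run of zero bits within [from, to), clipped at the section bounds. */
    static int longestZeroRun(BitSet marks, int from, int to) {
        int best = 0;
        int i = marks.nextClearBit(from);
        while (i < to) {
            int end = marks.nextSetBit(i);
            if (end < 0 || end > to) {
                end = to;
            }
            best = Math.max(best, end - i);
            i = marks.nextClearBit(end);
        }
        return best;
    }

    /** Sweeps all sections with helperCount helper threads; returns one result per section. */
    static int[] sweepSections(BitSet marks, int heapUnits, int sectionUnits, int helperCount) {
        int sectionCount = (heapUnits + sectionUnits - 1) / sectionUnits;
        int[] longestRun = new int[sectionCount];
        AtomicInteger nextSection = new AtomicInteger();

        List<Callable<Void>> helpers = new ArrayList<>();
        for (int h = 0; h < helperCount; h++) {
            helpers.add(() -> {
                int s;
                // Each helper claims the next unclaimed section, so no two
                // helpers ever work on the same section.
                while ((s = nextSection.getAndIncrement()) < sectionCount) {
                    int from = s * sectionUnits;
                    int to = Math.min(from + sectionUnits, heapUnits);
                    longestRun[s] = longestZeroRun(marks, from, to);
                }
                return null;
            });
        }

        ExecutorService pool = Executors.newFixedThreadPool(helperCount);
        try {
            // Waiting on every Future makes the helpers' writes visible here.
            for (Future<Void> done : pool.invokeAll(helpers)) {
                done.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sweep interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("helper thread failed", e);
        } finally {
            pool.shutdown();
        }
        return longestRun;
    }
}
```

The bit array is only read while the helpers run, which is why the sketch shares it among threads without additional synchronization.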
When all sections have been processed by helper threads, the helper threads are suspended and the garbage collector enters single thread mode. In single thread mode, a single thread examines all the section data and may identify free items that span sections. The single thread arranges all the identified free items into appropriate structures for subsequent allocations. In particular, during single thread mode, all other threads of the Java process are suspended and only one processor is used by the Java process.
When a Java application is run on a multiprocessor system, it is important that parallel garbage collectors are as efficient as possible. First, it is important to make helper threads function as efficiently as possible because when the helper threads are executing on the multiple processors, the Java application threads are suspended. In particular, it is important to have helper threads finish sweeping all sections close to the same time to reduce the amount of time that processors remain idle waiting for other helper threads to finish. Second, it is important to make the single thread mode runtime as short as possible because when the single thread executes during single thread mode, all other Java threads are suspended.
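The identification of free items that span sections may be pictured with the following Java sketch. The layout of the section data is an assumption made for illustration only: leading[s] and trailing[s] are hypothetical per-section values giving the free storage, in units and already adjusted for object sizes, at the start and end of section s, and a section containing no live storage is assumed to report both values equal to the section size. Free items lying wholly inside a section are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class SectionMergeSketch {
    /**
     * Single-threaded pass over per-section data. Returns free items that
     * touch a section boundary or a heap end, as {startUnit, lengthUnits} pairs.
     */
    static List<int[]> mergeBoundaryRuns(int[] leading, int[] trailing,
                                         int sectionUnits, int minFreeUnits) {
        List<int[]> items = new ArrayList<>();
        int runStart = 0;
        int runLen = 0; // free units carried over from earlier sections

        for (int s = 0; s < leading.length; s++) {
            if (runLen == 0) {
                runStart = s * sectionUnits;
            }
            if (leading[s] == sectionUnits) {
                // Wholly free section: extend the carried run and move on.
                runLen += sectionUnits;
                continue;
            }
            // The section holds live storage: its leading free space closes the run.
            runLen += leading[s];
            if (runLen >= minFreeUnits) {
                items.add(new int[] {runStart, runLen});
            }
            // Its trailing free space starts the next carried run.
            runLen = trailing[s];
            runStart = (s + 1) * sectionUnits - trailing[s];
        }
        // Free space bounded by the end of the heap.
        if (runLen >= minFreeUnits) {
            items.add(new int[] {runStart, runLen});
        }
        return items;
    }
}
```

Because this pass visits every section once on a single thread, its running time in the sketch grows with the number of sections.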
The bitwise sweep algorithm is one efficient method of facilitating sweeps when objects are small, but the current bitwise sweep algorithm is inefficient when used to facilitate object size derivation for large objects, and in particular large objects that are significantly larger than the size requirement for addition to a free list. For example, large database applications and applications that display images may instantiate objects that are several megabytes in size, while the typical free list may only require unmarked objects to be several hundred bytes to be eligible for storage in a free list. Under the current bitwise sweep, each bit representing the length of the initial marked object must be scanned and a marked bit reached before the size of the object is fetched. Thus, the current bitwise sweep algorithm is inefficient because if the size of the fetched object is large, many bits are unnecessarily scanned.
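The scale of this inefficiency may be estimated with a short Java calculation. The figures used in the usage note are illustrative assumptions consistent with the ranges mentioned above (an object of 4 megabytes, a free list threshold of 512 bytes, and 8 bytes of storage per bit); the class and method names are hypothetical.

```java
final class ScanCostSketch {
    /** Number of mark bits that cover the given number of bytes, rounded up. */
    static long bitsForBytes(long bytes, int unitBytes) {
        return (bytes + unitBytes - 1) / unitBytes;
    }
}
```

Under these assumptions, bitsForBytes(4L * 1024 * 1024, 8) yields 524,288 bits for the large object, whereas bitsForBytes(512, 8) yields only 64 bits for the smallest run that would qualify for the free list. Thousands of times more bits are therefore scanned than are needed to establish that the run is large enough to warrant fetching the object size.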
In allocating sections, multiple small sections may be allocated to promote sweep efficiency. In particular, when there are multiple small sections, the time that a helper thread spends sweeping each section is reduced, allowing the helper threads to finish sweeping all the sections at more nearly the same time. However, as the number of small sections increases, the amount of time required for single thread mode increases, which may effectively decrease the efficiency gained from allocating multiple small sections. In addition, while allocating multiple small sections may promote efficient sweeps of small objects, sweeping multiple small sections for larger objects is inefficient. In particular, when a larger marked object extends across multiple small sections, in a bitwise sweep, each bit of each portion of the larger object is inefficiently scanned within each section.
Therefore, in view of the foregoing, there is a need for a method, system, and program for improving the efficiency of bitwise sweeps in a parallel garbage collector and the efficiency of section dispersal to improve the efficiency of handling larger objects, and in particular for handling objects that are substantially larger than the size required for addition of the object to the free list.