1. Technical Field
The present invention relates generally to a technique for automatically reclaiming the memory space which is occupied by data objects referred as garbage that the running program threads will not access any longer and relates particularly to a method of delaying space allocation for parallel copying garbage collection.
2. Prior Art
Garbage collection is the automatic reclamation of computer storage. While in many systems programmers must explicitly reclaim heap memory at some point in the program, by using a  less than  less than free greater than  greater than  or less than  less than dispose greater than  greater than  statement, garbage collected systems free the programmer from this burden. The garbage collector""s function is to find data objects that are no longer in use and make their space available for reuse by the running program. An object is considered garbage, and subject to reclamation, if it is not reachable by the running program thread via any path of pointer traversals. Live (potentially reachable) objects are preserved by the collector, ensuring that the program can never traverse a  less than  less than dangling pointer greater than  greater than  into a deallocated object.
The basic functioning of a garbage collector consists, abstractly speaking, of two parts
1. Distinguishing the live objects from the garbage in some way, or garbage detection, and
2. Reclaiming the garbage objects"" storage, so that the running program thread can use it.
In practice, these two phases may be functionally or temporally interleaved, and the reclamation technique is strongly dependent on the garbage detection technique.
In general, the garbage collectors use a less than  less than liveness greater than  greater than  criterion that is somewhat more conservative than those used by other systems. This criterion is defined in terms of a root set and reachability from these roots. At the point when garbage collection occurs, all globally visible variables of active procedures are considered live, and so are the local variables of any active procedures. The root set therefore consists of the global variables, local variables in the activation stack, and any registers used by active procedures. Heap objects directly reachable from any of these variables could be accessed by the running program thread, so they must be preserved. In addition, since the program might traverse pointers from those objects to reach other objects, any object reachable from a live object is also live. Thus, the set of live objects is simply the set of objects on any directed path of pointers from the roots.
Any object that is not reachable from the root set is garbage, i.e., useless, because there is no legal sequence of program actions that would allow the program to reach that object. Garbage objects therefore cannot affect the course of the computation, and their space may be safely reclaimed.
Given the basic two-part operation of a garbage collector, several variations are possible. The first part, that is distinguishing live objects from garbage, may be done by several methods. Among them, copying garbage collection does not really collect garbage. Rather, it moves all of the live objects into one area of the heap (space in the memory where all objects are held) whereas the area of objects that were copied can be reused for new objects.
A very common kind of copying garbage collection is the semi-space collector. In this scheme, the space devoted to the heap is subdivided into two parts, a current area or from-space and a reserved area or to-space. During normal program execution, only the from-space is in use. When the running program thread requests an allocation that will not fit in the unused area of the from-space, the program thread is stopped and the copying garbage collector is called to reclaim space. The roles of the current area and reserved area are flipped, that is all the live data is copied from the from-space to the to-space.
Once the copying is completed, the to-space is made the current area and program execution is resumed. Thus, the roles of the two spaces are reversed each time the garbage collector is invoked.
The advantages of the copying garbage collection are that the heap is compacted during each collection, the low complexity of the algorithm which touches only the live objects (rather than all the heap as in the Mark and Sweep algorithm) and the simplicity of allocation.
The algorithm used in the copying garbage collection are as follows:
1. Stop the program threads
2. Flip the roles of from-space and to-space
3. Scan the roots in each program thread (mutator) and also the global roots.xe2x80x94each object referenced by a root (a son of a root) is copied into to-space if it is not yet copied and a forwarding pointer is written in the original object of from-space, and
the root pointer is updated to point to the new copy of the object in to-space.
4. Scan to-space
each son of an object in to-space is copied into to-space if it is not yet copied, and a forwarding pointer is written in the original son, and
the pointer in the father object is updated to point to the new replica of the son in to-space.
5. Reclaim the from-space area
6. Release the program threads
Note that, although it is preferable to copy the objects after flipping the roles of from-space and to-space, the operation of flipping could be done after the operation of copying the objects.
One important part of the above algorithm is the part where the collector thread checks the roots and the live objects in the heap, and copies the live objects into to-space. The location of the copied objects in to-space is determined by modifying a single allocation pointer. But, working with this single pointer is no longer efficient without synchronization when several collector threads perform the copies since they compete on the same resource that is the pointer, causing thus an unacceptable contention. Another problem raised with several collector threads is the synchronization of the operation thereof since they use the same heap.
A solution is to let the collector threads allocate more space than they actually need. Namely, when a collector thread needs to copy an object into to-space, it actually allocates a big area (e.g. one page), copies the object, and keeps copying the next objects into this area until there is no more room and a new area has to be allocated. This method solves the contention problem in allocations to to-space since these allocations become much less frequent. However, a new problem arises: the fragmentation of to-space due to unused  less than  less than holes greater than  greater than  in the heap. To remedy this problem, it has been suggested that each collector thread allocates several areas, each area being characterized by the size of the objects allocated therein (each area is associated with a range of sizes). This scheme reduces the waste of space to half the space at worst case and less in practice. However, it requires a complicated management and it does not overcome the fragmentation problem.
Accordingly, the object of the invention is to achieve a method of space allocation for copying garbage collection allowing a parallel work of several parallel collector threads thereby achieving low contention.
Another object of the invention is to achieve a method of delaying space allocation for parallel copying garbage collection which completely eliminates the fragmentation of the heap.
Another object of the invention is to provide a data processing system implementing a method wherein the space allocation for copying garbage collection using several parallel collector threads is delayed until the accumulated size of the space required for copying a number of scanned objects reaches a predetermined value, thereby achieving low contention and eliminating the fragmentation of the heap.
Therefore, the invention relates to a method of delaying space allocation for parallel copying garbage collection in a data processing system comprising a memory divided in a current area (from-space) used by at least a program thread during current program execution and a reserved area (to-space), and wherein a copying garbage collection is run in parallel by several collector threads, the garbage collection consisting in, after the program threads have been stopped, flipping the roles of the current area and reserved area and copying into the reserved area the live objects stored in the current area, the allocation of space in the reserved area to copy the live objects therein being delayed until the accumulated size of the space required for copying a number of checked objects reaches a predetermined value.
The method of the invention comprises the steps of checking by one collector thread the live objects of the current area to be copied in the reserved area, the live objects being referenced by a list of pointers, storing, for each live object, a record into an allocation log, this record including at least the address of the pointer to the object, adding the size of the object to a total_size which is the accumulated size of all the checked objects for which a record has been stored in the allocation log, and copying all the checked objects into the reserved area when the value of total_size reaches a predetermined allocation bound.