The present invention relates generally to a technique for automatically reclaiming the memory space which is occupied by data objects, referred to as garbage, that the running program will not access any longer and relates particularly to a method of replication-based garbage collection in a multiprocessor environment.
Garbage collection is the automatic reclamation of computer storage. While in many systems programmers must explicitly reclaim heap memory at some point in the program, by using a "free" or "dispose" statement, garbage collected systems free the programmer from this burden. The garbage collector's function is to find data objects that are no longer in use and make their space available for reuse by the running program. An object is considered garbage, and subject to reclamation, if it is not reachable by the running program via any path of pointer traversals. Live (potentially reachable) objects are preserved by the collector, ensuring that the program can never traverse a "dangling pointer" into a deallocated object.
The basic functioning of a garbage collector consists, abstractly speaking, of two parts:
1. Distinguishing the live objects from the garbage in some way, or garbage detection, and
2. Reclaiming the garbage objects' storage, so that the running program can use it.
In practice, these two phases may be functionally or temporally interleaved, and the reclamation technique is strongly dependent on the garbage detection technique.
In general, garbage collectors use a "liveness" criterion that is somewhat more conservative than those used by other systems. This criterion is defined in terms of a root set and reachability from these roots. At the point when garbage collection occurs, all globally visible variables are considered live, and so are the local variables of any active procedures. The root set therefore consists of the global variables, the local variables in the activation stack, and any registers used by active procedures. Heap objects directly reachable from any of these variables could be accessed by the running program, so they must be preserved. In addition, since the program might traverse pointers from those objects to reach other objects, any object reachable from a live object is also live. Thus, the set of live objects is simply the set of objects on any directed path of pointers from the roots. Any object that is not reachable from the root set is garbage, i.e., useless, because there is no legal sequence of program actions that would allow the program to reach that object. Garbage objects therefore cannot affect the future course of the computation, and their space may be safely reclaimed.
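By way of illustration only, the reachability criterion described above can be sketched as a traversal from the root set. The names `live_objects`, `heap` and `roots`, and the representation of the heap as a mapping from an object identifier to the identifiers it points to, are assumptions made for this sketch and are not taken from the description.

```python
from collections import deque

def live_objects(roots, heap):
    """Return the set of object ids reachable from the roots.

    heap maps an object id to the list of object ids it points to.
    """
    live = set()
    worklist = deque(roots)
    while worklist:
        obj = worklist.popleft()
        if obj in live:
            continue              # already visited, so cycles terminate
        live.add(obj)
        worklist.extend(heap[obj])
    return live

# Hypothetical heap: a -> b -> c, while d and e only point at each other.
heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}
roots = ["a"]
live = live_objects(roots, heap)
garbage = set(heap) - live
```

In this sketch the objects `d` and `e` reference each other but lie on no path from the roots, so both are treated as garbage.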
Given the basic two-part operation of a garbage collector, several variations are possible. The first part, that of distinguishing live objects from garbage, may be done by several methods. Among them, the method of copying garbage collection does not really collect garbage. Rather, it moves all of the live objects into one area of the heap (the space in the memory where all objects are held), so that the rest of the heap, which then holds only garbage, can be reused for new objects.
A very common kind of copying garbage collection is the semi-space collector. In this scheme, the space devoted to the heap is subdivided into two parts, a current area or from-space and a reserved area or to-space. During normal program execution, only the from-space is in use. When the running program requests an allocation that will not fit in the unused area of the from-space, the program is stopped and the copying garbage collector is called to reclaim space. The roles of the current area and reserved area are flipped; that is, all the live data are copied from the from-space to the to-space. Once the copying is completed, the to-space is made the current area and program execution is resumed. Thus, the roles of the two spaces are reversed each time the garbage collector is invoked.
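The semi-space scheme described above may be sketched as follows. This is a minimal model under stated assumptions: the class `Obj`, the function `collect`, and the use of Python lists to stand for the two spaces are hypothetical, and the breadth-first scan with a forwarding table (in the style of Cheney's algorithm) is one possible way of copying the live data, not necessarily the one used by any particular collector.

```python
class Obj:
    """A heap object: a payload plus a list of references to other objects."""
    def __init__(self, value, refs=None):
        self.value = value
        self.refs = refs if refs is not None else []

def collect(roots):
    """Copy every object reachable from roots into a fresh to-space.

    Returns (new_roots, to_space). The originals are read but never modified.
    """
    to_space = []
    forward = {}  # id(original) -> its copy in to-space

    def copy(obj):
        if id(obj) not in forward:
            clone = Obj(obj.value, list(obj.refs))  # refs still point into from-space
            forward[id(obj)] = clone
            to_space.append(clone)
        return forward[id(obj)]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):          # scan pointer chases the allocation end
        clone = to_space[scan]
        clone.refs = [copy(r) for r in clone.refs]  # redirect into to-space
        scan += 1
    return new_roots, to_space

# Hypothetical from-space: a -> b -> c and a -> c; "dead" is unreachable.
c = Obj("c")
b = Obj("b", [c])
a = Obj("a", [b, c])
dead = Obj("dead", [a])
from_space = [a, b, c, dead]

roots, to_space = collect([a])
from_space = to_space                    # the to-space becomes the current area
```

Only the three reachable objects are copied; the object named `dead` is never visited, and its storage is reclaimed simply by abandoning the old from-space.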
The technique of replication-based garbage collection is to let the collector work in parallel with the program threads or mutators. In contrast to previous copying garbage collection algorithms, replication-based garbage collection delays the flip until the end of the collection cycle. While the mutators keep running and operate on from-space, the collector replicates the live objects from the from-space to the to-space. Finally, in the flip stage, the mutators are stopped and then roots are updated to point to the replicated objects in the to-space.
During replication, however, objects in from-space keep changing, and these changes have to be reflected in the to-space replica. In order to keep the replica consistent, the mutators record all modifications in a mutation log. The collector flips only after it has cleared the mutation log (i.e., has applied each logged update to the replica). More precisely, the collector stops the mutator threads for a short pause during which it updates the mutator roots, and then flips the roles of from-space and to-space.
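The interplay between replication and the mutation log may be illustrated by the following single-threaded sketch. The class `Heap` and its methods are hypothetical names; the spaces are modelled as dictionaries from a location to its value, liveness tracing is omitted, and the interleaving of mutator and collector is simulated sequentially rather than with real threads.

```python
class Heap:
    def __init__(self, slots):
        self.from_space = dict(slots)   # location -> value, used by the mutators
        self.to_space = {}              # replica built by the collector
        self.mutation_log = []          # locations written since replication began

    def write(self, loc, value):
        """Mutator write: update from-space, then log the location."""
        self.from_space[loc] = value
        self.mutation_log.append(loc)

    def replicate(self, locs):
        """Collector: copy the given locations into to-space."""
        for loc in locs:
            self.to_space[loc] = self.from_space[loc]

    def drain_log(self):
        """Collector: reapply every logged update to the replica."""
        while self.mutation_log:
            loc = self.mutation_log.pop()
            self.to_space[loc] = self.from_space[loc]

    def flip(self):
        """Mutators are assumed to be stopped here; swap the roles of the spaces."""
        self.drain_log()
        self.from_space, self.to_space = self.to_space, {}

h = Heap({"x": 1, "y": 2})
h.replicate(["x", "y"])       # collector copies while the mutator keeps running
h.write("x", 10)              # mutator updates from-space after the copy
stale = dict(h.to_space)      # the replica is now out of date for "x"
h.flip()                      # the log is cleared before the roles are swapped
```

Without the log, the replica would keep the stale value of `x`; draining the log before the flip brings the replica up to date.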
However, the above replication-based garbage collection is not suitable for a modern multiprocessor system wherein it is not guaranteed that the operations executed by one processor always appear in the same order in the view of another processor. Thus, it is possible that the collector will see the update of a location only after it reads the corresponding entry in the mutation log. From the collector's standpoint, this means that it might copy the contents of the location before the new value actually appears in its view. As a consequence, the new replica in to-space will contain an outdated value of the location which, furthermore, will never be updated.
Accordingly, the main object of the invention is to provide a new method of replication-based garbage collection which can be run in a multiprocessor system without the risk that the contents of memory locations will be replicated from the current area to the reserved area without their updates being taken into consideration.
Therefore, the invention relates to an improved method of replication-based garbage collection in a multiprocessing system comprising a plurality of processors, a memory divided into a current area (from-space) used by the processors during current program execution and a reserved area (to-space), and at least one garbage collector for performing, when necessary, a garbage collection consisting of flipping the roles of the current area and reserved area after all the live objects stored in the current area have been copied into the reserved area and for reclaiming the current area after the flipping operation. Several program threads (mutators) are currently running in parallel and the garbage collector performs the garbage collection in parallel with the program threads, the flipping operation being performed after the program threads have been stopped and the garbage collection has been completed. The method of replication-based garbage collection comprises the steps of storing, during normal program execution, a record in a local buffer allocated to each program thread each time this program thread updates a memory location, and adding this local buffer, when full, to a global list of buffers using a first synchronization operation, and, during garbage collection, removing the local buffers one by one from the global list of buffers using a second synchronization operation, looping over records in each removed local buffer, and copying the updated memory locations into the reserved area until the global list is empty.
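The steps of the method set out above may be sketched as follows. This is a simplified model, not the claimed implementation: the names `Mutator`, `publish`, `collector_drain` and `BUFFER_SIZE` are hypothetical, a single mutex is assumed to stand for both the first and the second synchronization operation, and each record is assumed to hold only the updated location. Python threads do not exhibit hardware memory reordering, so the sketch shows the structure of the protocol only; on a real multiprocessor, the synchronization operations would be relied upon to make the updates visible to the collector before it reads the records.

```python
import threading

BUFFER_SIZE = 4                      # hypothetical capacity of a local buffer
global_buffers = []                  # global list of full local buffers
global_lock = threading.Lock()       # stands in for both synchronization operations
from_space = {}                      # current area, written by the mutators
to_space = {}                        # reserved area, written by the collector

class Mutator:
    def __init__(self):
        self.local = []              # local buffer, private to this thread

    def write(self, loc, value):
        from_space[loc] = value      # update the memory location first
        self.local.append(loc)       # then store a record in the local buffer
        if len(self.local) == BUFFER_SIZE:
            self.publish()

    def publish(self):
        with global_lock:            # first synchronization operation
            global_buffers.append(self.local)
        self.local = []              # start a fresh local buffer

def collector_drain():
    """Remove buffers one by one and copy the recorded locations to to-space."""
    while True:
        with global_lock:            # second synchronization operation
            if not global_buffers:
                return               # global list is empty
            buf = global_buffers.pop()
        for loc in buf:              # loop over the records of the removed buffer
            to_space[loc] = from_space[loc]

def run(mutator, name):
    for i in range(8):               # 8 writes fill exactly two local buffers
        mutator.write((name, i % 3), i)

threads = [threading.Thread(target=run, args=(Mutator(), n)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
collector_drain()
```

Because each mutator writes only to its own local buffer between publications, synchronization is needed once per full buffer rather than once per update. The sketch does not handle a partially filled buffer remaining at the time of the flip.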