The present invention relates to a memory reclamation method and apparatus and in particular, though not exclusively, to such method and apparatus in which repeated sweeps of memory using alternative algorithms are made.
Garbage collection is the automated reclamation of system memory space after its last use by a programme. A number of examples of garbage collecting techniques are discussed in "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by R. Jones et al, pub. John Wiley and Sons 1996, ISBN 0-471-94148-4, at pages 1 to 18, and "Uniprocessor Garbage Collection Techniques" by P. R. Wilson, Proceedings of the 1992 International Workshop on Memory Management, St. Malo, France, September 1992. Whilst the storage requirements of many computer programs are simple and predictable, with memory allocation and recovery being handled by the programmer or a compiler, there is a trend toward functional languages having more complex patterns of execution such that the lifetimes of particular data structures can no longer be determined prior to run-time and hence automated reclamation of this storage, as the program runs, is essential.
A common feature of a number of garbage collection techniques, as described in the above-mentioned Wilson reference, is incrementally traversing the data structure formed by referencing pointers carried by separately stored data objects. The technique involves first marking all stored objects that are still reachable from other stored objects or from external locations by tracing a path or paths through the pointers linking data objects.
This may be followed by sweeping or compacting the memory, that is to say, examining every object stored in the memory to determine the unmarked objects whose space may then be reclaimed.
Each garbage collection algorithm has its own particular strengths and weaknesses. For example a mark-sweep garbage collector is able to detect all unused objects and reclaim the memory occupied by them in a single mark-sweep pass through the memory heap. However, garbage cannot be identified for certain until all used objects have been marked. In contrast, a reference counting garbage collector is able to detect unused objects and immediately reclaim memory occupied by them. Unfortunately, reference counting cannot, by itself, identify unused circular loops of objects, where the tail of a list is linked to the head.
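By way of illustration only, the contrast between the two techniques can be sketched on a toy heap. The dictionary layout, the object names and the helper functions `reference_counts` and `mark` are assumptions made for this sketch and do not appear in the cited references.

```python
# Toy heap (hypothetical layout): object name -> names it points to.
heap = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "c": ["d"],  # c and d reference each other but are unreachable
    "d": ["c"],
}
roots = ["root"]


def reference_counts(heap, roots):
    """Count incoming pointers per object; each root gets one external reference."""
    counts = {name: 0 for name in heap}
    for name in roots:
        counts[name] += 1
    for targets in heap.values():
        for target in targets:
            counts[target] += 1
    return counts


def mark(heap, roots):
    """Return the set of objects reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name not in marked:
            marked.add(name)
            stack.extend(heap[name])
    return marked


counts = reference_counts(heap, roots)
garbage_by_refcount = sorted(n for n, c in counts.items() if c == 0)
garbage_by_marking = sorted(set(heap) - mark(heap, roots))

print(garbage_by_refcount)  # []
print(garbage_by_marking)   # ['c', 'd']
```

In this sketch every object in the loop keeps a non-zero count, so counting alone reports no garbage, whereas marking from the roots exposes both members of the loop.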
Furthermore, some garbage collected languages and environments such as Java (© Sun Microsystems Inc.), Modula-3 and Cedar support the concept of finalisation, for which garbage collection algorithms must cater. Stored data objects created by a program process may have an associated finaliser procedure which is to be executed after the object is detected as unmarked and nominally becomes available for garbage collection but before the memory occupied by the data object is reclaimed. The purpose of this feature is to allow an object to clean up any other system resources the object has claimed before it is destroyed. For example, the finaliser for a Java File Object would close all the system file handles held by the object.
However, a finaliser is just a special type of procedure associated with an object with all the power of the programming language available to it. The finaliser procedure can therefore access and manipulate all data objects accessible from the object being finalised. Therefore, all objects accessible by a finaliser, such as descendant objects accessible from referencing pointers held by the data object, must be explicitly excluded from garbage collection. Furthermore, it is possible for the finaliser method to resurrect any such objects accessible by a finaliser, including the object being finalised itself, by making the object accessible to the program process again. Consequently, a garbage collection procedure cannot delete any objects that are accessible by a finalisable object until its finaliser has executed and the accessibility of the objects has been re-evaluated. In Java and other languages, the possibility of an object repeatedly resurrecting itself is typically removed by stating that the finaliser for each instance is executed only once. This control on finalisation will be assumed herein.
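The constraint described above may be sketched as follows, purely as an illustration. The toy heap, the `finalisable` and `finalised` sets and the functions `reachable` and `collect` are hypothetical names introduced for this sketch; the finaliser itself is not modelled and is assumed to resurrect nothing.

```python
def reachable(heap, starts):
    """Return every object reachable from the given starting objects."""
    seen, stack = set(), list(starts)
    while stack:
        name = stack.pop()
        if name in heap and name not in seen:
            seen.add(name)
            stack.extend(heap[name])
    return seen


def collect(heap, roots, finalisable, finalised):
    """One collection cycle that spares anything a pending finaliser can reach."""
    live = reachable(heap, roots)
    # Unreachable objects whose finaliser has not yet run (run-once rule).
    pending = [n for n in heap
               if n in finalisable and n not in live and n not in finalised]
    # Everything a pending finaliser could touch must survive this cycle.
    protected = reachable(heap, pending)
    reclaimed = sorted(set(heap) - live - protected)
    for name in reclaimed:
        del heap[name]
    return pending, reclaimed


heap = {
    "root": ["live"],
    "live": [],
    "f": ["child"],  # unreachable, has a finaliser
    "child": [],     # unreachable, but accessible to f's finaliser
    "junk": [],      # unreachable, not accessible to any finaliser
}
finalisable = {"f"}
finalised = set()

first_pending, first_reclaimed = collect(heap, ["root"], finalisable, finalised)
# The finaliser for "f" would execute here; it is then recorded as having run.
finalised.update(first_pending)
second_pending, second_reclaimed = collect(heap, ["root"], finalisable, finalised)

print(first_pending, first_reclaimed)    # ['f'] ['junk']
print(second_pending, second_reclaimed)  # [] ['child', 'f']
```

In this sketch the finalisable object and its descendant survive the first cycle and are reclaimed only once accessibility has been re-evaluated after the finaliser has run.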
In PCs or workstations, the extra processing and memory load to support finalisation is not usually a problem due to the amount of memory typically available in a PC, although the support will, of course, affect the overall efficiency. In low-memory environments such as set-top boxes, however, support for finalisers can cause problems and even a concurrent or incremental garbage collector may have to halt the program until it has executed some or all of the outstanding finalisers and reclaimed any memory used by them.
According to a first aspect of the present invention, there is provided a method of reclaiming memory space allocated to data structures comprising data objects linked by identifying pointers, in which the memory allocated to data objects is reclaimed using two systems: a first system, by which the data structure is traversed to identify those objects to which no references are made by the pointers of other objects, and reclaiming the memory allocated to those objects to which no references are made; and a second system, which determines which objects are not descendants of root objects and reclaiming the memory allocated to those objects, wherein cycles of the first system are interleaved with cycles of the second system.
The second system may determine which objects are descendants of root objects from a mark associated with each object, which mark has been set by the first system if the object is a descendant of a root object.
An advantage of the present invention is that one traversal of the heap gives information required for memory reclamation using both systems.
The cycles of the first system may reclaim the memory allocated to a group of data objects the pointers of which reference each other but none of which are referred to by the pointer of a root object.
The interleaving of first and second systems may be performed according to predetermined criteria including: cycles of the first system may be performed until no unreferenced objects are found, followed by a cycle of the second system; a cycle of the first system may be interleaved between cycles of the second system; a first number of cycles of the first system are interleaved between a second number of cycles of the second system. A global indicator may dictate from which system the next memory reclamation cycle will be derived.
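One of the interleaving criteria listed above, namely running cycles of the first system until no unreferenced objects are found and then running a cycle of the second system, may be sketched as follows. This is a simplified model under stated assumptions: the toy heap, the function names and the `next_system` variable standing in for the global indicator are all hypothetical, and each system is reduced to its simplest form.

```python
def first_system_cycle(heap, roots):
    """Delete objects to which no pointer of any object refers (roots excepted)."""
    referenced = {t for targets in heap.values() for t in targets}
    dead = [n for n in heap if n not in referenced and n not in roots]
    for name in dead:
        del heap[name]
    return dead


def second_system_cycle(heap, roots):
    """Delete objects that are not descendants of a root object."""
    marked, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name not in marked:
            marked.add(name)
            stack.extend(heap[name])
    dead = [n for n in heap if n not in marked]
    for name in dead:
        del heap[name]
    return dead


def reclaim(heap, roots, max_cycles=10):
    """Interleave cycles; next_system plays the role of the global indicator."""
    next_system = "first"
    log = []
    for _ in range(max_cycles):
        if next_system == "first":
            dead = first_system_cycle(heap, roots)
            log.append(("first", dead))
            if not dead:  # nothing unreferenced left: hand over
                next_system = "second"
        else:
            log.append(("second", second_system_cycle(heap, roots)))
            break
    return log


heap = {
    "root": ["a"],
    "a": [],
    "x": ["y"],  # unreachable chain x -> y -> z
    "y": ["z"],
    "z": [],
    "p": ["q"],  # unreachable pair referencing each other
    "q": ["p"],
}
log = reclaim(heap, {"root"})
for entry in log:
    print(entry)
```

In this toy model the chain is dismantled one object per cycle of the first system, and the mutually referencing pair is left for the cycle of the second system; the other criteria listed above would change only the rule that updates `next_system`.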
The memory space may be traversed in a first direction during even numbered cycles and traversed in a second, alternate, direction during odd numbered cycles. A first mark may be associated with objects referenced by pointers of other objects found during the traversal in the first direction. A second mark may be associated with objects referenced by pointers of other objects found during the traversal in the second direction. An object found to be unreferenced by pointers of other objects traversed in one direction and not having the mark from a prior traversal in the alternate direction may be deleted.
Changes to pointers referencing objects may be monitored and the first system may only traverse a data structure to identify those objects to which no references are made by the pointers of other objects when a change to a pointer referencing a constituent object of the data structure occurs. If a change to a pointer which uniquely references an object occurs, the uniquely referenced object may be immediately deleted.
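The monitoring of pointer changes may be pictured as a write barrier, sketched below under stated assumptions. The `Heap` class, its per-object referrer counts, the `dirty` flag and the method names are hypothetical devices for this sketch; the text above does not prescribe how unique references are detected.

```python
class Heap:
    """Toy heap whose pointer writes are monitored (hypothetical design)."""

    def __init__(self):
        self.fields = {}     # object name -> {slot name: target name or None}
        self.referrers = {}  # object name -> number of pointers referring to it
        self.dirty = False   # True when a traversal by the first system is warranted

    def new(self, name, root=False):
        self.fields[name] = {}
        # A root is modelled as holding one permanent external reference.
        self.referrers[name] = 1 if root else 0

    def set_pointer(self, src, slot, target):
        """Write a pointer, watching the object that was previously referenced."""
        old = self.fields[src].get(slot)
        self.fields[src][slot] = target
        if target is not None:
            self.referrers[target] += 1
        if old is not None:
            self.dirty = True  # a pointer referencing an object has changed
            self.referrers[old] -= 1
            if self.referrers[old] == 0:
                # The overwritten pointer was the only reference: delete at once.
                self._delete(old)

    def _delete(self, name):
        # Children are not deleted here; the next traversal deals with them.
        for target in self.fields.pop(name).values():
            if target is not None:
                self.referrers[target] -= 1
        del self.referrers[name]


heap = Heap()
heap.new("root", root=True)
for name in ("a", "b", "shared"):
    heap.new(name)
heap.set_pointer("root", "x", "a")
heap.set_pointer("root", "y", "shared")
heap.set_pointer("a", "s", "shared")
dirty_before = heap.dirty

heap.set_pointer("root", "x", "b")  # overwrites the only pointer to "a"
print(sorted(heap.fields))          # ['b', 'root', 'shared']
```

While `dirty` remains false in this sketch, no pointer has been overwritten and a traversal by the first system could be skipped.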
According to another aspect of the present invention, there is provided a data processing apparatus comprising a data processor coupled with a random access memory containing data structures comprising data objects linked by identifying pointers, the processor being configured to provide the following for operating on the stored plurality of data objects: first means for traversing the data structures to identify those objects to which no references are made by the pointers of other objects, and for reclaiming the memory allocated to those objects to which no references are made; and second means for determining which objects are not descendants of root objects and for reclaiming the memory allocated to those objects, wherein cycles of the first means are interleaved with cycles of the second means.
In the present invention, repeated sweeps of heap memory are performed using alternating complementary garbage collection methods to improve the efficiency of garbage collection, utilising the benefits of each method whilst avoiding their inherent weaknesses. Advantageously, the garbage collection method of the present invention identifies finaliser-accessible objects.
More advantageously, the finalisable objects identified are topologically ordered and their finalisers are executed in that order, to avoid having to repeatedly process a finalisable object and its descendants which are descendants of another finalisable object.
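A topological ordering of finalisable objects may be sketched with the Python standard library, as follows. The direction of the ordering is an assumption of this sketch (a finalisable object is placed before any finalisable object descended from it), as are the toy heap and the helper `descendants`; finalisable objects that reference each other in a loop would cause `graphlib` to raise `CycleError` and are not handled here.

```python
from graphlib import TopologicalSorter


def descendants(heap, start):
    """Return every object reachable from start, excluding start itself
    unless it is reachable through a loop."""
    seen, stack = set(), list(heap[start])
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(heap[name])
    return seen


heap = {
    "f1": ["x"],
    "x": ["f2"],  # f2 is a descendant of f1 via x
    "f2": ["y"],
    "y": [],
    "f3": [],     # unrelated finalisable object
}
finalisable = {"f1", "f2", "f3"}

sorter = TopologicalSorter()
for parent in sorted(finalisable):
    sorter.add(parent)
    for child in descendants(heap, parent) & finalisable:
        if child != parent:
            # Assumed direction: the ancestor precedes its finalisable descendant.
            sorter.add(child, parent)

order = list(sorter.static_order())
print(order)
```

With the ordering in hand, each finalisable object and its descendants need be visited only once in this sketch, since a descendant finalisable object is never encountered before its ancestor.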
Furthermore, the method of the present invention is composed of a number of simple steps, thereby permitting fine-grained incremental implementations of the garbage collector.