The present invention relates to a memory reclamation method, in particular one in which conflicting deletion attempts for a stored data object may be made.
Garbage collection is the automated reclamation of system memory space after its last use by a program. A number of examples of garbage collecting techniques are discussed in "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by R. Jones et al, pub. John Wiley and Sons 1996, ISBN 0-471-94148-4, at pages 1 to 18, and "Uniprocessor Garbage Collection Techniques" by P. R. Wilson, Proceedings of the 1992 International Workshop on Memory Management, St. Malo, France, September 1992. Whilst the storage requirements of many computer programs are simple and predictable, with memory allocation and recovery being handled by the programmer or a compiler, there is a trend toward functional languages having more complex patterns of execution such that the lifetimes of particular data structures can no longer be determined prior to run-time and hence automated reclamation of this storage, as the program runs, is essential.
A common feature of a number of garbage collection and reclamation techniques, as described in the above-mentioned Wilson reference, is incrementally traversing the data structure formed by referencing pointers carried by separately stored data objects. The technique involves first marking all stored objects that are still reachable from other stored objects or from external locations, by tracing a path or paths through the pointers linking data objects.
This may be followed by sweeping or compacting the memory, that is to say, examining every object stored in the memory to determine the unmarked objects whose space may then be reclaimed.
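By way of illustration only, the marking and sweeping stages described above can be sketched as follows. This is a minimal sketch in Python and not the implementation of either cited reference; the names `Obj`, `mark` and `sweep`, and the use of a list to stand for the heap, are assumptions made for the example.

```python
class Obj:
    """A stored data object carrying pointers to other stored objects."""
    def __init__(self, name):
        self.name = name
        self.pointers = []   # outgoing references to other objects
        self.marked = False  # set during the marking stage


def mark(roots):
    """Mark every object reachable from the roots by following pointers."""
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if not obj.marked:
            obj.marked = True
            pending.extend(obj.pointers)


def sweep(heap):
    """Examine every object; keep the marked ones and reclaim the rest."""
    survivors = [o for o in heap if o.marked]
    reclaimed = [o.name for o in heap if not o.marked]
    for o in survivors:
        o.marked = False     # reset ready for the next collection cycle
    return survivors, reclaimed


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.pointers.append(b)         # a -> b
c.pointers.append(d)         # c -> d, but nothing references c
heap = [a, b, c, d]
mark([a])                    # a is the only externally referenced object
heap, reclaimed = sweep(heap)
```

In this example objects `c` and `d` are never marked, since no path of pointers leads to them from `a`, and so their space is the space identified for reclamation.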
Normally, the garbage collection and reclamation process runs on the computer in parallel with a program process, the garbage collection and reclamation process operating on the heap (memory area) occupied by data objects of the program process, so that garbage from the program process can be detected as soon as possible and the appropriate resources reclaimed. A result of the two processes running in parallel and operating on the same memory area is that they could both be operating on the same data objects at the same time. In the event of only a single processing thread being available to the two processes, steps of the garbage collection and reclamation process are interleaved with steps of the program process. A single step of the garbage collection and reclamation process may not necessarily completely process a data object and the data objects referenced by its pointers.
Many computer programming languages offer functions for manual reclamation of memory used by data objects. As opposed to automatic garbage collection, to manually reclaim memory used by a data object a programmer must explicitly execute an appropriate function from within the program that created the data object. Indeed, a popular feature of object-oriented programming languages such as C++ is the ability to define a destructor method for an object class. A destructor method is a function written by a programmer specifically for an object class. The default operation performed by a destructor method is to delete the object it is associated with from the heap. The programmer can include calls to other functions from within a destructor method so that when the destructor method is executed for a data object created from the object class, resources held by the data object can be reclaimed, such as file handles and other objects referenced by pointers from the object, before the data object is automatically deleted from the computer's memory heap. This method is particularly popular for reclamation of linked list and tree data structures where destructors for each object in the linked list can be recursively called until the end of the list is reached, thereby reclaiming the whole linked list from a single destructor method call. In this manner, a programmer, knowing that a program has finished with a data object, could execute the destructor method from the program so that the object and its resources can be reclaimed in an orderly fashion as soon as the object is finished with.
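The recursive reclamation of a linked list from a single destructor call, as described above, might be sketched as follows. Python stands in for C++ here, so the destructor is modelled by an explicit `destroy` method; the `Node` class, its `resource` field and the `released` log are all hypothetical names introduced for the example.

```python
released = []  # hypothetical log of reclaimed resources, in order of release


class Node:
    """Linked-list node holding a resource and a pointer to the next node."""
    def __init__(self, resource, next_node=None):
        self.resource = resource
        self.next = next_node

    def destroy(self):
        """Stand-in for a destructor: release own resource, then the rest of the list."""
        released.append(self.resource)   # e.g. closing a file handle
        if self.next is not None:
            self.next.destroy()          # recurse until the end of the list
            self.next = None


head = Node("h1", Node("h2", Node("h3")))
head.destroy()   # a single call reclaims the whole list
```

After the single call on `head`, the resources of all three nodes have been released in list order.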
It will be apparent, however, that such manual memory reclamation methods conflict with the garbage collection and reclamation methods described above. If the program process manually reclaims the resources held by a series of linked data objects while a garbage collection and reclamation process is part way through processing them, then the next time the garbage collection and reclamation process resumes it will attempt to process a data object that no longer exists, resulting in an error state being generated and potentially a system failure.
In Java (© Sun Microsystems Inc.) virtual machines there is no provision for manual memory reclamation methods. All memory reclamation must be performed by an automatic garbage collector. However, as Java (© Sun Microsystems Inc.) supports multiple processing and different garbage collection mechanisms have different strengths and weaknesses (for example reference counting can quickly identify 0-referenced data objects as garbage but cannot identify cyclic data structures as garbage, whilst the mark-sweep algorithm can identify most types of garbage but takes much longer to identify it), it is desirable that a number of automatic garbage collectors operate concurrently on the heap occupied by data objects. In order for garbage collectors to operate concurrently they must have some conflict resolution mechanism in the event that one garbage collector tries to reclaim a data object that another garbage collector is currently processing.
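The weakness of reference counting noted above, namely that it cannot identify cyclic data structures as garbage, can be illustrated with a toy model. This is a sketch under stated assumptions and does not depict any actual virtual machine; the `RcObj` class and the helpers `add_ref` and `drop_ref` are hypothetical.

```python
class RcObj:
    """Object with an explicit reference count (toy model)."""
    def __init__(self, name):
        self.name = name
        self.count = 0
        self.pointers = []


freed = []  # names of objects identified as garbage, in order


def add_ref(target):
    target.count += 1


def drop_ref(target):
    """Remove one reference; reclaim the object as soon as its count reaches 0."""
    target.count -= 1
    if target.count == 0:
        freed.append(target.name)
        for child in target.pointers:
            drop_ref(child)
        target.pointers = []


# Acyclic case: one external reference to x, later dropped.
x = RcObj("x")
add_ref(x)
drop_ref(x)                        # count reaches 0: reclaimed at once

# Cyclic case: p -> q -> p, plus one external reference to p.
p, q = RcObj("p"), RcObj("q")
add_ref(p)                         # external reference
p.pointers.append(q); add_ref(q)
q.pointers.append(p); add_ref(p)
drop_ref(p)                        # external reference dropped; count stays at 1
```

Object `x` is reclaimed immediately, whereas `p` and `q` keep each other's counts at 1 and are never identified as garbage by counting alone; a tracing collector operating on the same heap would be needed to find them.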
Before arriving at the solution of the present invention, a number of other possible solutions were explored. Data objects that are currently being processed by the garbage collection and reclamation process can be marked, such that before a manual reclamation operation, or before processing of the data object by another garbage collection and reclamation process, the mark is checked for and the operation can be aborted if the data object is found to be marked. However, by aborting the operation, reclamation of the resources held by the data object becomes reliant on the garbage collection and reclamation process recognising them as garbage and reclaiming them in the future. This reclamation may be at some indefinite time in the future or, if the program process is permanently running and keeps a valid pointer referencing the data object, the data object may never be recognised as garbage and would therefore never be reclaimed.
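The check-and-abort approach just described, and its drawback, can be sketched as follows. This is an illustration of the rejected alternative only; the `in_progress` flag and the function `try_manual_delete` are hypothetical names.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.in_progress = False  # set by the collector while it processes the object


def try_manual_delete(heap, obj):
    """Abort if the collector has marked the object; otherwise delete it."""
    if obj.in_progress:
        return False       # aborted: reclamation now depends on the collector
    heap.remove(obj)
    return True


heap = [Obj("a"), Obj("b")]
a, b = heap
a.in_progress = True                    # collector is part way through "a"
deleted_a = try_manual_delete(heap, a)  # aborted
deleted_b = try_manual_delete(heap, b)  # succeeds
```

Object `a` remains on the heap after the aborted request, and because the program still holds the pointer `a`, a tracing collector would continue to regard it as reachable.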
Alternatively the garbage collection and reclamation process could place a mutual exclusion lock (by a method such as semaphores) on the data object(s) it is processing. Such a lock, however, prevents the program process from accessing and manipulating the data objects and would at least halt the execution of the program process at the point it requires access to the data object(s) until the lock(s) are released and could cause the program process to fail.
According to the present invention, there is provided a method of reclaiming memory space allocated to a data structure comprising data objects linked by identifying pointers, in which the memory allocated to data objects is reclaimed using two systems:
a first system, by which a selected part of the data structure is traversed by following the pointers, one of at least two identifiers being allocated to the data objects, a first identifier which indicates that the data object has been traversed so that the data objects referenced by the pointers of that data object have been identified, and a second identifier which indicates that the data object is referenced by a pointer, but the data object has not yet been traversed; and
a second system, by which an individual data object is selected for deletion to enable the associated memory space to be reclaimed,
wherein the second system reads the first system identifier for the individual data object, and if the first identifier is present deletes the data object thereby reclaiming the associated memory space, and if the second identifier is present, allocates a third identifier, and wherein the first system operates to reclaim the memory space allocated to data objects having the third identifier.
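The interaction of the two systems set out above may be easier to follow from a small sketch. The following is a minimal illustration under stated assumptions, not a definitive implementation of the method: the identifier names `TRAVERSED`, `PENDING` and `DELETE_REQUESTED`, the functions `request_delete` and `collector_step`, and the list-based heap and worklist are all introduced for the example. The behaviour of the second system for an object carrying no identifier is left outside the sketch.

```python
from enum import Enum


class Ident(Enum):
    TRAVERSED = 1         # first identifier: the object's pointers have been followed
    PENDING = 2           # second identifier: referenced, but not yet traversed
    DELETE_REQUESTED = 3  # third identifier: deletion deferred to the first system


class Obj:
    def __init__(self, name):
        self.name = name
        self.pointers = []
        self.ident = None  # no identifier allocated yet


def request_delete(heap, obj):
    """Second system: delete now, or defer by allocating the third identifier."""
    if obj.ident is Ident.TRAVERSED:
        heap.remove(obj)                     # first system has finished with it
    elif obj.ident is Ident.PENDING:
        obj.ident = Ident.DELETE_REQUESTED   # first system will reclaim it later
    # Any other case is outside the scope of this sketch.


def collector_step(heap, worklist):
    """First system: process one object from the worklist, then return."""
    obj = worklist.pop(0)
    if obj.ident is Ident.DELETE_REQUESTED:
        heap.remove(obj)                     # reclaim on behalf of the second system
        return
    for child in obj.pointers:
        if child.ident is None:
            child.ident = Ident.PENDING
            worklist.append(child)
    obj.ident = Ident.TRAVERSED


a, b = Obj("a"), Obj("b")
a.pointers.append(b)
heap = [a, b]
a.ident = Ident.PENDING
worklist = [a]

collector_step(heap, worklist)   # a becomes TRAVERSED, b becomes PENDING
request_delete(heap, b)          # b is PENDING, so deletion is deferred
after_request = [o.name for o in heap]
collector_step(heap, worklist)   # first system reaches b and reclaims it
after_collect = [o.name for o in heap]
request_delete(heap, a)          # a is TRAVERSED, so it is deleted immediately
```

In this run the request to delete `b` does not remove it while the first system still has it queued for traversal; `b` is reclaimed only when the first system reaches it, whereas `a`, already traversed, is deleted at once.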
An advantage of the present invention is that the second system does not delete the data object if the first system has not finished traversing it, but adds a marker so that deletion will only take place when the first system is ready.
Preferably, the first system comprises an automatic garbage collection system. The invention thus eliminates the conflict between automatic garbage collection and manual memory reclamation. Preferably, the first system also operates to reclaim the memory space allocated to data objects having no identifier. The second system may be the manual deletion of data objects. This may take the form of deletion of data objects by the program that created them when the objects are no longer needed. Alternatively and preferably, the second system comprises an automatic garbage collection system. The invention thus eliminates the conflict between the two automatic garbage collection systems. Preferably, the second system also reclaims memory space allocated to data objects referenced by pointers from the data object having the third identifier.
The invention also provides a data processing apparatus for implementing the method of the invention.