The present invention relates to a method and apparatus for handling stored data and particularly, but not exclusively, to memory compaction and garbage collection procedures executing in real time in real or virtual memory space of a data processing apparatus.
Garbage collection is the automated reclamation of system memory space after its last use by a programme. A number of examples of garbage collecting techniques are discussed in "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by R. Jones et al, pub. John Wiley and Sons 1996, ISBN 0-471-94148-4, at pages 1 to 18. Whilst the storage requirements of many computer programs are simple and predictable, with memory allocation and recovery being handled by the programmer or a compiler, there is a trend toward functional languages having more complex patterns of execution such that the lifetimes of particular data structures can no longer be determined prior to run-time and hence automated reclamation of this storage, as the program runs, is essential.
One known technique for garbage collection is through reference counting: each stored data object contains a field holding a count of the number of other objects that reference it. An object with a reference count of zero is unreferenced by any other object and is hence available for deletion by the memory management system. When an object becomes referenced from another object, the count is incremented. When a reference to the object is known to be cleared, the count may be decremented, and when the count is decremented to zero the object may be immediately deleted. The count will have an upper limit, at which the count may become fixed (i.e. no longer decrement), with further changes due to manipulations of references not being recorded and the object becoming permanently locked against deletion, even when it legitimately becomes garbage.
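By way of illustration only, the behaviour of a count that becomes fixed at its upper limit may be sketched as follows. The names `MAX_COUNT`, `Obj`, `add_ref` and `drop_ref` are hypothetical, and the limit of 3 is an assumed value standing in for a narrow count field; nothing here is taken from a particular implementation.

```python
MAX_COUNT = 3  # assumed upper limit, e.g. a two-bit count field


class Obj:
    def __init__(self, name):
        self.name = name
        self.count = 0        # number of recorded references
        self.deleted = False


def add_ref(obj):
    # The count is incremented until it reaches the upper limit.
    if obj.count < MAX_COUNT:
        obj.count += 1


def drop_ref(obj):
    # A count at the limit is fixed: further changes are not recorded.
    if obj.count == MAX_COUNT or obj.count == 0:
        return
    obj.count -= 1
    if obj.count == 0:
        obj.deleted = True    # may be deleted immediately


a = Obj("a")
add_ref(a)
drop_ref(a)                   # count 1 -> 0, object deleted

b = Obj("b")
for _ in range(4):
    add_ref(b)                # count saturates at MAX_COUNT
for _ in range(4):
    drop_ref(b)               # every reference cleared, count unchanged
```

In this sketch object `b` has become garbage but remains locked against deletion, which is the failure that the regenerative scheme is intended to recover from.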
A refinement to the above is a regenerative reference count system which uses a mark-sweep algorithm to mark all referenced objects in the heap. Before the mark stage, all object counts are zero. Each time an object is reached by the marking process, the count is incremented. With the reset to zero for each sweep, the problems encountered at upper limits for the count are no longer a permanent issue and can be recovered from. However, both variations require interruption to the system following marking whilst the objects with reference counts of zero are deleted or reclaimed. An improvement in terms of system performance comes from pipelining the marking and sweeping/deletion operations such as to greatly reduce the time that the system operation must be halted for sweeping of the heap. However, because a reference count must be constructed by the mark process before it can be used by the sweeper, this approach is incompatible with a reference count determination as to whether or not an object can be deleted.
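A non-concurrent regenerative count may be sketched as below, assuming a heap held as a simple list and objects that carry an explicit list of outgoing pointers. The names `Obj`, `mark` and `sweep` are hypothetical. The sketch resets every count, increments a count each time the marking process reaches an object, and only then sweeps, which is the interruption referred to above.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []   # outgoing pointers to other objects
        self.count = 0


def mark(roots, heap):
    # Before the mark stage all object counts are zero.
    for obj in heap:
        obj.count = 0
    pending = list(roots)
    while pending:
        obj = pending.pop()
        obj.count += 1               # incremented each time the object is reached
        if obj.count == 1:           # first visit: follow its outgoing pointers
            pending.extend(obj.refs)


def sweep(heap):
    # Runs only after marking has finished; zero-count objects are reclaimed.
    return [obj for obj in heap if obj.count > 0]


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs = [b, c]
b.refs = [c]
heap = [a, b, c, d]

mark([a], heap)        # a is the only root
heap = sweep(heap)     # d was never reached and is removed
```

Because `sweep` relies on the counts being complete, it cannot start until `mark` has returned; a sweeper running alongside the marker would see zero counts on objects that simply have not been reached yet.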
It is therefore an object of the present invention to provide a garbage collection mechanism which provides the benefits of a regenerative reference counting mechanism whilst also having the benefits of a concurrent mark-sweep arrangement.
In accordance with the present invention there is provided a method of garbage collection for use in data processing apparatus wherein the memory contains a plurality of data objects, each said data object being at a respective known location within the memory and being accessed via respective pointers carried by memory stacks, the method comprising periodically traversing the memory to mark, for each object, a count of the number of extant pointers thereto from any source and, on detection that an object's count has reached zero, deleting that object; characterised in that the operations of marking and of deleting objects proceed concurrently and, for each object, separate counts are maintained during a traversal of the number of pointers detected during the ongoing mark traversal and the total number of pointers detected during the previous mark traversal, wherein an object is not deleted if either count holds a non-zero value. By maintaining the recorded count from a preceding traversal, and bringing this into the decision as to whether or not an object may be deleted, concurrency of marking and sweeping becomes possible, as will be described in greater detail with reference to embodiments of the invention hereinafter.
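The deletion test set out above may be illustrated by the following minimal sketch. The names `Obj`, `current`, `previous` and `may_delete` are hypothetical labels for the two counts and the test applied by the sweeper; they are not taken from the embodiments.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    current: int = 0    # pointers detected so far in the ongoing mark traversal
    previous: int = 0   # total pointers detected in the previous mark traversal


def may_delete(obj):
    # An object is not deleted if either count holds a non-zero value.
    return obj.current == 0 and obj.previous == 0


# Not yet reached by the ongoing traversal, but referenced during the last one.
not_yet_marked = Obj("x", current=0, previous=2)

# Unreferenced in the previous traversal and still unreferenced in this one.
garbage = Obj("y", current=0, previous=0)

survivors = [o.name for o in (not_yet_marked, garbage) if not may_delete(o)]
```

The first object shows why concurrency becomes possible: a sweeper that reaches it before the marker does still finds a non-zero count and leaves it in place.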
Suitably, each data object maintains a pair of fields into one of which the number of pointers detected during the ongoing mark traversal is written and into the other of which the total number of pointers detected during the previous mark traversal is written, with the mapping of values for the number of pointers detected during the ongoing mark traversal preferably being alternated between fields following each traversal. Alternating the mapping simplifies matters as it is not necessary to move the stored pointer count data for every object.
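One way of alternating the mapping is sketched below, assuming a single heap-wide index that selects which of the two fields receives counts for the ongoing traversal. The names `Heap`, `cur`, `record_pointer`, `end_traversal` and `previous_count` are hypothetical. The text does not state when the field that is about to be reused is cleared; the eager clearing shown here is an assumption made to keep the sketch short.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.counts = [0, 0]   # the pair of count fields


class Heap:
    def __init__(self):
        self.cur = 0           # index of the field used by the ongoing traversal
        self.objects = []


def record_pointer(heap, obj):
    # Called by the marker each time a pointer to obj is detected.
    obj.counts[heap.cur] += 1


def previous_count(heap, obj):
    # The other field holds the total from the previous traversal.
    return obj.counts[heap.cur ^ 1]


def end_traversal(heap):
    # Swap the roles of the two fields; no count value is copied.
    heap.cur ^= 1
    for obj in heap.objects:
        obj.counts[heap.cur] = 0   # assumption: stale field cleared eagerly


heap = Heap()
x = Obj("x")
heap.objects.append(x)

record_pointer(heap, x)
record_pointer(heap, x)
end_traversal(heap)
after_first = previous_count(heap, x)    # total from the first traversal

record_pointer(heap, x)
end_traversal(heap)
after_second = previous_count(heap, x)   # total from the second traversal
```

Only the index changes at the end of a traversal, so the completed count stays where it was written and is simply reinterpreted as the previous total.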
In addition to the reference counts, each object preferably further maintains a mark state indicator identifying, for that data object, whether it has been checked during mark traversal and, if so, whether any pointers to the object have been detected. Such a mark state indicator may alternatively indicate whether the data object is available for deletion by a sweep utility periodically traversing the memory, such an indication overriding the effects of any count value settings. This feature would allow the instant deletion of those objects identified as deletable, without the need for one or two mark traversals to occur to reduce the stored reference counts to zero. In operation, detection at any time of the establishment of a new pointer to an object causes the stored total for pointers detected during a previous sweep to be increased by one, whereas detection of the removal of an existing pointer to an object causes the stored total for pointers detected during a previous sweep to be reduced by one, and the object to be marked available for (immediate) deletion if said stored total drops to zero.
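The interaction between the mark state indicator and the pointer updates may be sketched as follows. The enumeration `MarkState` and the functions `pointer_added`, `pointer_removed` and `sweeper_may_delete` are hypothetical; in particular, the encoding of the indicator as four states is an assumption, since the text describes what the indicator conveys rather than how it is stored.

```python
from enum import Enum


class MarkState(Enum):
    UNCHECKED = 0                # not yet checked in the mark traversal
    CHECKED_NO_POINTERS = 1      # checked, no pointers detected
    CHECKED_POINTERS_FOUND = 2   # checked, at least one pointer detected
    DELETABLE = 3                # available for deletion by the sweep utility


class Obj:
    def __init__(self, previous=0, current=0):
        self.previous = previous   # stored total from the previous traversal
        self.current = current     # count for the ongoing traversal
        self.state = MarkState.UNCHECKED


def pointer_added(obj):
    # Establishment of a new pointer increases the stored total by one.
    obj.previous += 1


def pointer_removed(obj):
    # Removal of a pointer reduces the stored total by one; at zero the
    # object is marked available for immediate deletion.
    obj.previous -= 1
    if obj.previous == 0:
        obj.state = MarkState.DELETABLE


def sweeper_may_delete(obj):
    if obj.state is MarkState.DELETABLE:
        return True                # overrides any count value settings
    return obj.previous == 0 and obj.current == 0


# Already counted once by the ongoing traversal when its only pointer is removed.
o = Obj(previous=1, current=1)
before = sweeper_may_delete(o)
pointer_removed(o)
after = sweeper_may_delete(o)
```

Without the indicator, the non-zero ongoing count in this example would keep the object alive until later traversals had reduced both counts to zero.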
Also in accordance with the present invention there is provided a data processing apparatus comprising a processor coupled with a random access memory containing a plurality of stored data objects, each said data object being at a respective known location within the memory and being accessed via respective pointers carried by memory stacks, the processor being configured to periodically sweep the memory and to mark in memory, for each object, a count of the number of extant pointers thereto from any source, and to detect when an object's count has reached zero, and delete that object; characterised in that the processor is configured to implement the operations of marking and of deleting objects concurrently and, for each object, to maintain in respective storage areas separate counts during a sweep of the number of pointers detected during the ongoing mark traversal and the total number of pointers detected during the previous mark traversal, wherein an object's count is taken by the processor as zero when both present and preceding sweep counts are zero. As above, the processor's writing of values for the number of pointers detected during the ongoing mark traversal is preferably alternated between the said respective storage areas (data fields for the respective counts) following each traversal.