The present invention relates to a method and apparatus for handling stored data and particularly, but not exclusively, to memory compaction and garbage collection in real or virtual memory space of a data processing apparatus.
Garbage collection is the automated reclamation of system memory space after its last use by a program. A number of examples of garbage collecting techniques are discussed in "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by R. Jones et al, pub. John Wiley and Sons 1996, ISBN 0-471-94148-4, at pages 1 to 18, and "Uniprocessor Garbage Collection Techniques" by P. R. Wilson, Proceedings of the 1992 International Workshop on Memory Management, St. Malo, France, September 1992. Whilst the storage requirements of many computer programs are simple and predictable, with memory allocation and recovery being handled by the programmer or a compiler, there is a trend toward languages having more complex patterns of execution such that the lifetimes of particular data structures can no longer be determined prior to run-time and hence automated reclamation of this storage, as the program runs, is essential.
One particular class of garbage collection/memory reclamation techniques, as described in the above-mentioned Wilson reference, is mark-sweep collection. In common with many garbage collection techniques it is a two-stage procedure and, as its name suggests, it involves first marking all stored objects that are still reachable by tracing a path or paths through the pointers linking data objects, and then sweeping the memory, that is to say, examining every object stored in the memory to determine the unmarked objects whose space may then be reclaimed. In other techniques, such as mark-compact and copying collection, the stored data objects are moved around in memory to form contiguous areas of "live" objects and garbage, with the garbage area being freed for overwriting.
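By way of illustration only, the two stages of mark-sweep collection may be sketched as follows. This is a minimal sketch under stated assumptions: the names `Obj`, `mark`, `sweep` and `mark_sweep` are hypothetical, the heap is modelled as a simple list, and reclamation is modelled by dropping unmarked objects from that list rather than by returning memory to an allocator.

```python
class Obj:
    """Hypothetical heap object holding pointers to other heap objects."""
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing pointers to other objects
        self.marked = False  # mark bit, set during the mark stage


def mark(roots):
    """Mark stage: trace every object reachable from the root pointers."""
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if not obj.marked:
            obj.marked = True
            pending.extend(obj.refs)


def sweep(heap):
    """Sweep stage: examine every object, keeping only the marked ones."""
    live = []
    for obj in heap:
        if obj.marked:
            obj.marked = False  # reset the mark bit for the next cycle
            live.append(obj)
    return live


def mark_sweep(heap, roots):
    mark(roots)
    return sweep(heap)


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)  # a points to b
c.refs.append(d)  # c points to d, but nothing reachable points to c
heap = mark_sweep([a, b, c, d], roots=[a])
live_names = [o.name for o in heap]
```

In this example only `a` and `b` are reachable from the root, so `c` and `d` are treated as garbage. Note that the sweep stage visits every object in the heap, which is the cost that grows with heap size.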
In many cases, garbage collection is a system-wide task which operates on a single global heap, that is to say a single memory area where data structures or objects are stored in no specific order, only with regard to whether a particular space is large enough to hold a particular object. Many languages have no concept of local storage of objects and therefore the global heap will be used for many short-lived data objects, for example those which are local to a single thread. As the same garbage collection or data object sorting techniques are typically applied to this category of data as to longer term data shared between threads, overall collection times may become very long and the load for processing this local data is transferred to the system-wide garbage collection process.
It is an object of the present invention to provide a means whereby the efficiency may be increased by distributing the processing load typically involved in garbage collection in a multi-threading environment.
In accordance with the present invention there is provided a data processing apparatus for handling multi-thread programs, the apparatus comprising a data processor coupled with a random-access memory containing a plurality of data objects, each said data object being at a respective known location within the memory and being accessed via respective pointers carried by memory stacks associated with respective threads, the apparatus being configured to periodically determine those data objects in the random-access memory having no extant pointers thereto from any source and to delete the same; characterised in that the apparatus further comprises a plurality of reference buffers, each assigned to a respective memory stack frame, each reference buffer holding pointers to each data object referred to by the respective stack frame, the apparatus being configured to clear, at the conclusion of each thread memory stack frame, the associated reference buffer and each referenced data object having no pointers thereto in any other reference buffer.
Through the use of reference buffers for each thread, those data objects referred to only by the one thread may be deleted as soon as the relevant thread memory stack section (stack frame) has cleared. In this way, these singly referenced objects may be garbage collected on a "local" basis rather than congesting a global garbage collection. There is one exception to this, where pointers remain in other data objects even after all those from the stack have been cleared. To provide for this, each stored data object may include a so-called global flag set by the presence of a pointer to the data object from another data object, with the apparatus being further configured to exclude from clearance any data object having its global flag set.
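The clearance of a reference buffer at the conclusion of a stack frame, including the global flag exception, may be sketched as follows. This is an illustrative sketch only and not the claimed implementation: the class names `DataObject`, `ReferenceBuffer` and `Heap`, and the methods `push_frame` and `pop_frame`, are hypothetical, and the check against other buffers is done here by a simple search rather than by any particular counting scheme.

```python
class DataObject:
    def __init__(self, name):
        self.name = name
        self.global_flag = False  # set when another data object points here


class ReferenceBuffer:
    """One buffer per stack frame, holding pointers to referenced objects."""
    def __init__(self):
        self.pointers = []

    def add(self, obj):
        self.pointers.append(obj)


class Heap:
    def __init__(self):
        self.objects = set()
        self.buffers = []  # every reference buffer currently in existence

    def allocate(self, name):
        obj = DataObject(name)
        self.objects.add(obj)
        return obj

    def push_frame(self):
        buf = ReferenceBuffer()
        self.buffers.append(buf)
        return buf

    def pop_frame(self, buf):
        """Clear the buffer and delete objects that only it referenced."""
        self.buffers.remove(buf)
        for obj in buf.pointers:
            if obj.global_flag:
                continue  # left for the global garbage collection
            if any(obj in other.pointers for other in self.buffers):
                continue  # still referenced from another stack frame
            self.objects.discard(obj)
        buf.pointers.clear()


heap = Heap()
frame1, frame2 = heap.push_frame(), heap.push_frame()

local = heap.allocate("local")    # referenced by frame1 only
shared = heap.allocate("shared")  # referenced by both frames
linked = heap.allocate("linked")  # also pointed to by another object
frame1.add(local)
frame1.add(shared)
frame2.add(shared)
frame1.add(linked)
linked.global_flag = True

heap.pop_frame(frame1)
remaining = sorted(o.name for o in heap.objects)
```

When `frame1` concludes, only `local` is deleted: `shared` is still held by `frame2`, and `linked` is excluded from clearance because its global flag is set.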
In an embodiment to be described, an additional data store holding a handle table may be provided, with each referenced data object containing a pointer to a handle table entry, and each handle table entry holding a pointer to the location within the random access memory of the respective data object. With such a handle table, the apparatus may further comprise means operable to determine the number of pointers from reference buffers to each data object and to store this number as a reference count with the entry for that data object in the handle table. Alternatively, means operable to determine the number of pointers from reference buffers to each data object may be provided, but with a further data store holding this number as a reference count entry for the respective data object; in this latter case the handle table pointer to the data object location may be comprised of a pointer to the further data store reference count entry and a further pointer from that entry to the data object in the random access memory.
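The first handle table arrangement described above, in which the reference count is stored with the handle table entry, may be sketched as follows. All names here (`HandleEntry`, `allocate`, `add_reference`, `clear_buffer`) are hypothetical, memory is modelled as a dictionary keyed by address, and the global flag is omitted for brevity.

```python
class HandleEntry:
    def __init__(self, address):
        self.address = address  # pointer to the object's location in memory
        self.ref_count = 0      # number of pointers from reference buffers


class DataObject:
    def __init__(self, payload, handle):
        self.payload = payload
        self.handle = handle    # pointer back to the handle table entry


memory = {}        # address -> DataObject (stand-in for random access memory)
handle_table = []  # list of HandleEntry


def allocate(address, payload):
    entry = HandleEntry(address)
    handle_table.append(entry)
    memory[address] = DataObject(payload, entry)
    return entry


def add_reference(buffer, entry):
    """Record a pointer in a reference buffer and update the count."""
    buffer.append(entry)
    entry.ref_count += 1


def clear_buffer(buffer):
    """Drop the buffer's pointers; delete objects whose count reaches zero."""
    for entry in buffer:
        entry.ref_count -= 1
        if entry.ref_count == 0:
            del memory[entry.address]
            handle_table.remove(entry)
    buffer.clear()


h1 = allocate(0x10, "x")
h2 = allocate(0x20, "y")
buf_a, buf_b = [], []
add_reference(buf_a, h1)
add_reference(buf_a, h2)
add_reference(buf_b, h2)

clear_buffer(buf_a)  # h1 drops to zero and is deleted; h2 survives
```

Keeping the count in the handle table entry means that clearing a buffer need not search the other buffers: a count of zero is sufficient to show that no reference buffer still points to the object.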
In an alternative configuration, the stored data objects may be kept relatively simple, that is to say without storing a pointer to a handle table entry, or any other pointers. In such a configuration, the link to the handle table entry may suitably be provided by a further pointer from the respective entry in the or each reference buffer.
The apparatus preferably includes means arranged to periodically compact the random access memory contents by moving the undeleted data objects: to avoid disturbing objects that may be required by other threads, the compaction means preferably leaves unmoved any data object with an associated reference count value greater than zero. To indicate this to the compactor, each stored data object may suitably include a lock flag which, when set, indicates a reference count value greater than zero. In a further alternative, the lock flag may instead be held by the handle table, in order to keep the size of each data object to the minimum. Further compaction may be provided if each reference buffer is of a predetermined capacity, with the apparatus further comprising means operable to detect when a reference buffer reaches fullness and arranged to perform garbage clearance for the buffer prior to conclusion of the thread memory stack frame.
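Compaction that leaves locked objects unmoved may be sketched as follows. This is a simplified sketch under stated assumptions: every object is taken to occupy exactly one memory slot, the lock flag is held in the object itself, and the names `DataObject`, `compact` and `handle_table` are hypothetical. A practical compactor would also have to deal with objects of differing sizes.

```python
class DataObject:
    def __init__(self, name, locked=False):
        self.name = name
        self.locked = locked  # lock flag: set while the reference count is > 0


def compact(memory, handle_table):
    """Slide unlocked objects toward address 0, leaving locked ones in place."""
    free = 0  # every slot below this index is known to be occupied
    for addr in range(len(memory)):
        obj = memory[addr]
        if obj is None or obj.locked:
            continue
        # Advance to the lowest free slot below the current address, if any.
        while free < addr and memory[free] is not None:
            free += 1
        if free < addr:
            memory[free] = obj
            memory[addr] = None
            handle_table[obj.name] = free  # update the pointer to the object


a = DataObject("A")
b = DataObject("B", locked=True)  # in use by a thread, must not move
c = DataObject("C")
memory = [None, a, None, b, c, None]
handle_table = {"A": 1, "B": 3, "C": 4}

compact(memory, handle_table)
layout = [o.name if o else None for o in memory]
```

After compaction `A` and `C` occupy the two lowest slots while `B` remains at its original address, so a thread holding a direct pointer to `B` is not disturbed. Only the handle table entries of the moved objects need updating.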
In a further embodiment to be described, a further data store may be provided holding, for each thread, a respective thread reference table holding individual entries respectively marking each object referenced by the thread. With such a thread reference table, each reference buffer suitably holds, for each referenced data object, a pointer to the respective thread table entry. With such an arrangement, the above-described functionality of the reference structures is split into the reference buffer per stack frame and thread table per thread. This arrangement acts as an interface to a stack for garbage collection purposes, supporting low-overhead reference counting and removing the need for conservative scanning of the stack.
Also in accordance with the present invention there is provided a method of memory management for use in data processing apparatuses handling multi-thread programs, wherein the memory contains a plurality of data objects, each said data object being at a respective known location within the memory and being accessed via respective pointers carried by memory stacks associated with respective threads, the method comprising periodically determining those data objects in the random-access memory having no extant pointers thereto from any source and deleting the same; characterised in that, for each memory stack, reference pointers are generated for each data object referred to by the respective stack and, at the conclusion of handling of each thread memory stack frame, the associated reference pointers and each referenced data object having no other reference pointers thereto are deleted. Further features of the present invention are described in the attached claims.