The present invention relates to a method and apparatus for scheduling garbage collection instructions for execution with instructions of other processes, and particularly to the scheduling of garbage collection instructions for processors having instruction level parallelism.
The current generation of computer processor architectures provides the capability for instruction level parallelism, that is, the concurrent execution of multiple instructions in a single clock cycle. The instruction issue register for such a processor is typically divided into a number of slots. In a single clock cycle, the processor can process an instruction in each slot. Examples of processor architectures that provide such features are the Superscalar architecture and the Very Long Instruction Word (VLIW) architecture.
For a processor to be able to execute multiple concurrent instructions, each instruction and its effects must be independent of other instructions to be executed in the same clock cycle. For example, an instruction which doubled the value of a numerical variable could not be processed in the same clock cycle as an instruction which copied the value of the same variable to another variable. The requirement to determine which instructions are independent of each other, and could therefore be processed concurrently, has been solved in a number of different ways. In the Superscalar architecture, dedicated hardware has been implemented to determine independent instructions arriving at the instruction issue register. In the VLIW architecture, a program compiler has been implemented to generate very long instruction words consisting of a number of independent instructions concatenated together, a single VLIW being executed by the processor during each clock cycle.
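The independence requirement described above can be illustrated by the following minimal sketch. It assumes, purely for illustration, that each instruction is described by the set of variables it reads and the set it writes; the names `Instr` and `independent` are hypothetical and do not correspond to any particular Superscalar or VLIW implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: a label plus its read and write sets."""
    name: str
    reads: frozenset
    writes: frozenset


def independent(a: Instr, b: Instr) -> bool:
    """Return True if a and b share no read-after-write, write-after-read,
    or write-after-write dependency, so they could issue in the same cycle."""
    conflict = (a.writes & b.reads) | (a.reads & b.writes) | (a.writes & b.writes)
    return not conflict


# The example from the text: doubling x conflicts with copying x to y.
double_x = Instr("x = x * 2", frozenset({"x"}), frozenset({"x"}))
copy_x = Instr("y = x", frozenset({"x"}), frozenset({"y"}))
# An instruction on an unrelated variable is independent of both.
inc_z = Instr("z = z + 1", frozenset({"z"}), frozenset({"z"}))

print(independent(double_x, copy_x))  # the pair from the text cannot share a cycle
print(independent(double_x, inc_z))   # unrelated variables can share a cycle
```

In this sketch the check is made pairwise; dedicated hardware or a compiler would apply an equivalent test across every instruction considered for the same clock cycle.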
Both approaches, however, suffer the same limitation. Very few programs that are run on VLIW, Superscalar or similar architecture processors have a sufficient number of independent instructions to occupy all the slots of the processor all of the time. Generally, only multimedia applications, such as sound or image processing, where a large amount of processing must be performed on a large number of independent elements, come close to occupying all the slots of the processor. Whilst the user of the computer having the processor would notice no adverse effects from unused slots of the processor, it is desirable to make the most efficient use of the processor and its concurrent processing capabilities.
In Sun Microsystems' Java® and some other languages and programming environments, such as Modula-3 and Cedar, a garbage collection process is run in parallel with a program process.
Garbage collection is the automated reclamation of system memory space after its last use by a program. A number of examples of garbage collecting techniques are discussed in "Garbage Collection: Algorithms for Automatic Dynamic Memory Management" by R. Jones et al, pub. John Wiley and Sons 1996, ISBN 0-471-94148-4, at pages 1 to 18, and "Uniprocessor Garbage Collection Techniques" by P. R. Wilson, Proceedings of the 1992 International Workshop on Memory Management, St. Malo, France, September 1992. Whilst the storage requirements of many computer programs are simple and predictable, with memory allocation and recovery being handled by the programmer or a compiler, there is a trend toward functional languages having more complex patterns of execution, such that the lifetimes of particular data structures can no longer be determined prior to run-time and hence automated reclamation of this storage, as the program runs, is essential.
A common feature of a number of garbage collection reclamation techniques, as described in the above-mentioned Wilson reference, is incrementally traversing the data structure formed by referencing pointers carried by separately stored data objects. The technique involves first marking all stored objects that are still reachable by other stored objects or from external locations by tracing a path or paths through the pointers linking data objects.
This may be followed by sweeping or compacting the memory, that is to say, examining every object stored in the memory to determine the unmarked objects whose space may then be reclaimed.
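The marking and sweeping steps described above can be sketched as follows. This is a simplified, non-incremental illustration only: the heap is assumed to be a dictionary mapping an object identifier to the list of identifiers it points to, and the function names `mark` and `sweep` are hypothetical rather than taken from the cited references.

```python
def mark(roots, heap):
    """Trace pointers from the roots and return the set of reachable objects."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked or obj not in heap:
            continue
        marked.add(obj)
        stack.extend(heap[obj])  # follow the pointers carried by this object
    return marked


def sweep(heap, marked):
    """Examine every stored object and reclaim those left unmarked."""
    reclaimed = [obj for obj in heap if obj not in marked]
    for obj in reclaimed:
        del heap[obj]
    return sorted(reclaimed)


# Assumed example heap: "a" points to "b"; "c" and "d" point to each other
# but are unreachable from the root; "e" is an isolated object.
heap = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"], "e": []}
marked = mark(["a"], heap)
freed = sweep(heap, marked)
print(freed)
```

Note that the cycle formed by "c" and "d" is reclaimed because reachability is traced from the roots rather than inferred from the mere existence of a referencing pointer.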
Normally, the garbage collection and reclamation process runs on the computer in parallel with a program process and operates on the heap (memory area) occupied by data objects of the program process, so that garbage from the program process can be detected as soon as possible and the appropriate resources reclaimed.
In order to implement a garbage collection process in addition to a program process, each is normally executed as a separate thread operating on a shared heap. The execution of the processes in separate threads reduces the performance of both processes, as both must share the same processor resources. While one thread is being processed, the other may be suspended, and vice versa.
On the VLIW processor, each thread is likely to be compiled and executed separately with the processor resources being swapped alternately between the two threads.
According to the present invention, there is provided a method of scheduling instructions to be executed concurrently by a processor, the processor being capable of executing a predetermined number of instructions concurrently, the method comprising the steps of: interleaving instructions from a first process and a second process according to a predetermined rule to give a third process; and scheduling instructions from the third process for execution at a first time point by the processor, wherein instructions of the first process generate data structures comprising data objects linked by identifying pointers in a memory heap, and wherein the second process comprises a garbage collection process for traversing the memory heap and reclaiming memory allocated to data structures unused by the first process.
An advantage of the present invention is that unused concurrent execution resources of the processor are utilised for garbage collection without affecting the process being executed.
Preferably, the predetermined rule comprises scheduling instructions from the first process, determining whether there are fewer than the predetermined number of instructions scheduled for concurrent execution at the first time point, and if so, scheduling instructions from the second process for execution at the first time point.
By monitoring the processor's capacity for further instructions, the garbage collection can be adaptively scheduled alongside a process without reducing the concurrent processing resources available to the process.
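The preferred rule can be sketched as below. The sketch assumes, for illustration only, that the first process supplies the instructions it has ready for a given time point, that pending garbage collection instructions are held in a queue, and that every garbage collection instruction is independent of the program instructions it is issued with; the function name `fill_issue_group` and the instruction labels are hypothetical.

```python
from collections import deque


def fill_issue_group(program_instrs, gc_queue, slots):
    """Schedule first-process instructions, then fill any unused slots
    with instructions taken from the garbage collection queue."""
    group = list(program_instrs[:slots])
    while len(group) < slots and gc_queue:
        group.append(gc_queue.popleft())
    return group


gc_queue = deque(["gc_1", "gc_2", "gc_3"])

# All four slots used by the program: no garbage collection is scheduled.
cycle_1 = fill_issue_group(["p1", "p2", "p3", "p4"], gc_queue, slots=4)
# Two slots free: two garbage collection instructions are scheduled.
cycle_2 = fill_issue_group(["p5", "p6"], gc_queue, slots=4)
# Three slots free, but only one garbage collection instruction remains.
cycle_3 = fill_issue_group(["p7"], gc_queue, slots=4)

print(cycle_1, cycle_2, cycle_3)
```

Because program instructions are always placed first, the garbage collection instructions in this sketch only ever occupy slots that would otherwise have been left empty.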
Alternatively, the predetermined rule may comprise the selection of alternate sets of instructions from the first and second processes. In another alternative, the predetermined rule may include the steps of determining the effect of scheduling instructions from the second process and, if detrimental, reducing the number of scheduled second process instructions.
Garbage collection instructions interleaved from the second process may take much more time to process than instructions from the first process. By selecting alternate sets or monitoring the effect of instructions from the second process, delaying effects of garbage collection instructions can be reduced accordingly.
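The two alternative rules can be sketched as follows. This is an illustration under stated assumptions: the "effect" of scheduling second process instructions is assumed to be measured as a cycle count compared against a baseline, and the names `interleave_alternate`, `adjust_gc_budget` and the `tolerance` parameter are hypothetical and not specified by the text.

```python
from itertools import zip_longest


def interleave_alternate(first_sets, second_sets):
    """Select sets of instructions alternately from the first and second
    processes; when one process runs out, the rest of the other follows."""
    merged = []
    for a, b in zip_longest(first_sets, second_sets):
        if a is not None:
            merged.append(a)
        if b is not None:
            merged.append(b)
    return merged


def adjust_gc_budget(budget, observed_cycles, baseline_cycles, tolerance=0.1):
    """Reduce the number of second process instructions scheduled per group
    when the observed cost exceeds the baseline by more than the tolerance."""
    if observed_cycles > baseline_cycles * (1 + tolerance) and budget > 0:
        return budget - 1
    return budget


third_process = interleave_alternate([["p1", "p2"], ["p3"]], [["g1"]])
print(third_process)
print(adjust_gc_budget(2, observed_cycles=120, baseline_cycles=100))
```

The feedback function only ever reduces the budget, matching the rule as stated; any policy for restoring it would be a further design choice outside this sketch.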
According to the present invention, there is provided a data processing apparatus comprising a processor being capable of executing a predetermined number of instructions concurrently coupled with a random access memory containing a data structure comprising data objects linked by identifying pointers, the apparatus being configured to provide the following for operating on the stored plurality of data objects:
first means for interleaving instructions from a first process and a second process according to a predetermined rule to give a third process; and
second means for scheduling instructions from the third process for execution at a first time point by the processor,
wherein instructions of the first process generate the data structures in a memory heap, and wherein the second process comprises a garbage collection process for traversing the memory heap and reclaiming memory allocated to data structures unused by the first process.
The first and second means may comprise a program interpreter for executing instructions on the processor. Alternatively, the first and second means may comprise a program compiler for executing instructions on the processor. In a further alternative, the first and second means may comprise an instruction processing means for assembling and passing instructions to be executed concurrently to the processor.