(1) Field of the Invention
This invention relates to a schedule control program and a schedule control method, and more particularly to a schedule control program that causes a computer to allocate threads to a plurality of processor devices to execute them, and a schedule control method for allocating threads to a plurality of processor devices to execute them.
(2) Description of the Related Art
In an OS (Operating System) compatible with a multiprocessor system that executes a program with a plurality of CPUs (Central Processing Units), the program to be executed is divided into a plurality of execution units (hereinafter, referred to as threads) and the threads are allocated to the CPUs by a scheduler of the OS, to thereby execute the program. Parallel execution of one program using a plurality of CPUs shortens the execution time of the program and balances loads on the CPUs.
The scheduler selects, for each CPU, a thread with the highest priority in a queue (hereinafter, referred to as a run queue) where a series of threads are waiting to be executed, and causes the CPU corresponding to the run queue of the selected thread to execute the thread. In this connection, an execution start time is recorded for each thread.
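The selection step described above can be sketched as follows. This is only a minimal illustration in Python, not the scheduler of any particular OS. The names `Thread`, `RunQueue`, and `dispatch`, and the convention that a larger number means a higher priority, are assumptions introduced here.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    tid: int
    priority: int                       # assumption: larger value = higher priority
    start_time: Optional[float] = None  # execution start time, set on dispatch


@dataclass
class RunQueue:
    cpu_id: int
    waiting: List[Thread] = field(default_factory=list)


def dispatch(queue: RunQueue, now: Optional[float] = None) -> Optional[Thread]:
    """Remove and return the highest-priority thread waiting in the queue.

    The execution start time is recorded on the selected thread.
    Returns None when the run queue is empty.
    """
    if not queue.waiting:
        return None
    # Index of the first thread having the highest priority.
    best = max(range(len(queue.waiting)), key=lambda i: queue.waiting[i].priority)
    chosen = queue.waiting.pop(best)
    chosen.start_time = time.monotonic() if now is None else now
    return chosen
```

The `now` parameter exists only so that the recorded start time can be fixed when the sketch is exercised; a real scheduler would read a hardware or kernel clock.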
FIG. 7 is a view schematically showing how a conventional scheduler allocates a thread. FIG. 7(A) shows data remaining in cache memories when a certain thread is executed. FIG. 7(B) is a view showing data remaining in the cache memories when the thread is re-allocated after a predetermined period of time.
Here, a partial configuration of a computer is illustrated, in which primary cache memories (hereinafter, referred to as primary caches) 501a, 501b, 501c, and 501d are connected to CPUs 500a, 500b, 500c, and 500d, respectively. The CPUs 500a and 500b share a secondary cache memory (hereinafter, referred to as secondary cache) 502a, and the CPUs 500c and 500d share a secondary cache 502b. In this connection, the shaded areas represent data.
Referring to FIG. 7(A), the CPU 500a executes a thread 510. At this time, data to be used for the thread 510 is read from, for example, an unillustrated main memory, and is stored in the primary cache 501a and the secondary cache 502a. It should be noted that threads 511, 512, and 513 are successively connected to each other in a run queue and are waiting to be executed.
Unlike a single-processor system, a multiprocessor system does not always execute a thread on the same CPU. Instead, it interrupts the CPU executing the thread after a predetermined period of time and causes the CPU to execute another thread, thereby realizing multitasking. When the execution of an interrupted thread, for example the thread 510, is resumed, the scheduler selects the CPU into whose run queue the thread 510 is to be put as follows, as shown in FIG. 7(B).
(1) If the time elapsed from the time (execution start time) when the thread 510 last obtained an execution right to the time when the thread 510 is put into a run queue this time is within a prescribed period of time, the CPU 500a that last executed the thread 510 is selected.
(2) If the elapsed time exceeds the prescribed period of time, the CPU with the lowest load is selected from among all the CPUs 500a to 500d.
This is because, if the elapsed time is within the prescribed period of time, it can be expected that data used for the last execution of the thread 510 remains in a cache used by the CPU 500a that last executed the thread 510. This control improves the cache hit rate and thereby improves performance. The load on each CPU is determined according to the number of threads (for example, the threads 514, 515, and 516) waiting in its run queue and their priorities.
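The two selection rules (1) and (2) can be sketched as follows. This is a minimal illustration, assuming a hypothetical function `select_cpu` and a hypothetical load measure `cpu_load`. The text states only that load depends on the number of waiting threads and their priorities, so the concrete formula used here (each waiting thread contributes one plus its priority) is an assumption.

```python
from typing import Dict, List


def cpu_load(priorities: List[int]) -> int:
    """Load of one CPU, given the priorities of the threads in its run queue.

    Assumption: each waiting thread contributes 1 plus its priority.
    The source does not give a formula.
    """
    return sum(1 + p for p in priorities)


def select_cpu(
    last_cpu: str,
    last_start_time: float,
    now: float,
    prescribed_period: float,
    run_queues: Dict[str, List[int]],
) -> str:
    """Choose the CPU into whose run queue a resumed thread is put."""
    elapsed = now - last_start_time
    if elapsed <= prescribed_period:
        # Rule (1): data from the last execution is expected to remain
        # in the cache of the CPU that last executed the thread.
        return last_cpu
    # Rule (2): otherwise pick the CPU with the lowest load among all CPUs.
    return min(run_queues, key=lambda cpu: cpu_load(run_queues[cpu]))
```

For example, with run queues `{"500a": [3, 2, 1], "500b": [1], "500c": [], "500d": [5, 5]}` and a prescribed period of 10, a thread last started on `"500a"` stays there when 5 time units have elapsed, and moves to the idle `"500c"` when 20 have elapsed.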
Further, as a technique of decreasing the number of cache misses in a multiprocessor system, a technique is known in which the number of blocks kept by each thread (task) in the cache of each processor device is counted, the counting result is stored in a memory shared by a plurality of processor devices, and, when the execution of the thread is resumed, the processor device having the greater number of blocks in its cache is selected based on the counting result and is caused to execute the thread (for example, refer to Japanese Unexamined Patent Application Publication No. 8-30562 (paragraphs [0020] to [0039], and FIGS. 1 to 5)).
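The block-counting technique described above can be sketched roughly as follows. This is not the implementation of the cited publication. The table layout (thread identifier mapped to per-processor block counts) and the name `select_by_cached_blocks` are assumptions made only for illustration.

```python
from typing import Dict, Optional

# Assumed shape of the shared counting result:
# thread id -> {processor id -> number of cache blocks kept by that thread}
BlockCounts = Dict[int, Dict[str, int]]


def select_by_cached_blocks(counts: BlockCounts, thread_id: int) -> Optional[str]:
    """Return the processor whose cache holds the most blocks of the thread.

    Returns None when no counting result is recorded for the thread.
    """
    per_cpu = counts.get(thread_id)
    if not per_cpu:
        return None
    # The processor with the greatest block count is selected.
    return max(per_cpu, key=lambda cpu: per_cpu[cpu])
```

With `{510: {"500a": 12, "500b": 3, "500c": 0}}` as the shared table, resuming the thread 510 selects `"500a"`.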
In addition, as a technique of decreasing the number of cache misses in a secondary cache, there is a known technique for a multiprocessor system comprising a plurality of CPU groups connected to each other, each of the CPU groups sharing a secondary cache, in which a specified CPU group is always caused to execute a process (for example, refer to Japanese Unexamined Patent Application Publication No. 10-143382).
However, a conventional schedule control method of allocating an interrupted thread to a CPU based on a predetermined elapsed time counted from the last execution start time has the following drawback.
As shown in FIG. 7(B), depending on the elapsed time, there are cases where the primary cache 501a no longer has the data which was stored at the last execution of the thread 510 but the secondary cache 502a still has the data. There are also cases where the secondary cache 502a does not have the data but an unillustrated tertiary cache has the data. That is, a higher-level cache is more likely to retain the data even after a long elapsed time. However, in the conventional schedule control, the CPU with the lowest load is selected from among all the CPUs if the elapsed time exceeds a predetermined period of time, without taking this possibility into consideration. Therefore, data remaining in a cache may be wasted, and the caches are not used effectively.