A program is an instruction sequence written with a text editor or the like, and the state in which a program is executed by a processor is termed a ‘process’. Processing performed by a process is divided into a plurality of portions, each termed a ‘thread’. Each thread has information for its own use, such as register values and a program counter, and this information is termed a ‘context’.
In recent years, attention has been drawn to the SMT (simultaneous multithreading) processor, in which a plurality of threads (or processes) are simultaneously executable in one processor. In the multithreading processor, a plurality of context units are installed to preserve the contexts on a thread-by-thread basis. The multithreading processor allots one thread to each context unit, and executes the plurality of threads simultaneously.
The multithreading processor reads in (which is termed ‘fetch’) each instruction from the address specified by the program counter corresponding to each thread, and simultaneously executes the plurality of threads. Because the number of simultaneously executable threads is limited by the number of installed context units, the multithreading processor selects the thread to be executed next from among the threads in a standby state, which are not currently allotted to any context unit, and replaces a thread under execution with the selected thread (which is termed ‘context switching’). In this specification, selecting a thread for execution and switching threads by context switching are together termed ‘scheduling’.
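The relationship among context units, standby threads, fetching, and context switching described above may be sketched as a simple software model. The following is an illustrative sketch only: the names `Thread`, `Processor`, `add_thread`, `fetch_all`, and `context_switch`, and the first-in-first-out selection of the next thread, are assumptions introduced for illustration and are not taken from this specification.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Thread:
    """A thread together with its context (hypothetical model)."""
    name: str
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


class Processor:
    """Toy model of a multithreading processor with a fixed number of context units."""

    def __init__(self, num_context_units):
        # Each slot holds the thread allotted to that context unit, or None.
        self.context_units = [None] * num_context_units
        # Threads in the standby state, not allotted to any context unit.
        self.standby = deque()

    def add_thread(self, thread):
        """Allot the thread to a free context unit, or place it on standby."""
        for i, slot in enumerate(self.context_units):
            if slot is None:
                self.context_units[i] = thread
                return
        self.standby.append(thread)

    def fetch_all(self):
        """Fetch one instruction address per allotted thread, then advance each PC."""
        fetched = []
        for t in self.context_units:
            if t is not None:
                fetched.append((t.name, t.program_counter))
                t.program_counter += 1
        return fetched

    def context_switch(self, unit_index):
        """Scheduling step: replace the thread in one context unit with a standby thread."""
        if not self.standby:
            return None
        outgoing = self.context_units[unit_index]
        incoming = self.standby.popleft()  # assumed FIFO selection policy
        self.context_units[unit_index] = incoming
        if outgoing is not None:
            self.standby.append(outgoing)  # its context (PC, registers) is preserved
        return incoming
```

With two context units and three threads A, B, and C added in that order, `fetch_all()` fetches for A and B while C remains on standby, and `context_switch(0)` then allots C to the first context unit and returns A, with its program counter preserved, to the standby state. Note that the selection in this sketch takes no account of the operating state of the processor.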
However, in conventional scheduling, the multithreading processor uses all of the installed context units, and simultaneously fetches the instructions of all the executable threads selected at the time of context switching. Further, the selection of a thread at the time of context switching does not reflect the operating state of the multithreading processor.
Accordingly, depending on the combination of the selected threads, processing may be concentrated in a particular unit of the multithreading processor. This produces a delay caused by resource competition, which impedes efficient thread execution. As a result, it has not been possible to improve the processing efficiency, even when the threads are executed using all of the installed context units.
For example, when data accessed for an instruction fetch or by a memory access instruction is not present in a cache having a high transfer rate, the access to the cache fails (which is hereafter referred to as ‘missing cache’, that is, a cache miss), and an access to a main memory having a low transfer rate is forced, which produces a delay. The same occurs in the multithreading processor. Namely, when the instructions of a plurality of threads are simultaneously fetched and executed, the processing efficiency of the multithreading processor may fail to improve because of cache competition and an increased number of missing cache occurrences.
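The effect of cache competition described above can be illustrated with a toy model. The sketch below assumes a direct-mapped cache of four lines shared by two threads, and hypothetical access patterns in which the two threads use addresses that map to the same cache lines. The cache organization, the function names `count_misses` and `interleave`, and the access patterns are all assumptions chosen for illustration and are not taken from this specification.

```python
def count_misses(addresses, num_lines):
    """Count misses in a direct-mapped cache holding one address per line."""
    lines = [None] * num_lines  # tag currently stored in each cache line
    misses = 0
    for addr in addresses:
        index, tag = addr % num_lines, addr // num_lines
        if lines[index] != tag:
            misses += 1          # missing cache: data must come from main memory
            lines[index] = tag   # the new data evicts the previous contents
    return misses


def interleave(a, b):
    """Alternate the accesses of two threads, modeling simultaneous execution."""
    merged = []
    for x, y in zip(a, b):
        merged.extend((x, y))
    return merged


# Hypothetical access patterns: each thread loops three times over four addresses.
# Addresses 0-3 and 4-7 map to the same four cache lines.
thread_a = [0, 1, 2, 3] * 3
thread_b = [4, 5, 6, 7] * 3

separate = count_misses(thread_a, 4) + count_misses(thread_b, 4)
together = count_misses(interleave(thread_a, thread_b), 4)
print(separate, together)  # prints "8 24"
```

In this model, each thread executed alone misses only on its first pass over its data (4 misses each, 8 in total), whereas in interleaved execution each thread repeatedly evicts the other's data, so that all 24 accesses result in a missing cache.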
As one method for improving the processing efficiency of a multithreading processor, U.S. Pat. No. 6,247,121, “Multithreading processor with thread predictor” by Quinn A. Jacobson, issued on Jun. 12, 2001, has been disclosed. According to this patent, a speculative thread is generated based on a branch prediction before a branch instruction is executed, and the speculative thread is executed in the multithreading processor. However, the above disclosure does not propose scheduling for the case in which a plurality of identical or different processes exist.