1. Field of the Invention
The present invention relates to an information processing apparatus operated in a pipeline process system (including a superscalar process system or an out-of-order process system), and more specifically to a prefetching process for quickly processing an instruction by shortening an apparent fetching time.
2. Description of the Related Art
In an information processing apparatus in which the pipeline process system is adopted, a subsequent instruction sequence is fetched and input to an instruction pipeline before the completion of a preceding instruction. In addition, since the operation of a main storage device is slow, a cache mechanism is adopted to quickly fetch the subsequent instruction sequence.
FIG. 1 shows an example of an instruction sequence containing a plurality of branch instructions.
In FIG. 1, the DR in line (a) is a division instruction, the BCR in line (b) is a conditional branch instruction, the BCR in line (c) is an unconditional branch instruction, and line (d) indicates a branch target instruction.
FIG. 2 shows the delay of the machine cycle according to the conventional technology.
First, the instruction in line (a) is executed. When the instruction in line (a) reaches the A stage, execution of the subsequent instruction in line (b) begins. Thus, in a speculative execution system such as the pipeline system, a subsequent instruction is fetched and executed without awaiting the completion of a preceding instruction.
However, since the instruction in line (b) is a conditional branch instruction, it waits after the A stage, before entering the U stage, until the branch condition is confirmed. During the wait, execution of the instruction in line (c) begins. This instruction is also a branch instruction (an unconditional branch instruction), and whether it is executed depends on the branch result of line (b). Therefore, its execution at the U stage is held in the wait state until it is determined that no branching occurs on the instruction in line (b). Once that is determined, it is also determined that the unconditional branch instruction in line (c) is to be executed, and the instruction is executed at the U stage. Upon that execution, a request to fetch the branch target instruction (NOP) in line (d) is issued, and the branch target instruction is fetched and executed.
As clearly shown in FIG. 1, fetching the branch target instruction in line (d) is delayed: the instructions in lines (a) through (c) proceed steadily through the pipeline process, but the instruction in line (d) enters the wait state, thereby causing a delay in the pipeline process.
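By way of illustration only, the serialization described above can be sketched as a small timing model. The cycle counts, the `Timing` class, and the function name are hypothetical and are not taken from FIG. 2; the sketch also assumes that the condition of line (b) is confirmed only after the division in line (a) completes.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    """Hypothetical cycle counts for illustration; not taken from the figures."""
    a_complete: int = 20      # cycle at which the division in line (a) completes
    b_confirm_delay: int = 1  # cycles from (a) completing to (b)'s condition being confirmed
    c_u_stage_delay: int = 1  # cycles from (b) being resolved to (c) executing its U stage
    fetch_latency: int = 4    # cycles from a fetch request to the instruction being available


def conventional_schedule(t: Timing) -> dict:
    """Return the cycle of each event in the serialized, conventional flow."""
    # (b) is confirmed as "no branching" only after the assumed delay.
    b_resolved = t.a_complete + t.b_confirm_delay
    # Only then may the unconditional branch in (c) execute its U stage.
    c_u_stage = b_resolved + t.c_u_stage_delay
    # The fetch request for (d) is issued upon that execution.
    d_fetch_issued = c_u_stage
    # (d) can enter the pipeline once the fetch completes.
    d_available = d_fetch_issued + t.fetch_latency
    return {
        "b_resolved": b_resolved,
        "c_u_stage": c_u_stage,
        "d_fetch_issued": d_fetch_issued,
        "d_available": d_available,
    }


if __name__ == "__main__":
    for event, cycle in conventional_schedule(Timing()).items():
        print(f"{event}: cycle {cycle}")
```

In this sketch every event is chained to the one before it, so any increase in `fetch_latency` moves the availability of the instruction in line (d) later by the same amount.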
The conventional technology has a first problem: the weakness of the cache mechanism is costly, that is, a large penalty (delay time) is imposed when the instruction sequence for which a fetch request is issued does not hit in the cache. In this case, if the instruction fetch request is issued only after confirming that the fetch is actually required, the penalty for a cache miss is reflected directly in a loss of performance.
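As a minimal sketch of why the penalty is reflected directly, the following computes how many cycles the pipeline waits when a fetch request is issued at the same cycle at which the instruction is first needed. The function name and the latency values are hypothetical and serve only as an example.

```python
def exposed_stall(request_cycle: int, needed_cycle: int, fetch_latency: int) -> int:
    """Cycles the pipeline waits for an instruction.

    request_cycle: cycle at which the fetch request is issued
    needed_cycle:  cycle at which the pipeline first needs the instruction
    fetch_latency: cycles from the request until the instruction arrives
    """
    if fetch_latency < 0:
        raise ValueError("fetch_latency must be non-negative")
    arrival = request_cycle + fetch_latency
    return max(0, arrival - needed_cycle)


# Hypothetical latencies, chosen only for illustration.
HIT_LATENCY = 4    # fetch that hits in the cache
MISS_LATENCY = 50  # fetch that misses and goes to main storage

if __name__ == "__main__":
    confirm_cycle = 22  # request issued only once the fetch is confirmed to be required
    print(exposed_stall(confirm_cycle, confirm_cycle, HIT_LATENCY))
    print(exposed_stall(confirm_cycle, confirm_cycle, MISS_LATENCY))
```

When the request cycle and the needed cycle coincide, the stall equals the full fetch latency, so under these assumptions a cache miss costs the entire miss latency in execution time.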
However, there is a second problem: when the execution result of a preceding instruction affects the execution of a subsequent instruction, a correct fetch request for the subsequent instruction cannot be issued, and the instruction cannot be executed, until it is determined that the execution result of the preceding instruction does not affect the execution of the subsequent instruction.
Although the branch target address of a first branch instruction is computed in the conventional technology, the branch target instruction of the first branch instruction is not fetched until the outcome of a second branch instruction, which is processed before the first branch instruction, is determined and it is thereby determined that the first branch instruction is to be executed. That is, the instruction sequence to be executed when it is determined that no branching occurs on the first branch instruction is fetched only after the branch condition of the second branch instruction, written immediately before the first branch instruction, is determined (after it is determined that the branch prediction was successful). A loss therefore arises in the execution of instructions, because the start of the instruction fetch is held in the wait state.
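The loss described above can be expressed, purely as an illustrative sketch, as the interval between the cycle at which the branch target address of the first branch instruction is available and the cycle at which the fetch is actually started. The function names and cycle values below are hypothetical.

```python
def conventional_fetch_start(address_ready: int, preceding_resolved: int) -> int:
    """Cycle at which the target fetch starts in the conventional flow.

    The fetch needs the computed target address, and it is additionally held
    until the preceding (second) branch instruction has been resolved.
    """
    return max(address_ready, preceding_resolved)


def fetch_wait_loss(address_ready: int, preceding_resolved: int) -> int:
    """Cycles during which the address is known but the fetch is not yet started."""
    return conventional_fetch_start(address_ready, preceding_resolved) - address_ready


if __name__ == "__main__":
    # Hypothetical example: the address is ready at cycle 6, but the
    # preceding branch is resolved only at cycle 21.
    print(fetch_wait_loss(address_ready=6, preceding_resolved=21))
```

If the preceding branch instruction is resolved before the target address is ready, the loss computed by this sketch is zero; otherwise the loss grows with the time taken to determine the preceding branch condition.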