1. Field of the Invention
The present invention relates to an information processing device, especially a device in a pipeline processing system, super-scalar processing system, or an out-of-order execution processing system, and more specifically to an instruction fetch control apparatus in an instruction processing device for use in quickly executing a sequence of instructions including a call instruction and a return instruction of a subroutine.
2. Description of the Related Art
In an instruction processing device in a pipeline processing system, a super-scalar processing system, or an out-of-order-execution processing system, performance has been improved by sequentially inputting subsequent instructions to a plurality of pipelines and starting their execution without waiting for the completion of the execution of a preceding instruction. However, when the execution result of a preceding instruction influences the execution of a subsequent instruction, the execution of the subsequent instruction cannot be started until the execution of the preceding instruction is completed. The resulting pipeline stall degrades pipeline performance. A typical example is a branch instruction.
A pipeline stall occurs because it is not certain whether or not a branch is taken, or because the address of the branched-to instruction of a branch instruction is not determined until the execution of the branch instruction is completed. Therefore, techniques have been developed for processing a branch instruction quickly using a branch history.
A branch history is used, when a branch instruction is executed, to start executing either the subsequent instruction or the branched-to instruction before it becomes certain whether or not the branch is taken.
When it becomes certain as a result of executing a branch instruction that a branch is taken, the address of the branched-to instruction and the address of the branch instruction itself are registered in the branch history. When an instruction is fetched from the main storage, the branch history is indexed before the instruction is executed.
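The registration and indexing described above can be illustrated with a minimal sketch. The class name, method names, and addresses are hypothetical and chosen only for illustration; an actual branch history is a hardware table, not a dictionary.

```python
class BranchHistory:
    """Toy model of a branch history (illustrative assumption).

    Maps the address of a branch instruction to the address of the
    branched-to instruction recorded when the branch was taken.
    """

    def __init__(self):
        self._entries = {}

    def register(self, branch_addr, target_addr):
        # Called once it is certain that the branch was taken.
        self._entries[branch_addr] = target_addr

    def lookup(self, fetch_addr):
        # Indexed at instruction fetch, before the instruction executes.
        # Returns the predicted branched-to address, or None on a miss.
        return self._entries.get(fetch_addr)


history = BranchHistory()
history.register(0x100, 0x200)      # hypothetical branch at 0x100, taken to 0x200
predicted = history.lookup(0x100)   # hit: predicted target 0x200
missed = history.lookup(0x104)      # miss: no entry for this address
```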
In addition, a sequence of instructions is often executed in a subroutine. Assume that a subroutine is called from a main routine, and then control is returned from the subroutine to the main routine. This process is nothing other than the execution of branch instructions. When control is returned from a subroutine to a main routine, the address of the branched-to instruction in most cases changes according to the point from which the subroutine was called.
FIG. 1A shows an example of a sequence of instructions containing a subroutine. In FIG. 1A, control first branches from an instruction (1) (branch instruction) in a main routine to an instruction (5) in a subroutine, and then branches from an instruction (6) (branch instruction) to an instruction (2) in the main routine. Control next branches from an instruction (3) (branch instruction) in the main routine to the instruction (5) in the subroutine, and then branches from the instruction (6) (branch instruction) to an instruction (4) in the main routine. Thus, the branched-to instruction of the instruction (6) changes between the instruction (2) and the instruction (4) each time the instruction (6) branches.
With the current branch history, once a branch has been taken by a branch instruction, the instruction address of the branch instruction and the address of the branched-to instruction are registered together. Thereafter, when an instruction is fetched from the main storage and executed, the branch history is indexed prior to the execution, and the next instruction is fetched at the address of the branched-to instruction obtained as a result of the indexing. At this time, if the address of the branched-to instruction of the branch instruction has changed for any reason, the address of the branched-to instruction obtained as a result of indexing the branch history is invalid. Therefore, the process being performed is canceled, and an instruction is fetched again at the address of the correct branched-to instruction.
If this phenomenon occurs repeatedly, that is, if the address of the branched-to instruction keeps changing although the same branch instruction is executed, then each time an instruction is fetched at the address obtained as a result of indexing the branch history, that address proves invalid, the process being performed is canceled, and an instruction has to be fetched again at the address of the correct branched-to instruction.
In FIG. 1A, when control first branches from the instruction (6) (branch instruction) to the instruction (2), the instruction address of the instruction (6) and the instruction address of the branched-to instruction (instruction (2)) are registered together. When the instruction (6) appears in the sequence of instructions again, indexing the branch history allows the instruction (2) to be passed to the instruction fetch pipeline without a loss, because the instruction address of the instruction (6) and the instruction address of the branched-to instruction (instruction (2)) are registered together. However, since the branched-to instruction from the instruction (6) is actually the instruction (4) this time, the process is canceled halfway, and an instruction is fetched at the address of the correct instruction (4). As a result, a loss of 6τ is incurred from the execution of the instruction (6) to the execution of the instruction (4). FIG. 1B shows an example of this operation.
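The behavior described for FIG. 1A can be traced with a short simulation. This is a sketch under stated assumptions: the instruction numbers (1) through (6) stand in for instruction addresses, the branch history is modeled as a dictionary, and the history is assumed to be overwritten with the correct branched-to address after every executed branch. The cycle loss (6τ) is not modeled; only whether a refetch is needed.

```python
def run(events, history):
    """Replay (branch, actual_target) pairs against a branch history.

    Returns one outcome per branch: "no entry", "correct", or "refetch".
    Assumption: the history is updated with the actual target each time.
    """
    outcomes = []
    for branch, actual in events:
        predicted = history.get(branch)
        if predicted is None:
            outcomes.append((branch, "no entry"))
        elif predicted == actual:
            outcomes.append((branch, "correct"))
        else:
            # Prediction is invalid: cancel and fetch the correct target.
            outcomes.append((branch, "refetch"))
        history[branch] = actual
    return outcomes


# Branch sequence of FIG. 1A: (1)->(5), (6)->(2), (3)->(5), (6)->(4).
FIG_1A = [(1, 5), (6, 2), (3, 5), (6, 4)]

history = {}
first_pass = run(FIG_1A, history)
second_pass = run(FIG_1A, history)  # same sequence executed again
```

In this model the call instructions (1) and (3) are predicted correctly once registered, while the return instruction (6) requires a refetch every time it hits in the history, because its branched-to address alternates between (2) and (4).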
As described above, with a branch history, when a branch is taken, the combination of the address of the branch instruction and the address of the branched-to instruction is registered in the branch history. When a branch instruction having the same address subsequently appears in a sequence of instructions, the branched-to sequence of instructions can be executed using the registered address as a predicted address, thereby performing the process at a higher speed. However, when the address of the branched-to instruction changes, the execution result becomes invalid if an instruction is fetched using the address of the branched-to instruction obtained as a result of searching the branch history. Therefore, an instruction should be fetched again using the correct branched-to address. As a result, there arises the problem that the branch history cannot be used effectively.
The present invention aims at processing branch instructions, especially those in a sequence of instructions containing a subroutine, at a high speed using a return address stack that stores a return address corresponding to a call instruction of a subroutine.
According to the first aspect of the present invention, the instruction fetch control apparatus is designed to have an address matching detection unit. When an instruction which has been fetched from the main storage device and has been detected as a hit in the branch history is a return instruction of a subroutine, the address matching detection unit compares the address of a branched-to instruction registered in the branch history with all return addresses stored in the valid entries in the return address stack, and transmits a matching address as a return address of the return instruction to an instruction fetch unit for fetching an instruction.
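The comparison performed by the address matching detection unit can be sketched as follows. The function name, the (valid, return address) entry layout, and the addresses are assumptions made for illustration. The behavior when no valid entry matches is not stated in this passage, so the sketch simply returns None in that case. Hardware would typically perform the comparisons in parallel; the loop here is only a sequential model.

```python
from typing import List, Optional, Tuple

# Hypothetical entry layout: (valid flag, return address).
StackEntry = Tuple[bool, int]


def match_return_address(history_target: int,
                         stack: List[StackEntry]) -> Optional[int]:
    """Compare the branched-to address from the branch history with the
    return addresses in all valid entries of the return address stack.

    Returns the matching address to be sent to the instruction fetch
    unit, or None if no valid entry matches (an assumption of this sketch).
    """
    for valid, return_addr in stack:
        if valid and return_addr == history_target:
            return return_addr
    return None


# Hypothetical stack contents: two valid entries and one invalid entry.
stack = [(True, 0x2000), (True, 0x4000), (False, 0x6000)]

hit = match_return_address(0x4000, stack)          # matches a valid entry
stale = match_return_address(0x6000, stack)        # entry is invalid: no match
absent = match_return_address(0x8000, stack)       # address not in the stack
```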
According to the second aspect of the present invention, the instruction fetch control apparatus is designed to have an entry designation unit. When an instruction which has been fetched from the main storage device and has been detected as a hit in the branch history is a return instruction of a subroutine, the entry designation unit designates an entry in a plurality of entries in the return address stack as an entry storing the return address of the return instruction.
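The entry designation can be illustrated with a minimal sketch. This passage does not state how the entry is selected, so the policy below, which designates the most recently pushed valid entry, is purely an assumption for illustration, as are the function name and the (valid, return address) entry layout.

```python
from typing import List, Optional, Tuple

# Hypothetical entry layout: (valid flag, return address).
StackEntry = Tuple[bool, int]


def designate_entry(stack: List[StackEntry]) -> Optional[int]:
    """Designate the index of one entry among the plurality of entries.

    Assumed policy (not taken from the source): pick the most recently
    pushed valid entry, with the last list element being the newest.
    Returns None when no valid entry exists.
    """
    for index in range(len(stack) - 1, -1, -1):
        valid, _return_addr = stack[index]
        if valid:
            return index
    return None


# Hypothetical stack: the newest entry (index 2) has been invalidated.
stack = [(True, 0x2000), (True, 0x4000), (False, 0x6000)]
designated = designate_entry(stack)
return_address = stack[designated][1] if designated is not None else None
```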