1. Field of the Invention
The present invention relates to an information processing device having a branch predicting mechanism, and more particularly to a branch predicting device that predicts the branch destination of an instruction equivalent to a subroutine return in an architecture that does not provide a dedicated subroutine return instruction.
2. Description of the Related Art
A conventional instruction processing device attempts to improve its performance by using techniques such as pipeline processing and out-of-order processing, which start the execution of succeeding instructions in sequence without waiting for the execution of a preceding instruction to complete.
In pipeline processing, if a preceding instruction is one that changes the execution sequence of succeeding instructions, such as a branch instruction, the instruction at the branch destination must be fed into the execution pipeline when the branch is taken. Otherwise, the execution pipeline is disrupted, and in the worst case the performance is degraded rather than improved.
Accordingly, attempts are made to improve the performance by providing a branch predicting mechanism, a representative example of which is a branch history (branch prediction table), and by predicting whether or not a branch is taken. If such a device predicts that a branch is taken, the instruction at the branch destination is fed into the execution pipeline immediately after the branch instruction. Therefore, the execution pipeline is not disrupted when the branch is actually taken as predicted.
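The behavior of a branch history can be illustrated with a minimal sketch. The organization below (a table keyed by the address of the branch instruction that stores the destination of its most recent taken execution) is an assumption made for illustration only; the class name, method names, and addresses are hypothetical and are not taken from any particular device.

```python
class BranchHistory:
    """Toy branch history: remembers the last taken destination per branch address."""

    def __init__(self):
        # Hypothetical layout: branch instruction address -> last taken destination.
        self.table = {}

    def predict(self, branch_addr):
        # A registered destination means "predict taken"; None means "predict not taken".
        return self.table.get(branch_addr)

    def update(self, branch_addr, taken, target):
        # Register taken branches; drop the entry once the branch falls through.
        if taken:
            self.table[branch_addr] = target
        else:
            self.table.pop(branch_addr, None)


bh = BranchHistory()
first = bh.predict(0x100)       # no entry yet, so the branch is predicted not taken
bh.update(0x100, True, 0x200)   # the branch was actually taken to 0x200
second = bh.predict(0x100)      # the same branch is now predicted taken to 0x200
```

In this sketch the prediction for a given branch is always the destination seen at its previous taken execution, which works well only when that destination does not change between executions.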
Additionally, the branch destination (return destination) of a subroutine return instruction may, by the nature of the instruction itself, vary at each execution. This is because the location of the subroutine call instruction that called the subroutine may differ at each execution. For such an instruction, it is known that performance can be improved by providing a dedicated branch predicting mechanism called a return address stack.
However, the above-described conventional branch predicting mechanism has the following problems.
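A return address stack can be sketched as follows. This is a simplified model under stated assumptions, not a description of any specific hardware: the class name, the default depth of four entries, the policy of discarding the oldest entry on overflow, and all addresses are hypothetical.

```python
class ReturnAddressStack:
    """Toy return address stack with a fixed number of entries."""

    def __init__(self, depth=4):
        self.depth = depth   # assumed number of stack stages
        self.entries = []

    def on_call(self, return_addr):
        # Push the address following the call; discard the oldest entry when full.
        if len(self.entries) == self.depth:
            self.entries.pop(0)
        self.entries.append(return_addr)

    def on_return(self):
        # Pop the most recent return address as the predicted return destination.
        return self.entries.pop() if self.entries else None


ras = ReturnAddressStack()
# The same subroutine is called from two different locations, so the same
# return instruction has a different destination at each execution.
ras.on_call(0x104)
first_return = ras.on_return()    # predicted destination: 0x104
ras.on_call(0x204)
second_return = ras.on_return()   # predicted destination: 0x204
```

Because the prediction comes from the matching call rather than from the previous execution of the return instruction, both returns in this example are predicted correctly even though their destinations differ.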
Some CPU (Central Processing Unit) architectures do not provide dedicated instructions that form a subroutine call/return instruction pair. To improve the performance of such architectures by adopting a return address stack, a technique is required for dynamically extracting, from the branch instructions being executed, the instruction pairs that are equivalent to a subroutine call and return.
In a conventional information processing device, however, whether or not an instruction is a subroutine call/return instruction is determined statically at the time of decoding. A program that uses branch instructions in a way that differs from this hardware interpretation is therefore undesirable. Once such a program causes the call/return pairing assumed by the hardware to deviate from the actual pairing, the succeeding return destinations are also mispaired one after another, owing to the nature of the return address stack. The more stages the return address stack has, the worse the performance becomes.
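The cascading effect can be illustrated with a small model. It assumes a chain of nested calls whose innermost return is not recognized by the hardware, a stack that discards its oldest entry on overflow, and a stack depth of at least one; the function name and the addresses are hypothetical.

```python
def wrong_predictions(nesting, stack_depth):
    """Count wrong stack predictions after the innermost return goes unrecognized."""
    # Hypothetical return addresses, outermost call first.
    return_addrs = [0x1000 + 0x10 * i for i in range(nesting)]

    stack = []
    for addr in return_addrs:
        if len(stack) == stack_depth:
            stack.pop(0)          # oldest entry is lost when the stack is full
        stack.append(addr)

    # The innermost return is overlooked, so its entry is never popped.
    wrong = 0
    # The remaining returns unwind from the second innermost call outward.
    for actual in reversed(return_addrs[:-1]):
        if stack:
            predicted = stack.pop()
            if predicted != actual:
                wrong += 1
    return wrong


# Six nested calls, evaluated for several assumed stack depths.
results = {depth: wrong_predictions(6, depth) for depth in (1, 2, 4, 8)}
```

In this model every entry remaining on the stack is off by one level, so each recognized return receives the destination belonging to the call nested inside it. A deeper stack holds more stale entries and therefore yields more wrong predictions (1, 2, 4, and 5 for the depths above).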
FIG. 1 exemplifies a program including subroutine call/return instruction pairs used in such an architecture.
In this example, a subroutine S1 is called by an instruction “balr 14, 15” in a main routine (Call 1), and another subroutine S2 is further called by an instruction “balr 15, 13” in the subroutine S1 (Call 2). Then, control is returned to the subroutine S1 by a conditional return instruction “bcr 7, 15” (Return 2), and further returned to the main routine by an unconditional return instruction “bcr 15, 14” (Return 1).
Here, assume that the instruction processing device recognizes an instruction having the particular operation code “balr” as an instruction equivalent to a subroutine call, and recognizes the unconditional branch instruction “bcr 15, x” (x is arbitrary), which has a particular combination of operation code and operand, as an instruction equivalent to a subroutine return.
In this case, the instruction “bcr 7, 15” in the subroutine S2 is not recognized as an instruction equivalent to a subroutine return, and is overlooked. Although the correct return corresponding to Call 2 is Return 2, a conventional return address stack therefore pairs Return 1 with Call 2, and the branch prediction fails.
Conversely, if the instruction processing device simply recognizes every instruction having the operation code “bcr” as an instruction equivalent to a subroutine return, then “bcr 4, 3”, which is merely a conditional branch instruction in the subroutine S2, is recognized as the return corresponding to Call 2. The return address stack therefore pairs a call with the wrong return in this case as well.
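The two recognition rules discussed for the program of FIG. 1 can be replayed in a short sketch. The instruction addresses, the assumption that “bcr 4, 3” is taken, and the function names are hypothetical; only the order of the branch instructions follows the description above. The instruction “balr” is treated as two bytes long, so the return address is the call address plus two.

```python
# Executed branches in program order:
# (mnemonic, first operand, second operand, branch address, actual destination)
TRACE = [
    ("balr", 14, 15, 0x100, 0x200),  # Call 1: main routine -> S1
    ("balr", 15, 13, 0x210, 0x300),  # Call 2: S1 -> S2
    ("bcr", 4, 3, 0x310, 0x330),     # plain conditional branch inside S2 (assumed taken)
    ("bcr", 7, 15, 0x340, 0x212),    # Return 2: S2 -> S1
    ("bcr", 15, 14, 0x220, 0x102),   # Return 1: S1 -> main routine
]


def only_unconditional_bcr(op, op1):
    # Rule 1: only "bcr 15, x" is treated as a subroutine return.
    return op == "bcr" and op1 == 15


def any_bcr(op, op1):
    # Rule 2: every "bcr" is treated as a subroutine return.
    return op == "bcr"


def replay(is_return):
    """Return (branch address, prediction correct?) for each recognized return."""
    stack, outcomes = [], []
    for op, op1, _op2, addr, target in TRACE:
        if op == "balr":
            stack.append(addr + 2)            # push the return address of the call
        elif is_return(op, op1):
            predicted = stack.pop() if stack else None
            outcomes.append((hex(addr), predicted == target))
    return outcomes


rule1_outcomes = replay(only_unconditional_bcr)
rule2_outcomes = replay(any_bcr)
```

Under the first rule, Return 1 at 0x220 receives the return address pushed by Call 2 and is mispredicted. Under the second rule, the conditional branch at 0x310 consumes the entry for Call 2, so Return 2 and Return 1 are each left without a correct entry.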
As described above, in an information processing device comprising a return address stack, it is vital to correctly recognize subroutine call/return instruction pairs when instructions are executed.