1. Field of the Invention
The present invention relates to a data processor having a pipeline processing mechanism and, more specifically, to a data processor that achieves advanced processing capability by means of a sophisticated pipeline processing mechanism. Even more specifically, the present invention relates to a data processor capable of pre-branch processing to a return address in the initial stage of pipeline processing of a subroutine return instruction.
2. Description of the Prior Art
FIG. 1 is a schematic diagram showing a functional configuration of a typical pipeline processing mechanism.
In FIG. 1, numeral 1 designates an instruction fetch (IF) stage, numeral 2 designates an instruction decoding (D) stage, numeral 3 designates an address calculation (A) stage, numeral 4 designates an operand fetch (F) stage, numeral 5 designates an execution (E) stage, and numeral 6 designates an operand writing (W) stage.
The operation of this processing mechanism is described next. The data processor shown in FIG. 1 is configured with six pipeline stages: the instruction fetch stage 1, which fetches an instruction; the instruction decoding stage 2, which decodes the instruction; the address calculation stage 3, which performs address calculation for an operand and the like; the operand fetch stage 4, which fetches operand data; the execution stage 5, which processes data; and the operand writing stage, which writes the operand data. The respective stages can process different instructions at the same time. However, where a conflict arises over an operand or a memory access, the lower-priority stage suspends processing until the conflict is eliminated.
As described above, in a pipelined processor, processing is divided into a plurality of stages according to the flow of data processing, and the stages operate simultaneously, whereby the average processing time required for one instruction is shortened and overall performance is improved.
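By way of illustration only, the shortening of the average processing time can be sketched as follows. The sketch assumes that every stage takes exactly one cycle and that no operand or memory-access conflicts occur; neither assumption is stated in the description above, and the function names are hypothetical.

```python
# Stage names follow FIG. 1. One cycle per stage and no stalls are
# simplifying assumptions made only for this sketch.
STAGES = ["IF", "D", "A", "F", "E", "W"]


def sequential_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when each instruction passes through all stages alone."""
    return n_stages * n_instructions


def pipelined_cycles(n_instructions, n_stages=len(STAGES)):
    """Cycles needed when the stages overlap and never stall."""
    if n_instructions == 0:
        return 0
    # The first instruction needs n_stages cycles; each later one
    # completes one cycle after its predecessor.
    return n_stages + (n_instructions - 1)


def occupancy(n_instructions, cycle):
    """Map each busy stage to the 0-based index of the instruction in it."""
    busy = {}
    for depth, name in enumerate(STAGES):
        index = cycle - depth  # instruction i enters stage `depth` at cycle i + depth
        if 0 <= index < n_instructions:
            busy[name] = index
    return busy
```

Under these assumptions, ten instructions take 60 cycles without overlap and 15 cycles with overlap, and at cycle 5 all six stages hold different instructions.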
However, in a data processor pipelined in this manner, when an instruction that disturbs the flow of instructions, such as a branch instruction, is executed in the execution stage 5, all processing performed in the preceding stages is canceled, and the instruction to be executed next is fetched anew.
Thus, when an instruction that disturbs the pipeline processing is executed, the overhead of pipeline processing increases and the processing speed of the data processor does not improve. To improve the performance of the data processor, various measures have been devised to reduce the overhead of executing an instruction such as an unconditional branch instruction or a conditional branch instruction.
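The overhead described above can be estimated with a rough sketch such as the following. It assumes one cycle per stage, no other stalls, and that a taken branch cancels exactly the work in the stages preceding the stage in which the branch is resolved. The names flush_penalty and total_cycles are hypothetical and do not come from the description.

```python
# Stage names follow FIG. 1; the cycle model is an assumption of this sketch.
STAGES = ["IF", "D", "A", "F", "E", "W"]


def flush_penalty(resolve_stage="E"):
    """Cycles lost per taken branch: one per stage ahead of the resolving stage."""
    return STAGES.index(resolve_stage)


def total_cycles(n_instructions, taken_branches, resolve_stage="E"):
    """Cycles for a run in which `taken_branches` branches each flush the pipeline."""
    if n_instructions == 0:
        return 0
    base = len(STAGES) + (n_instructions - 1)  # stall-free pipelined time
    return base + taken_branches * flush_penalty(resolve_stage)
```

With these assumptions, 100 instructions containing 20 taken branches resolved in the execution stage take 185 cycles rather than 105. Passing "IF" as the resolving stage models, in idealized form, a branch whose target is already known at instruction fetch.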
For example, using a so-called branch target buffer that stores the address of a branch instruction in combination with its branch target address, the flow of instructions is predicted in the instruction fetch stage. See, for example, J. K. F. Lee and A. J. Smith, "Branch Prediction Strategies and Branch Target Buffer Design", IEEE COMPUTER, Vol. 17, No. 1, January 1984, pp. 6-22.
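A minimal sketch of such a buffer is given below for illustration. It is not the design of the cited reference: it assumes an unbounded table with no tags, associativity, or replacement policy, and it simply predicts the most recently taken target. The class and method names are hypothetical.

```python
class BranchTargetBuffer:
    """Toy branch target buffer: branch instruction address -> last taken target."""

    def __init__(self):
        self._targets = {}

    def predict(self, pc, fallthrough):
        """Return the predicted next fetch address for the instruction at `pc`."""
        # A hit predicts the stored target; a miss predicts sequential flow.
        return self._targets.get(pc, fallthrough)

    def update(self, pc, taken, target):
        """Record the actual outcome once the branch has been executed."""
        if taken:
            self._targets[pc] = target
        else:
            self._targets.pop(pc, None)
```

For instance, after update(0x100, True, 0x200) has been called, predict(0x100, 0x104) returns 0x200, so the instruction at the branch target can be fetched without waiting for the execution stage.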
As described above, the overhead of branch instruction execution is reduced by predicting the flow of processing in the initial stage of the pipeline processing and passing the instruction predicted to be executed next through the pipeline (hereinafter referred to as pre-branch processing). However, predicting the processing flow of a subroutine return instruction has been difficult because the return address from a subroutine depends upon the address of the corresponding subroutine call instruction.
In the conventional data processor, as described above, when a subroutine return instruction is executed, the return address depends upon the address of the corresponding subroutine call instruction, and therefore no effective means for predicting the flow of processing has been available.
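The difficulty can be illustrated with a small sketch. It assumes a predictor that, like a simple branch target buffer, remembers only the last target seen for a given instruction address. The addresses and the function name are hypothetical and chosen only for the example.

```python
def count_return_mispredictions(return_pc, return_addresses):
    """Count wrong predictions for one return instruction located at `return_pc`.

    `return_addresses` lists, in execution order, the actual return address of
    each execution; it changes whenever the subroutine is called from a
    different place.
    """
    last_target = {}  # instruction address -> most recent target (assumed predictor)
    misses = 0
    for actual in return_addresses:
        predicted = last_target.get(return_pc)  # None on the first execution
        if predicted != actual:
            misses += 1
        last_target[return_pc] = actual
    return misses
```

If a subroutine whose return instruction is at 0x1010 is called alternately from two call sites, giving return addresses 0x104, 0x204, 0x104, 0x204, this predictor is wrong on all four returns, whereas four calls from a single site produce only the initial miss.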