1. Field of the Invention
The present invention relates to a data processor having a pipeline processing mechanism and, in more detail, to a data processor that achieves a high processing ability by means of a sophisticated pipeline processing mechanism. More specifically, the present invention relates to a data processor capable of performing pre-branch processing to a return address in the initial stage of pipeline processing of a subroutine return instruction.
2. Description of the Prior Art
FIG. 1 is a schematic diagram showing a functional configuration of a typical pipeline processing mechanism.
In FIG. 1, numeral 1 designates an instruction fetch (IF) stage, numeral 2 designates an instruction decoding (D) stage, numeral 3 designates an address calculation (A) stage, numeral 4 designates an operand fetch (F) stage, numeral 5 designates an execution (E) stage and numeral 6 designates an operand writing (W) stage.
Next, the operation is described. The data processor shown in FIG. 1 is configured with six pipeline stages: the instruction fetch stage 1, which fetches instruction data; the instruction decoding stage 2, which decodes the instruction data; the address calculation stage 3, which performs address calculation of an operand and the like; the operand fetch stage 4, which fetches operand data; the execution stage 5, which processes data; and the operand writing stage, which writes operand data. The respective stages can process different instructions at the same time. However, where a conflict takes place on an operand or memory access, a lower-priority stage suspends processing until the conflict is eliminated.
As described above, in the pipelined processor, processing is divided into a plurality of stages according to the flow of data processing, and the stages operate simultaneously, so that the average processing time required for one instruction is shortened and the performance as a whole is improved.
However, in a data processor pipelined in such a manner, where an instruction that disturbs the flow of instructions, such as a branch instruction, is executed in the execution stage 5, all processing already performed in the preceding stages is canceled, and the instruction to be executed next is fetched anew.
Thus, when an instruction disturbing the pipeline processing is executed, the overhead of pipeline processing increases and the processing speed of the data processor does not improve. To improve the performance of the data processor, various ideas have been put into practice to curtail the overhead of executing an instruction such as an unconditional branch instruction or a conditional branch instruction.
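The cost of such cancellation can be illustrated with a rough cycle count. The following is a minimal sketch and not part of the described processor: it assumes one instruction completes per cycle in the ideal case, that every taken branch is resolved in the execution stage, and that the four preceding stages (IF, D, A, F) are discarded each time. The function name and the penalty model are hypothetical, and stalls due to operand or memory conflicts are ignored.

```python
# Illustrative model only. The stage count follows FIG. 1; the penalty
# model (one lost cycle per discarded stage) is an assumption.
STAGES = 6          # IF, D, A, F, E, W
BRANCH_STAGE = 5    # a taken branch is assumed to be resolved in the E stage


def total_cycles(n_instructions, n_taken_branches):
    """Cycles needed when each taken branch cancels the preceding stages."""
    # Ideal pipeline: fill once, then one instruction completes per cycle.
    ideal = STAGES + (n_instructions - 1)
    # Each taken branch discards the work held in IF, D, A and F,
    # so the branch target is fetched four cycles late.
    flush_penalty = BRANCH_STAGE - 1
    return ideal + n_taken_branches * flush_penalty


if __name__ == "__main__":
    print(total_cycles(100, 0))   # no branches taken
    print(total_cycles(100, 20))  # 20 taken branches
```

Under these assumptions, 100 instructions take 105 cycles without branches and 185 cycles when 20 of them are taken branches, which is the overhead the pre-branch techniques below aim to reduce.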
For example, using a so-called branch target buffer storing the address of a branch instruction and the branch target address in combination, the flow of instructions is predicted in the instruction fetch stage (refer to J. K. F. Lee and A. J. Smith, "Branch Prediction Strategies and Branch Target Buffer Design", IEEE COMPUTER, Vol. 17, No. 1, January 1984, pp. 6-22).
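The idea of a branch target buffer can be sketched as a lookup table keyed by the address of the branch instruction. This is a simplified illustration under stated assumptions, not the design of the cited reference: the class and method names are hypothetical, the table is unbounded, and no tag matching, replacement policy, or taken/not-taken history is modeled.

```python
class BranchTargetBuffer:
    """Hypothetical, unbounded model of a branch target buffer."""

    def __init__(self):
        # branch instruction address -> most recently observed target address
        self._targets = {}

    def predict(self, pc):
        """Return the predicted target for the instruction at pc, or None."""
        return self._targets.get(pc)

    def update(self, pc, target):
        """Record the actual target once the branch has been executed."""
        self._targets[pc] = target


if __name__ == "__main__":
    btb = BranchTargetBuffer()
    print(btb.predict(0x100))        # nothing recorded yet
    btb.update(0x100, 0x200)         # branch at 0x100 jumped to 0x200
    print(hex(btb.predict(0x100)))   # the same target is predicted next time
```

The prediction is available as soon as the instruction address is known, which is why it can be used in the instruction fetch stage.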
As described above, the overhead of branch instruction execution is curtailed by predicting the flow of processing in the initial stage of the pipeline processing and passing the instruction predicted to be executed next through the pipeline (hereinafter referred to as pre-branch processing). However, predicting the processing flow of a return instruction from a subroutine has been difficult, because the return address from the subroutine depends upon the address of the corresponding subroutine call instruction.
In the conventional data processor, as described above, the return address from the subroutine depends upon the address of the corresponding subroutine call instruction in executing the return instruction from the subroutine, and therefore no effective means for predicting the flow of processing has been available.
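The difficulty can be made concrete with a small simulation. The sketch below assumes a predictor that remembers only the last target seen for a given return instruction, in the manner of a branch target buffer; the function name, the return instruction address `ret_pc`, and the fixed call instruction length `call_length` are all hypothetical values chosen for illustration.

```python
def count_correct_return_predictions(call_sites, ret_pc=0x500, call_length=4):
    """Count correct last-target predictions for one return instruction.

    call_sites lists the addresses of the call instructions, in the order
    in which the same subroutine (ending at ret_pc) is called.
    """
    last_target = {}  # return instruction address -> last observed target
    correct = 0
    for call_pc in call_sites:
        # The return address is the instruction following the call.
        actual_return = call_pc + call_length
        if last_target.get(ret_pc) == actual_return:
            correct += 1
        last_target[ret_pc] = actual_return
    return correct


if __name__ == "__main__":
    # Same call site every time: only the first prediction misses.
    print(count_correct_return_predictions([0x100, 0x100, 0x100]))
    # Alternating call sites: every prediction misses.
    print(count_correct_return_predictions([0x100, 0x200, 0x100, 0x200]))
```

With a single call site the predictor is right two times out of three, but with alternating call sites it is never right, since the target of the same return instruction changes with each caller.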
The present invention has been achieved to solve the problem described above, and the principal object thereof is to provide a data processor capable of exerting a high processing ability by making it possible to perform the pre-branch processing to the return address in the initial stage of the pipeline processing for the subroutine return instruction as well.
The data processor in accordance with the present invention comprises a stack memory dedicated to a program counter (PC) (hereinafter referred to as PC stack) for storing only the return address of the subroutine return instruction.
By such a configuration, in the data processor of the present invention, the return address from the subroutine is pushed to the PC stack in executing the subroutine call instruction in the execution stage of the pipeline processing mechanism, and the pre-branch processing is performed to the address which is popped from the PC stack in decoding the subroutine return instruction in the instruction decoding stage.
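The push and pop behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the claimed implementation: the class name, the depth of eight entries, and the policy of discarding the oldest entry on overflow are hypothetical, and the timing between the execution stage (push) and the instruction decoding stage (pop) is not modeled.

```python
class PCStack:
    """Hypothetical model of a stack holding only subroutine return addresses."""

    def __init__(self, depth=8):
        self._depth = depth      # assumed capacity; not given in the text
        self._entries = []

    def push(self, return_address):
        """Invoked when a subroutine call instruction is executed (E stage)."""
        if len(self._entries) == self._depth:
            self._entries.pop(0)  # assumed overflow policy: drop the oldest
        self._entries.append(return_address)

    def pop(self):
        """Invoked when a subroutine return instruction is decoded (D stage).

        Returns the address to pre-branch to, or None if the stack is empty.
        """
        return self._entries.pop() if self._entries else None


if __name__ == "__main__":
    stack = PCStack()
    stack.push(0x104)         # outer call at 0x100 (4-byte call assumed)
    stack.push(0x204)         # nested call at 0x200
    print(hex(stack.pop()))   # inner return pre-branches to 0x204
    print(hex(stack.pop()))   # outer return pre-branches to 0x104
```

Because each return pops the address pushed by its own call, the predicted address follows the caller even when the same subroutine is called from different sites.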
The above and further objects and features of the invention will be more fully apparent from the following detailed description with the accompanying drawings.