1. Field of the Invention
The present invention relates to the field of microprocessors, and in particular, to systems and methods for predicting return addresses for call instructions.
2. Background Art
Advanced processors employ pipelining techniques to execute instructions at very high speeds. A pipelined processor is organized as a series of cascaded stages of hardware. Instruction processing is divided into a sequence of operations, and each operation is performed by hardware in a corresponding pipeline stage ("pipe stage"). Independent operations from several instructions may be processed simultaneously by different pipe stages, increasing the instruction throughput of the pipeline. By including multiple execution resources in each pipe stage, the pipelined processor can execute multiple instructions per clock cycle. To make full use of this instruction execution capability, the execution resources of the processor must be provided with sufficient instructions from the correct execution path.
Branch instructions pose major challenges to keeping the processor pipeline filled with instructions from the correct execution path. When a branch instruction is executed and the branch condition is met, control flow of the processor jumps to a new code sequence, and instructions from the new code sequence are transferred to the pipeline. Branch execution typically occurs in the back end of the pipeline, while instructions are fetched at the front end of the pipeline. If changes in the control flow are not anticipated correctly, several pipe stages' worth of instructions may be fetched from the wrong execution path by the time the branch is resolved and the error detected. When this occurs, the instructions must be flushed from the pipeline, leaving idle pipe stages ("bubbles") until the processor refills the pipeline with instructions from the correct execution path.
To reduce the number of pipeline bubbles, processors incorporate branch prediction modules at the front ends of their pipelines. When the branch prediction module detects a branch instruction, it forecasts whether the branch will be taken or not taken when it is executed. If the branch is predicted taken, the branch prediction module indicates a target address to which control of the processor is predicted to jump. A fetch module, which is also located at the front end of the pipeline, fetches instructions beginning at the indicated target address.
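The interaction between the branch prediction module and the fetch module may be illustrated by the following sketch. It is a simplified software model only, assuming a fixed 4-byte instruction size and a lookup table keyed by branch address; the names `next_fetch_address` and `prediction_table` are hypothetical and do not describe any particular processor.

```python
INSTRUCTION_SIZE = 4  # assumed fixed instruction width, in bytes


def next_fetch_address(pc, prediction_table):
    """Return the address the fetch module should fetch from after `pc`.

    `prediction_table` is a hypothetical mapping from a branch address to
    a (predicted_taken, target_address) pair.
    """
    entry = prediction_table.get(pc)
    if entry is not None:
        predicted_taken, target_address = entry
        if predicted_taken:
            # Predicted taken: redirect fetch to the indicated target.
            return target_address
    # Not a known branch, or predicted not taken: fall through.
    return pc + INSTRUCTION_SIZE


table = {0x100: (True, 0x400), 0x104: (False, 0x800)}
print(hex(next_fetch_address(0x100, table)))
print(hex(next_fetch_address(0x104, table)))
```

In this model a wrong entry in the table sends the fetch module down the wrong path, and the error is not observed until the branch is resolved later in the pipeline.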
Call and return instructions are branch instructions that are used to jump to and return from blocks of instructions ("subroutines"). When the call instruction is executed, control of the processor jumps to a target address at which the subroutine begins. In addition, a pointer to the instruction that follows the call instruction is pushed onto a return stack buffer (RSB). The subroutine is terminated by a return instruction, which causes processor control to jump back to the return address indicated by the RSB.
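The push and pop behavior of the RSB may be illustrated by the following sketch. It is a minimal software model, assuming a fixed-depth circular buffer and a 4-byte instruction size; the class `ReturnStackBuffer`, the helpers `on_call` and `on_return`, the depth, and the addresses are all hypothetical.

```python
INSTRUCTION_SIZE = 4  # assumed fixed instruction width, in bytes


class ReturnStackBuffer:
    """Toy model of a return stack buffer (RSB) as a circular buffer."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = [None] * depth
        self.tos = 0  # top-of-stack pointer: index of the next free slot

    def push(self, return_address):
        self.entries[self.tos % self.depth] = return_address
        self.tos += 1

    def pop(self):
        self.tos -= 1
        return self.entries[self.tos % self.depth]


def on_call(rsb, call_address, target_address):
    """Push the address of the instruction after the call; jump to target."""
    rsb.push(call_address + INSTRUCTION_SIZE)
    return target_address


def on_return(rsb):
    """Jump back to the return address indicated by the RSB."""
    return rsb.pop()


rsb = ReturnStackBuffer()
on_call(rsb, 0x1000, 0x2000)   # outer call
on_call(rsb, 0x2010, 0x3000)   # nested call inside the subroutine
print(hex(on_return(rsb)))     # returns to the nested call's successor
print(hex(on_return(rsb)))     # returns to the outer call's successor
```

Nested calls return in last-in, first-out order, which is why a stack structure is used.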
In conventional instruction sets, call and return instructions are executed unconditionally. For example, when a call instruction is detected at the front end of the processor pipeline, its return address is automatically pushed onto the return stack buffer. The call instruction is executed at the back end of the processor pipeline unless it is flushed from the pipeline prior to execution. A call instruction may be flushed, for example, when it follows the target address of a branch that is predicted taken but resolved not taken, i.e., the branch is mispredicted. In this case, the call instruction is flushed along with any other instructions on the mispredicted execution path. However, its return address is already on the RSB, since the misprediction is not detected until the back end of the pipeline.
In order to keep other return addresses synchronized with their corresponding return instructions, the RSB must be restored to its state prior to receipt of the return address for the flushed call instruction. For this purpose, a pointer to the top of the stack (TOS) prior to modification by a call instruction is saved when the RSB is updated. This TOS pointer can be used to restore the stack in the event the call instruction is flushed from the pipeline before it is executed. A similar procedure applies for return instructions.
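The save-and-restore procedure described above may be illustrated by the following sketch. It is a simplified model, assuming the RSB is a circular buffer whose only recoverable state is the TOS pointer; the class `CheckpointedRSB` and its method names are hypothetical.

```python
class CheckpointedRSB:
    """Toy RSB that reports the TOS pointer saved before each update."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = [None] * depth
        self.tos = 0  # index of the next free slot

    def push(self, return_address):
        """Push for a call; return the TOS pointer as it was before the push."""
        saved_tos = self.tos
        self.entries[self.tos % self.depth] = return_address
        self.tos += 1
        return saved_tos

    def pop(self):
        """Pop for a return; return (return_address, TOS before the pop)."""
        saved_tos = self.tos
        self.tos -= 1
        return self.entries[self.tos % self.depth], saved_tos

    def restore(self, saved_tos):
        """Undo an update made for an instruction that was flushed."""
        self.tos = saved_tos


rsb = CheckpointedRSB()
rsb.push(0x1004)               # call on the correct path
saved = rsb.push(0x5004)       # call fetched on a mispredicted path
rsb.restore(saved)             # misprediction detected: call is flushed
address, _ = rsb.pop()         # later return on the correct path
print(hex(address))
```

After the restore, the later return instruction again sees the return address of the call on the correct path. Restoring only the pointer does not recover an entry that a wrong-path push overwrote after a wrong-path pop; this sketch does not handle that case.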
This approach is inadequate if call and return instructions are allowed to execute conditionally. Conditional execution means that call and return instructions are not automatically taken, and mispredicted call and return instructions multiply the number and types of updates required for the RSB. The present invention addresses these and other problems raised by conditional execution of call and return instructions.