1. Field of the Invention
This invention relates to the field of processors and, more particularly, to address generation for branch instructions within processors.
2. Description of the Related Art
Branch instructions present numerous challenges to processor designers. The existence of branch instructions in code, and the mechanisms that the processor includes to handle the branch instructions with high performance, are frequently major factors in the overall performance that a user actually experiences when using a system including the processor.
In addition to the performance challenges, branch instructions present other design challenges. For example, some branch instructions may be provided for use as procedure (or subroutine) calls. Such instructions provide a target address (the beginning of the desired subroutine), usually generated as the sum of one or more operands of the branch instruction, and also cause the address to which the procedure should return (the “return address”) to be stored. A branch instruction at the end of the procedure may read the stored return address and branch to it. Accordingly, execution of a branch instruction that stores the return address requires two address calculations (one for the target address and one for the return address). Additionally, the return address may be generated at or near the time the procedure call instruction is fetched, for storage in a return stack prediction structure.
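The two address calculations performed for a procedure call can be sketched in Python as follows. This is an illustrative model only: the 32-bit address width, the fixed 4-byte instruction size, the function names `execute_call` and `execute_return`, and the use of a Python list to stand in for the location where the return address is stored are all assumptions made for the example and are not drawn from any particular instruction set.

```python
ADDR_MASK = 0xFFFFFFFF  # assumption: 32-bit addresses
INSTR_SIZE = 4          # assumption: fixed 4-byte instructions


def execute_call(pc, base, offset, return_stack):
    """Model a procedure call, which needs two address calculations."""
    # Calculation 1: the target address, formed as the sum of the operands.
    target = (base + offset) & ADDR_MASK
    # Calculation 2: the return address, i.e. the instruction after the call.
    return_address = (pc + INSTR_SIZE) & ADDR_MASK
    # Store the return address (a list stands in for the real storage).
    return_stack.append(return_address)
    return target


def execute_return(return_stack):
    """Model the branch at the end of the procedure: read the stored address."""
    return return_stack.pop()


stack = []
next_pc = execute_call(pc=0x1000, base=0x2000, offset=0x40, return_stack=stack)
print(hex(next_pc))                # 0x2040
print(hex(execute_return(stack)))  # 0x1004
```

In this sketch the two additions are independent of one another, which is why they could in principle be performed in parallel if two adders were available.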
Another example of a challenging branch instruction is the conditional branch instruction. A conditional branch instruction depends on a condition code (generated during execution of a preceding instruction) and either branches to the target address specified by one or more operands of the branch instruction (“taken”) or continues execution with the sequential instructions (“not taken”). Such instructions may also require two address calculations (one for the target address and one for the sequential address).
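The selection between the target address and the sequential address can likewise be sketched in Python. As before, this is a simplified model under stated assumptions: the 32-bit address width, the 4-byte instruction size, the function name `next_fetch_address`, and the reduction of the condition code to a single boolean are hypothetical choices made for illustration.

```python
ADDR_MASK = 0xFFFFFFFF  # assumption: 32-bit addresses
INSTR_SIZE = 4          # assumption: fixed 4-byte instructions


def next_fetch_address(pc, base, offset, condition_met):
    """Model a conditional branch: compute both candidate addresses."""
    # Calculation 1: the target address, used if the branch is taken.
    target = (base + offset) & ADDR_MASK
    # Calculation 2: the sequential address, used if the branch is not taken.
    sequential = (pc + INSTR_SIZE) & ADDR_MASK
    # The condition code (reduced here to a boolean) selects between them.
    return target if condition_met else sequential


print(hex(next_fetch_address(0x1000, 0x3000, 0x10, True)))   # 0x3010
print(hex(next_fetch_address(0x1000, 0x3000, 0x10, False)))  # 0x1004
```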
Performing two address calculations during execution of a branch instruction may be inefficient, either in terms of the hardware employed or, if sufficient hardware is not supplied to perform the address calculations in parallel, in terms of the number of execution cycles. Furthermore, processors that employ branch prediction to predict the behavior of branch instructions (and thus to provide for speculative fetching and/or execution of the instructions subsequent to the branch instructions) may be subject to similar inefficiencies.