1. Field of the Invention
Embodiments of this invention relate generally to computers, and, more particularly, to a method and apparatus for efficiently performing branch prediction operations to conserve power and area.
2. Description of Related Art
Program instructions for some processors (e.g., central processing units, graphics processing units, etc.; also referred to as microprocessors) are typically stored in sequential, addressable locations within a memory. When these instructions are processed, they may be fetched from consecutive memory locations and stored in a cache commonly referred to as an instruction cache. The instructions may later be retrieved from the instruction cache and executed. Each time an instruction is fetched from memory, a next instruction pointer within the microprocessor may be updated so that it contains the address of the next instruction in the sequence. This address may commonly be referred to as the next sequential instruction pointer. Sequential instruction fetching, updating of the next instruction pointer, and execution of sequential instructions may continue linearly until an instruction commonly referred to as a branch instruction is encountered and taken.
A branch instruction is an instruction that causes subsequent instructions to be fetched from one of at least two addresses: a sequential address, which identifies an instruction stream beginning with the instructions that directly follow the branch instruction; or an address referred to as a “target address,” which identifies an instruction stream beginning at an arbitrary location in memory. A branch instruction referred to as an “unconditional branch instruction” always branches to the target address, while a branch instruction referred to as a “conditional branch instruction” may select either the sequential address or the target address based on the outcome of a prior instruction.
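The address selection described above can be illustrated with a minimal sketch. The function name, the string labels for the instruction kinds, and the fixed 4-byte instruction width are all assumptions made for illustration; they are not taken from any particular instruction set.

```python
# Hypothetical fixed instruction width in bytes (an assumption for this sketch).
INSTRUCTION_SIZE = 4


def next_fetch_address(pc, kind, target=None, condition=False):
    """Return the address to fetch after the instruction located at `pc`.

    kind      -- "none" (not a branch), "unconditional", or "conditional"
    target    -- the branch target address (ignored for non-branches)
    condition -- outcome of the prior instruction that a conditional
                 branch depends on (True means the branch is taken)
    """
    sequential = pc + INSTRUCTION_SIZE
    if kind == "unconditional":
        # An unconditional branch always goes to the target address.
        return target
    if kind == "conditional":
        # A conditional branch selects the target or the sequential address.
        return target if condition else sequential
    # Any other instruction simply falls through to the next address.
    return sequential
```

For example, under these assumptions a conditional branch at address 0x100 with target 0x200 yields 0x200 when its condition holds and 0x104 otherwise.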
To efficiently execute instructions, microprocessors may implement a mechanism commonly referred to as a branch prediction mechanism. A branch prediction mechanism determines a predicted direction (“taken” or “not taken”) for an encountered branch instruction, allowing subsequent instruction fetching to continue along the predicted instruction stream indicated by the branch prediction. For example, if the branch prediction mechanism predicts that the branch instruction will be “taken,” then the next instruction fetched is located at the target address. If the branch prediction mechanism predicts that the branch instruction will be “not taken,” then the next instruction fetched is the one sequential to the branch instruction.
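The text above does not specify how the predicted direction is determined. As one possible illustration, the following sketch uses a well-known scheme, a table of two-bit saturating counters; this choice, the class and function names, the table size, and the 4-byte instruction width are all assumptions and do not describe any particular microprocessor.

```python
class TwoBitPredictor:
    """Illustrative direction predictor using two-bit saturating counters.

    Counter values 0 and 1 predict "not taken"; values 2 and 3 predict "taken".
    """

    def __init__(self, entries=16):
        self.entries = entries
        # Start every counter at 1 ("weakly not taken"); an arbitrary choice.
        self.counters = [1] * entries

    def _index(self, pc):
        # Drop the low two bits (assumed 4-byte instructions), then wrap.
        return (pc >> 2) % self.entries

    def predict(self, pc):
        """Return True if the branch at `pc` is predicted taken."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train the counter with the actual outcome of the branch."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def predicted_fetch_address(predictor, pc, target, instruction_size=4):
    """Fetch from the target if predicted taken, else from the next address."""
    return target if predictor.predict(pc) else pc + instruction_size
```

In this sketch, fetching continues along whichever stream the predictor selects, and `update` is called once the actual branch outcome is known.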
If the predicted instruction stream is correct, then the number of instructions executed per clock cycle is advantageously increased. However, if the predicted instruction stream is incorrect (i.e., one or more branch instructions are predicted incorrectly), then the instructions from the incorrectly predicted instruction stream are discarded from the instruction processing pipeline and the other instruction stream is fetched. Therefore, the number of instructions executed per clock cycle is decreased.
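The effect of mispredictions on the number of instructions executed per clock cycle can be approximated with a deliberately simplified model. The function name, its parameters, and the assumption that every misprediction costs a fixed number of flush cycles are hypothetical; real pipelines behave in more complicated ways.

```python
def effective_ipc(instructions, issue_width, branch_fraction, accuracy, flush_penalty):
    """Estimate instructions per cycle under a simple misprediction model.

    instructions    -- total number of instructions executed
    issue_width     -- instructions completed per cycle with no mispredictions
    branch_fraction -- fraction of instructions that are branches (0.0 to 1.0)
    accuracy        -- fraction of branches predicted correctly (0.0 to 1.0)
    flush_penalty   -- assumed cycles lost discarding each mispredicted stream
    """
    base_cycles = instructions / issue_width
    mispredictions = instructions * branch_fraction * (1.0 - accuracy)
    total_cycles = base_cycles + mispredictions * flush_penalty
    return instructions / total_cycles


# Compare perfect prediction with 90% accuracy under the same assumed workload.
perfect = effective_ipc(1000, 2, 0.2, 1.0, 10)
imperfect = effective_ipc(1000, 2, 0.2, 0.9, 10)
```

Under these assumed numbers, `imperfect` is lower than `perfect`, which mirrors the decrease in throughput described above.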
There is an incentive to construct accurate, and presumably complex, branch prediction schemes. There is also an incentive to perform as much speculative execution as possible in order to avoid pipeline stalls and improve computer performance. However, as branch prediction schemes and speculative execution become more accurate and complex, the power and area consumed by implementing such schemes increase. While the performance enhancement offered by branch prediction and speculative execution is desirable in nearly any computer, the additional power and area consumption they entail is a drawback. For example, when running a portable computer on battery power, it may be more important to conserve power and area than to try to increase computational throughput.