1. Technical Field
Embodiments of the present invention generally relate to computers. More particularly, embodiments relate to branch prediction in computer processing architectures.
2. Discussion
In the computer industry, the demand for higher processing speeds is well documented. While such a trend is highly desirable to consumers, it presents a number of challenges to industry participants. A particular area of concern is branch prediction.
Modern computer processors are organized into one or more “pipelines,” where a pipeline is a sequence of functional units (or “stages”) that processes instructions in several steps. Each functional unit takes inputs and produces outputs, which are stored in an output buffer associated with the stage. One stage's output buffer is typically the next stage's input buffer. Such an arrangement allows all of the stages to work in parallel and therefore yields greater throughput than if each instruction had to pass through the entire pipeline before the next instruction could enter the pipeline. Unfortunately, it is not always apparent which instruction should be fed into the pipeline next, because many instructions have conditional branches.
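The throughput benefit described above can be illustrated with a short calculation. The following is a minimal sketch that assumes an idealized pipeline in which every stage takes exactly one cycle and no stalls occur; the function names are hypothetical and chosen only for illustration.

```python
def cycles_unpipelined(num_instructions, num_stages):
    """Each instruction passes through every stage before the next one enters."""
    return num_instructions * num_stages


def cycles_pipelined(num_instructions, num_stages):
    """The first instruction needs num_stages cycles to drain the pipeline;
    each later instruction completes one cycle after its predecessor."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


# Illustrative values only: 100 instructions on a 5-stage pipeline.
serial = cycles_unpipelined(100, 5)
overlapped = cycles_pipelined(100, 5)
print(serial, overlapped)
```

Under these assumptions the overlapped case approaches one completed instruction per cycle, which is why an empty or incorrectly filled pipeline after a conditional branch is costly.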
When a computer processor encounters instructions that have conditional branches, branch prediction is used to eliminate the need to wait for the outcome of the conditional branch instruction and therefore keep the processor pipeline as full as possible. Thus, a branch prediction architecture predicts whether the branch will be taken and retrieves the predicted instruction rather than waiting for the current instruction to be executed. Indeed, it has been determined that branch prediction is one of the most important contributors to processor performance.
One approach to branch prediction involves a bimodal predictor, which generates a local prediction for a branch instruction, and a global predictor, which generates a global prediction for the branch instruction. The bimodal predictor predicts whether the branch will be taken based on the instruction address of the branch instruction and the state of an n-bit counter assigned to the branch instruction. The global predictor predicts whether the branch will be taken according to an index or “stew,” which is based on the instruction address and information from a global branch history. The global predictor is used because branch instructions sometimes tend to correlate with other nearby instructions. The length of the global branch history determines how much correlation can be captured by the global predictor.
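The two predictors described above can be sketched in software as follows. This is an illustrative model only, not the described hardware: the table sizes, the two-bit counter width, the class names, and in particular the XOR of address and history used to form the index are assumptions, since the text does not specify how the “stew” is computed.

```python
class CounterTable:
    """Table of n-bit saturating counters; the upper half of the range means 'taken'."""

    def __init__(self, index_bits, counter_bits):
        self.index_mask = (1 << index_bits) - 1
        self.max_count = (1 << counter_bits) - 1
        self.threshold = 1 << (counter_bits - 1)
        # Every counter starts just below the threshold (weakly not taken).
        self.counters = [self.threshold - 1] * (1 << index_bits)

    def predict(self, index):
        return self.counters[index & self.index_mask] >= self.threshold

    def update(self, index, taken):
        i = index & self.index_mask
        step = 1 if taken else -1
        self.counters[i] = max(0, min(self.max_count, self.counters[i] + step))


class BimodalPredictor:
    """Indexes the counter table with the branch instruction address alone."""

    def __init__(self, index_bits=10, counter_bits=2):
        self.table = CounterTable(index_bits, counter_bits)

    def predict(self, address):
        return self.table.predict(address)

    def update(self, address, taken):
        self.table.update(address, taken)


class GlobalPredictor:
    """Indexes the counter table with the address combined with global history."""

    def __init__(self, index_bits=10, history_bits=8, counter_bits=2):
        self.table = CounterTable(index_bits, counter_bits)
        self.history_mask = (1 << history_bits) - 1
        self.history = 0  # most recent outcome in the least significant bit

    def _index(self, address):
        # Assumption: a simple XOR hash stands in for the unspecified "stew".
        return address ^ self.history

    def predict(self, address):
        return self.table.predict(self._index(address))

    def update(self, address, taken):
        # Train the entry that produced the prediction, then shift in the outcome.
        self.table.update(self._index(address), taken)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


def run(predictor, address, outcomes):
    """Return, for each outcome, whether the predictor guessed it correctly."""
    hits = []
    for taken in outcomes:
        hits.append(predictor.predict(address) == taken)
        predictor.update(address, taken)
    return hits


# A single branch (hypothetical address 0x40) that alternates taken / not taken.
outcomes = [True, False] * 100
bimodal_hits = run(BimodalPredictor(), 0x40, outcomes)
global_hits = run(GlobalPredictor(), 0x40, outcomes)
print(sum(bimodal_hits[-100:]), sum(global_hits[-100:]))
```

In this sketch the alternating branch defeats the per-address counter, which keeps oscillating around its threshold, while the history-indexed table learns the pattern once the history register has filled.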
While the bimodal/global (BG) approach provides substantial improvement over strict bimodal prediction, there remains considerable room for improvement. For example, the extent to which global prediction is helpful in accounting for correlation depends upon the type of application being run. Certain applications have code in which branch instructions correlate to instructions that are in relatively close proximity, whereas other applications have code in which branch instructions correlate to instructions that are farther away. As a result, certain types of code benefit from a shorter global branch history, while other types of code benefit from a longer global branch history. There is therefore a need for a branch prediction approach that provides for more flexible global branch prediction.
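The dependence on history length can be made concrete with a small experiment. The sketch below is a simplified model under stated assumptions: a single branch, two-bit saturating counters indexed only by the most recent outcomes, and a hypothetical loop branch that is taken seven times and then not taken. The function name and the chosen history lengths are illustrative.

```python
def steady_state_accuracy(history_bits, pattern, repeats=50):
    """Predict a repeating outcome pattern with 2-bit counters indexed by the
    last `history_bits` outcomes; score only the final repetition."""
    history_mask = (1 << history_bits) - 1
    counters = {}  # history value -> saturating counter in the range 0..3
    history = 0
    hits = 0
    for rep in range(repeats):
        for taken in pattern:
            counter = counters.get(history, 1)  # default: weakly not taken
            predicted = counter >= 2
            if rep == repeats - 1 and predicted == taken:
                hits += 1
            counters[history] = min(counter + 1, 3) if taken else max(counter - 1, 0)
            history = ((history << 1) | int(taken)) & history_mask
    return hits / len(pattern)


# Hypothetical loop branch: taken seven times, then not taken once.
loop_branch = [True] * 7 + [False]
short_acc = steady_state_accuracy(4, loop_branch)
long_acc = steady_state_accuracy(8, loop_branch)
print(short_acc, long_acc)
```

With four bits of history, the all-taken history precedes both taken and not-taken outcomes, so the loop exit is mispredicted once per iteration of the pattern; with eight bits, every position in the pattern has a distinct history. A longer history is not free, however, since it takes longer to warm up and spreads outcomes across more table entries, which is consistent with the observation that different code favors different history lengths.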