The present invention generally relates to branch prediction in microprocessors, and relates in particular to systems and methods for predicting branch misprediction in microprocessors.
For modern processors, high performance generally requires high branch prediction accuracy. Microprocessors typically use complex, pipelined processing paths that permit consecutive instructions to be processed at the same time. Branch prediction generally involves determining an expected flow of instructions when a microprocessor encounters a branch instruction. When the expected flow of instructions turns out to be incorrect (called a branch misprediction), the processing time is, at best, not improved over serial processing and, at worst, significantly degraded by the branch misprediction.
Since processors need branch predictors with high branch prediction accuracy to achieve good performance, computer designers have proposed branch predictors that are increasingly complex and/or larger. Unfortunately, the costs of these more aggressive branch predictors include higher prediction latencies and misprediction penalties, which offset the higher prediction accuracy, in addition to larger chip area and increased power consumption.
Typically, increased branch prediction accuracy therefore comes at the cost of increased complexity (e.g., more complex algorithms), chip area (e.g., larger tables), and power consumption. Unfortunately, due to longer training times, higher prediction latencies, and a higher misprediction penalty, a more complex and/or larger branch predictor may actually result in a net performance loss despite a higher prediction accuracy. Furthermore, due to increased access latencies, these branch predictors cannot be easily scaled to further improve the branch prediction accuracy.
It has been shown, for example in “Reconsidering Complex Branch Predictors,” International Symposium on High-Performance Computer Architecture, by D. Jiménez (2003), that using large and complex branch predictors is ultimately self-defeating if the processor does not hide the branch prediction latency. In other words, assuming a more complex and/or larger branch predictor has a higher branch prediction latency, if that latency is on the critical path of the processor, then the performance of the processor may actually decrease, even if the branch prediction accuracy is higher. To address this problem, a 2-level pattern history table (PHT), similar in concept to a 2-level inclusive cache hierarchy, has been proposed. In this approach, the branch predictor prefetches entries from the larger, second-level PHT into the first-level PHT based on the latest global history. It is sometimes desirable, therefore, to try to predict branch misprediction.
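The 2-level PHT approach may be illustrated with the following conceptual sketch (not the cited design itself). The class name, table sizes, eviction policy, and history-only indexing are illustrative assumptions: a small, fast first-level table caches entries of a larger second-level table, and entries for the anticipated global history are prefetched into the first level ahead of use so the second-level access latency is hidden.

```python
# Conceptual sketch of a 2-level pattern history table (PHT). Sizes,
# eviction policy, and indexing are illustrative assumptions.

class TwoLevelPHT:
    def __init__(self, l1_size=256, l2_size=16384):
        self.l2 = [1] * l2_size   # 2-bit counters, initialized weakly not-taken
        self.l1 = {}              # small first-level subset of the L2 entries
        self.l1_size = l1_size
        self.l2_size = l2_size

    def prefetch(self, global_history):
        # Bring the entry for the anticipated global history into the fast
        # first level before the prediction is needed.
        idx = global_history % self.l2_size
        if idx not in self.l1:
            if len(self.l1) >= self.l1_size:
                self.l1.pop(next(iter(self.l1)))  # simple FIFO-style eviction
            self.l1[idx] = self.l2[idx]

    def predict(self, global_history):
        idx = global_history % self.l2_size
        counter = self.l1.get(idx, self.l2[idx])  # fast path when L1 hits
        return counter >= 2  # counter values 2 and 3 predict taken

    def update(self, global_history, taken):
        # Keep both levels consistent (inclusive hierarchy).
        idx = global_history % self.l2_size
        counter = self.l1.get(idx, self.l2[idx])
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.l2[idx] = counter
        if idx in self.l1:
            self.l1[idx] = counter
```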
Confidence estimation for branch prediction is a conventional prediction approach that involves assigning confidence values to branches. See “Assigning Confidence to Conditional Branch Predictions,” International Symposium on Microarchitecture, by E. Jacobsen, E. Rotenberg, and J. Smith (1996), which proposed adding confidence methods to branch predictors as a means of allocating and optimizing the processor's resources. The confidence that is assigned to a branch is based on the recent prediction history for that specific branch and the recent global history. The recent prediction history is fed into a reduction function that assigns a confidence value based on the number of correct branch predictions.
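A reduction function of this kind may be sketched, for instance, as a table of resetting counters in the spirit of the cited approach, where each counter tracks consecutive correct predictions and a misprediction resets it. The class name, table size, threshold, and gshare-style indexing below are illustrative assumptions, not the cited design.

```python
# Illustrative sketch of a resetting-counter confidence estimator:
# consecutive correct predictions raise confidence; a misprediction
# resets it to zero. All parameters are illustrative assumptions.

class ResettingCounterEstimator:
    def __init__(self, num_entries=1024, threshold=8, max_count=15):
        self.table = [0] * num_entries
        self.num_entries = num_entries
        self.threshold = threshold
        self.max_count = max_count

    def _index(self, pc, global_history):
        # Index by branch address XOR recent global history (gshare-style).
        return (pc ^ global_history) % self.num_entries

    def is_confident(self, pc, global_history):
        return self.table[self._index(pc, global_history)] >= self.threshold

    def update(self, pc, global_history, prediction_correct):
        i = self._index(pc, global_history)
        if prediction_correct:
            self.table[i] = min(self.table[i] + 1, self.max_count)  # saturate
        else:
            self.table[i] = 0  # reset on a misprediction
```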
Accuracy, sensitivity, and specificity of four different confidence estimators were evaluated, for example, in “Confidence Estimation for Speculation Control,” International Symposium on Computer Architecture, by D. Grunwald, A. Klauser, S. Manne, and A. Pleszkun (1998). Their results showed that the performance of the confidence estimator depends on the branch predictor and confidence estimator having a similar indexing scheme. While they found that the approach of E. Jacobsen, E. Rotenberg, and J. Smith above performed better than the other three, the other three require very little hardware. In “Branch Prediction using Selective Branch Inversion,” International Conference on Parallel Architectures and Compilation Techniques, by S. Manne, A. Klauser, and D. Grunwald (1999), the use of up/down counters as confidence estimators was proposed, in which the prediction for low-confidence branches is inverted. This approach is similar to that of E. Jacobsen, E. Rotenberg, and J. Smith above, with the key difference that incorrect branch predictions decrement, rather than reset, the counter.
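The selective-inversion idea may be sketched as follows, assuming a per-entry saturating up/down counter and an inversion of any prediction whose counter falls below a threshold. The class name, sizes, and threshold are illustrative assumptions, not the cited design.

```python
# Illustrative sketch of selective branch inversion with up/down counters:
# correct predictions increment, incorrect predictions decrement (rather
# than reset), and low-confidence predictions are inverted.

class UpDownConfidence:
    def __init__(self, num_entries=1024, max_count=15, threshold=4):
        self.table = [0] * num_entries
        self.num_entries = num_entries
        self.max_count = max_count
        self.threshold = threshold

    def _index(self, pc, global_history):
        return (pc ^ global_history) % self.num_entries

    def update(self, pc, global_history, prediction_correct):
        i = self._index(pc, global_history)
        if prediction_correct:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)  # decrement, not reset

    def filter_prediction(self, pc, global_history, predicted_taken):
        # Invert the underlying prediction when confidence is low.
        if self.table[self._index(pc, global_history)] < self.threshold:
            return not predicted_taken
        return predicted_taken
```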
Finally, in “Dynamic Branch Prediction with Perceptrons,” International Symposium on High Performance Computer Architecture, by D. Jiménez and C. Lin (2001), combining several confidence estimators was proposed to produce a group confidence estimate. More specifically, the confidence estimates of three different confidence estimators are summed to form a group confidence estimate, which is then compared against a confidence threshold. If the group estimate exceeds the threshold, then the prediction is marked as confident. One key difference between the composite confidence estimator and previously proposed estimators is that the composite estimator is accurate even when the misprediction rate is low. It has been found, however, that using confidence estimators to change a branch prediction after it has already been made increases the prediction latency of current branch predictors and potentially increases the number of front-end pipeline stages and, consequently, the misprediction penalty, which could significantly reduce the performance gain from higher branch prediction accuracy.
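The composite scheme reduces to a simple sum-and-threshold step, sketched below. The function name, the value range of the individual estimates, and the example threshold are illustrative assumptions.

```python
# Illustrative sketch of a composite confidence estimate: the estimates
# from several confidence estimators are summed, and the branch is marked
# confident only if the group estimate exceeds a threshold.

def composite_confident(estimates, threshold):
    """estimates: confidence values from individual estimators for one branch."""
    group_estimate = sum(estimates)
    return group_estimate > threshold

# For example, with three estimators each reporting a value in [0, 15]:
high = composite_confident([12, 9, 14], threshold=24)   # confident
low = composite_confident([2, 3, 1], threshold=24)      # not confident
```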
There remains a need, therefore, for an improved system for branch misprediction prediction.