1. Technical Field
Embodiments described herein generally relate to processors. In particular, embodiments described herein generally relate to the handling of conditional branches in processors.
2. Background Information
Certain processors use pipelined execution to overlap execution phases. This may allow multiple instructions to be in different phases of execution at the same time, which may help to improve performance. The amount of parallelism achieved tends to increase as the pipeline depth increases. Over time, certain processors have incorporated ever deeper pipelines in an attempt to improve performance. These deep pipelines tend to be more effective when the instruction stream is known, so that the pipeline can be kept full and the execution of subsequent instructions does not need to wait on the results of previous instructions in the pipeline.
One challenge is that programs or code executed by processors typically contain conditional branches. Examples of such conditional branches include “jump if condition is or is not met” types of instructions, and other conditional control flow changing instructions known in the arts. A conditional branch may cause the flow of execution to branch conditionally in one of two possible directions. These two directions are often called a “taken path” and a “not taken path”. The “not taken path” commonly leads to the next sequential instruction in the code being executed, whereas the “taken path” commonly jumps, moves, or branches over one or more intervening instructions to a non-sequential target instruction. Whether a branch is taken or not taken generally depends upon the evaluation of the condition associated with the instruction (e.g., whether or not the condition is met).
To help improve performance, most modern processors have branch predictors to help predict the directions of the conditional branches before the actual directions of the conditional branches have been determined. Generally, the actual direction of a conditional branch is not known definitively until the condition has actually been evaluated at a subsequent stage of the pipeline. Without the branch predictors, the processor might have to wait for the evaluation of the conditions associated with the conditional branch instructions before it could fetch additional instructions into the pipeline. The branch predictors may help to avoid such wasted time by employing a branch prediction mechanism or logic to predict the most likely direction of each conditional branch (e.g., based on past history). The predicted branch direction may then be used to fetch additional instructions and execute them speculatively.
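By way of illustration only, prediction based on past history may be sketched in software as follows. The description above does not specify any particular prediction mechanism; the sketch assumes a two-bit saturating counter per branch address, which is one commonly described scheme. The names TwoBitPredictor and count_mispredictions, the initial counter state, and the example branch address are hypothetical and chosen only for this sketch.

```python
class TwoBitPredictor:
    """Illustrative two-bit saturating counter predictor (an assumed scheme).

    Each branch address has a counter in the range 0..3.
    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """

    def __init__(self, initial_state=1):
        # Assumption: unseen branches start as weakly "not taken".
        self.initial_state = initial_state
        self.counters = {}

    def predict(self, branch_address):
        """Return True if the branch is predicted taken."""
        return self.counters.get(branch_address, self.initial_state) >= 2

    def update(self, branch_address, taken):
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        state = self.counters.get(branch_address, self.initial_state)
        if taken:
            state = min(state + 1, 3)
        else:
            state = max(state - 1, 0)
        self.counters[branch_address] = state


def count_mispredictions(predictor, branch_address, outcomes):
    """Predict each outcome before it is known, then train on the actual outcome."""
    mispredictions = 0
    for taken in outcomes:
        if predictor.predict(branch_address) != taken:
            mispredictions += 1
        predictor.update(branch_address, taken)
    return mispredictions
```

For example, a loop-closing branch that is taken nine times and then falls through once could be modeled as count_mispredictions(TwoBitPredictor(), 0x400, [True] * 9 + [False]). Under the assumptions above, the predictor is wrong on the first iteration, before any history exists, and again on the final fall-through.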
Ultimately, the predicted branch direction will turn out to be either correct or incorrect. If the predicted branch direction turns out to be correct, then the results and/or state of the speculatively executed instructions may be utilized. In this case, the performance and speed of the processor will generally have been increased due to greater utilization of pipeline stages that would otherwise have been dormant, or at least underutilized, while waiting for the evaluation of the actual direction of the conditional branch. However, if instead the predicted branch direction turns out to be incorrect (e.g., was mispredicted by the branch predictor), then any results and/or state from the instructions speculatively executed beyond the conditional branch instruction will typically need to be discarded. Often, the pipeline will be flushed (discarding instructions currently in flight in the pipeline), and execution will be rewound back to the mispredicted conditional branch and restarted along the alternate branch direction, which is now known to be correct. This outcome is generally undesirable, since it tends to incur both a performance penalty and an energy penalty.
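The performance penalty described above may be approximated with a simple first-order model, sketched below. The model and every number used with it are illustrative assumptions rather than measurements of any particular processor: it assumes that each mispredicted branch costs a fixed number of flush cycles and ignores other stall sources. The function name effective_cpi and its parameters are hypothetical.

```python
def effective_cpi(base_cpi, branch_fraction, misprediction_rate, flush_penalty_cycles):
    """First-order estimate of cycles per instruction with branch mispredictions.

    base_cpi: cycles per instruction when every branch is predicted correctly.
    branch_fraction: fraction of executed instructions that are conditional branches.
    misprediction_rate: fraction of those branches that are mispredicted.
    flush_penalty_cycles: assumed cycles lost per misprediction (flush and refill).
    """
    penalty_per_instruction = branch_fraction * misprediction_rate * flush_penalty_cycles
    return base_cpi + penalty_per_instruction
```

For instance, with an assumed base of 1.0 cycle per instruction, 20% conditional branches, a 5% misprediction rate, and a 20-cycle flush penalty, the estimate is roughly 1.2 cycles per instruction. Doubling the assumed penalty to 40 cycles, as might be the case for a deeper pipeline, raises the estimate to roughly 1.4, which suggests why mispredictions tend to become more costly as pipeline depth increases.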