The present invention relates to executions in a processor and more specifically to controlling the accuracy and stream threshold during classification of branches to increase the efficiency of a processor executing branch instructions.
Modern computer systems typically contain several integrated circuits (ICs), including a processor which may be used to process information in the computer system. The data processed by a processor may include computer instructions which are executed by the processor as well as data which is manipulated by the processor using the computer instructions. The computer instructions and data are typically stored in a main memory in the computer system.
Processors typically process instructions by executing the instruction in a series of small steps. In some cases, to increase the number of instructions being processed by the processor (and therefore increase the speed of the processor), the processor may be pipelined. Pipelining refers to providing separate stages in a processor where each stage performs one or more of the small steps necessary to execute an instruction, i.e., several instructions are overlapped in execution. In some cases, the pipeline (in addition to other circuitry) may be placed in a portion of the processor referred to as the processor core. Some processors may have multiple processor cores, and in some cases, each processor core may have multiple pipelines. Where a processor core has multiple pipelines, groups of instructions (referred to as issue groups) may be issued to the multiple pipelines in parallel and executed by each of the pipelines in parallel.
A branch instruction (or “branch”) can be either unconditional, meaning that the branch is taken every time the instruction is encountered in the program, or conditional, meaning that the branch is either taken or not taken, depending upon a condition. Processors typically provide conditional branch instructions which allow a computer program to branch from one instruction to a target instruction (thereby skipping intermediate instructions, if any) if a condition is satisfied. If the condition is not satisfied, the next instruction after the branch instruction may be executed without branching to the target instruction. Most often, the instructions to be executed following a conditional branch are not known with certainty until the condition upon which the branch depends has been resolved. These types of branches can significantly reduce the performance of a pipelined processor since they may interrupt the steady supply of instructions to the execution hardware. Branch predictors attempt to predict the outcome of conditional branch instructions in a program before the branch instruction is executed. If a branch is mispredicted, all of the speculative work beyond the point in the program where the branch is encountered must be discarded. Therefore, a highly accurate branch prediction mechanism is beneficial to a high-performance, pipelined processor. For example, when a conditional branch instruction is encountered, the processor may predict which instruction will be executed after the outcome of the branch condition is known. Then, instead of stalling the pipeline when the conditional branch instruction is issued, the processor may continue issuing instructions beginning with the predicted next instruction.
Many early implementations of branch predictors used simple history bits and counter-based schemes that provide branch prediction accuracy of about 85-90%. Attempts to improve upon the accuracy of simple 2-bit counter schemes have included predictors that relate the sub-history information of a branch to the most recently executed branches via a shift register. Among the methods used to predict branches are local branch prediction and global branch prediction. Local branch prediction involves making a prediction based on the behavior of a particular branch the past few times it was executed. Local branch prediction is effective for branches exhibiting repetitive patterns. On the other hand, global branch prediction involves making a branch prediction based on the history of the last few branches to have been executed. Global branch prediction is useful when the behavior of a branch is related to the behavior of the prior executed branches.
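The difference between local and global prediction described above can be illustrated with a minimal software sketch. It is only an illustration under stated assumptions, not a description of any particular hardware: the class names, the dictionary keyed by branch address, the 4-bit history length, and the initial counter state are all hypothetical choices made for this example.

```python
class TwoBitCounter:
    """Saturating counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state=1):
        # Assumed initial state: weakly not taken.
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


class LocalPredictor:
    """Local prediction: one counter per branch address (pc)."""

    def __init__(self):
        self.counters = {}

    def predict(self, pc):
        return self.counters.setdefault(pc, TwoBitCounter()).predict()

    def update(self, pc, taken):
        self.counters.setdefault(pc, TwoBitCounter()).update(taken)


class GlobalPredictor:
    """Global prediction: counters indexed by a shift register that
    holds the outcomes of the last `history_bits` executed branches."""

    def __init__(self, history_bits=4):
        self.mask = (1 << history_bits) - 1
        self.history = 0
        self.counters = [TwoBitCounter() for _ in range(1 << history_bits)]

    def predict(self, pc=None):
        # The branch address is ignored; only recent history is used.
        return self.counters[self.history].predict()

    def update(self, pc, taken):
        self.counters[self.history].update(taken)
        # Shift the newest outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor, trace):
    """Fraction of correct predictions over a trace of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)
```

In this sketch, a single branch whose outcome strictly alternates between taken and not taken defeats the per-branch 2-bit counter, while the history-indexed predictor learns the pattern after a short warm-up. A loop-closing branch that is taken nine times and then falls through once is handled reasonably well by the per-branch counter, which mispredicts mainly on the loop exit.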
While history-based dynamic branch predictors have reached high prediction accuracy, certain branch types continue to mispredict. These are branches that may depend on a longer history length, depend on loaded data values, or exhibit random behavior (e.g., multi-target indirect branches and data-dependent direct and indirect branches). These are hard-to-predict branches since their outcomes do not always exhibit repeatable patterns, and trying to predict the outcome of such branches using typical branch predictors results in bottlenecks and low performance.
Classifying branches to identify such hard-to-predict branches (or other types of branches) and selecting a branch predictor based on the type of branch improves accuracy and performance. Existing methods for classifying branches into, for example, hard-to-predict and simple branches analyze the actual behavior and predicted behavior of a branch and compare the accuracy of branch prediction with a pre-defined threshold. However, such methods of comparing the accuracy of branch prediction to a pre-defined and fixed threshold do not take into account the workload of the processor, the applications running on the processor, the misprediction rate corresponding to such applications, and/or other micro-architectural aspects of the processor, and hence may introduce inefficiency.
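The fixed-threshold classification described above can be sketched as follows. This is a hypothetical illustration only: the record format, the function name, the class labels, and the 0.90 threshold value are assumptions introduced for the example and are not taken from any specific existing method.

```python
from collections import defaultdict

# Hypothetical pre-defined accuracy threshold (assumed value).
FIXED_THRESHOLD = 0.90


def classify_branches(records, threshold=FIXED_THRESHOLD):
    """Classify each branch by comparing its prediction accuracy
    with a fixed threshold.

    records: iterable of (pc, predicted_taken, actual_taken) tuples.
    Returns a dict mapping pc to "hard-to-predict" or "simple".
    """
    total = defaultdict(int)
    correct = defaultdict(int)
    for pc, predicted, actual in records:
        total[pc] += 1
        correct[pc] += int(predicted == actual)

    classes = {}
    for pc in total:
        branch_accuracy = correct[pc] / total[pc]
        if branch_accuracy < threshold:
            classes[pc] = "hard-to-predict"
        else:
            classes[pc] = "simple"
    return classes
```

Because the threshold is a constant, a branch predicted correctly 92 times out of 100 is labeled simple at a threshold of 0.90 and hard-to-predict at 0.95, regardless of which application is running or what its overall misprediction rate is. That insensitivity is the limitation noted above.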
A method for identifying hard-to-predict branches which adapts to the applications' branch behavior and dynamically tunes the threshold will further improve accuracy of branch classification and processor performance.