1. Technical Field
The present specification relates in general to a method and system for data processing and, in particular, to a processor and method for speculatively executing a branch instruction. Still more particularly, the present specification relates to a processor and method for speculatively executing a branch instruction, wherein the processor includes a selectively configurable branch prediction unit.
2. Description of the Related Art
A state-of-the-art superscalar processor can comprise, for example, an instruction cache for storing instructions, an instruction buffer for temporarily storing instructions fetched from the instruction cache for execution, one or more execution units for executing sequential instructions, a branch processing unit (BPU) for executing branch instructions, a dispatch unit for dispatching sequential instructions from the instruction buffer to particular execution units, and a completion buffer for temporarily storing sequential instructions that have finished execution, but have not completed.
Branch instructions executed by the branch processing unit (BPU) of the superscalar processor can be classified as either conditional or unconditional branch instructions. Unconditional branch instructions are branch instructions that change the flow of program execution from a sequential execution path to a specified target execution path and that do not depend upon a condition supplied by the occurrence of an event. Thus, the branch specified by an unconditional branch instruction is always taken. In contrast, conditional branch instructions are branch instructions for which the indicated branch in program flow may be taken or not taken depending upon a condition within the processor, for example, the state of specified condition register bits or the value of a counter. Conditional branch instructions can be further classified as either resolved or unresolved, based upon whether or not the condition upon which the branch depends is available when the conditional branch instruction is evaluated by the BPU. Because the condition upon which a resolved conditional branch instruction depends is known prior to execution, resolved conditional branch instructions can typically be executed and instructions within the target execution path fetched with little or no delay in the execution of sequential instructions. Unresolved conditional branches, on the other hand, can create significant performance penalties if fetching of sequential instructions is delayed until the condition upon which the branch depends becomes available and the branch is resolved.
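The classification described above can be illustrated with a minimal software sketch. It is not a description of any particular processor; the names `Branch`, `BranchKind`, `classify`, and `is_taken` are hypothetical, and the availability of the condition is modeled simply as an optional boolean value.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BranchKind(Enum):
    UNCONDITIONAL = "unconditional"
    RESOLVED = "resolved conditional"
    UNRESOLVED = "unresolved conditional"


@dataclass
class Branch:
    # True if the branch depends on a condition within the processor.
    conditional: bool
    # Assumption for illustration: the condition value if it is already
    # available when the branch is evaluated, otherwise None.
    condition: Optional[bool] = None


def classify(branch: Branch) -> BranchKind:
    """Classify a branch as unconditional, resolved, or unresolved."""
    if not branch.conditional:
        return BranchKind.UNCONDITIONAL
    if branch.condition is not None:
        return BranchKind.RESOLVED
    return BranchKind.UNRESOLVED


def is_taken(branch: Branch) -> Optional[bool]:
    """Return the branch direction, or None if it cannot yet be known."""
    kind = classify(branch)
    if kind is BranchKind.UNCONDITIONAL:
        return True  # an unconditional branch is always taken
    if kind is BranchKind.RESOLVED:
        return branch.condition
    return None  # unresolved: the processor must stall or predict
```

In this sketch, only the case in which `is_taken` returns `None` corresponds to the unresolved conditional branches that give rise to the performance penalty discussed above.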
Therefore, in order to minimize execution stalls, some processors speculatively execute unresolved branch instructions by predicting whether or not the indicated branch will be taken. Utilizing the result of the prediction, the fetcher is then able to fetch instructions within the speculative execution path prior to the resolution of the branch, thereby avoiding a stall in the execution pipeline in cases in which the branch is subsequently resolved as correctly predicted. Conventionally, prediction of unresolved conditional branch instructions has been accomplished utilizing static branch prediction, which predicts resolutions of branch instructions based upon criteria determined prior to program execution, or dynamic branch prediction, which predicts resolutions of branch instructions by reference to branch history accumulated on a per-address basis within a branch history table. While conventional static and dynamic branch prediction methodologies have reasonable prediction accuracies for some performance benchmarks, the severity of the performance penalty incurred upon misprediction in state-of-the-art processors having deep pipelines and high dispatch rates necessitates increased prediction accuracy.
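By way of illustration only, the two conventional approaches can be sketched in software as follows. The sketch makes several assumptions that are not taken from the foregoing description: the branch history table holds 2-bit saturating counters, the table is indexed by low-order address bits of 4-byte instructions, and the static criterion is the common "backward branches taken" heuristic. The names `BranchHistoryTable` and `static_predict` are hypothetical.

```python
class BranchHistoryTable:
    """Dynamic prediction: per-address history kept in a small table."""

    def __init__(self, entries: int = 1024) -> None:
        self.entries = entries
        # Assumed 2-bit saturating counters: 0-1 predict not taken,
        # 2-3 predict taken. Each entry starts at 1 (weakly not taken).
        self.counters = [1] * entries

    def _index(self, address: int) -> int:
        # Assumes 4-byte instructions, so the two low bits are dropped.
        return (address >> 2) % self.entries

    def predict(self, address: int) -> bool:
        return self.counters[self._index(address)] >= 2

    def update(self, address: int, taken: bool) -> None:
        """Record the actual outcome once the branch is resolved."""
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def static_predict(address: int, target: int) -> bool:
    """Static prediction using a criterion fixed before execution.

    Assumed heuristic: a backward branch (typically a loop) is predicted
    taken and a forward branch is predicted not taken.
    """
    return target < address
```

A fetcher modeled on this sketch would call `predict` when an unresolved branch is encountered and `update` after resolution, so that later predictions for the same address reflect the accumulated history.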
In response to the need for improved prediction accuracy, several two-level branch prediction methodologies have been proposed. For example, in one two-level dynamic branch prediction scheme, the first level of branch history comprises the execution history of the last K branch instructions and the second level of branch history comprises the branch behavior of the last L occurrences of the specific pattern of the last K branch instructions. Utilizing such two-level branch prediction schemes can result in prediction accuracies as high as 98% for selected performance benchmarks if the amount of branch history maintained at the first and second levels is optimized for the selected performance benchmarks. However, predetermining the amount of branch history maintained at each level based upon the prediction accuracy achieved for particular performance benchmarks does not necessarily ensure adequate prediction accuracy for multiple programs exhibiting diverse branch behaviors. Furthermore, the storage cost of the theoretically optimal amount of branch history may be prohibitive.
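The two-level scheme described above can be sketched as follows, purely as an illustration under stated assumptions. The first level is modeled as a K-bit shift register of recent branch outcomes, and the second level stores, for each K-bit pattern, the outcomes of the last L branches that followed that pattern. Predicting by majority vote over those L outcomes is an assumption made here for simplicity, and the class name `TwoLevelPredictor` and its parameters `k` and `l` are hypothetical.

```python
from collections import deque


class TwoLevelPredictor:
    def __init__(self, k: int = 4, l: int = 3) -> None:
        self.k = k
        self.l = l
        # First level: outcomes of the last k branches (1 = taken),
        # packed into an integer used as a shift register.
        self.history = 0
        # Second level: for each of the 2**k patterns, the outcomes of
        # the last l branches that followed that pattern.
        self.pattern_table = [deque(maxlen=l) for _ in range(1 << k)]

    def predict(self) -> bool:
        outcomes = self.pattern_table[self.history]
        # Assumed policy: majority vote; ties and empty history
        # predict not taken.
        return 2 * sum(outcomes) > len(outcomes)

    def update(self, taken: bool) -> None:
        """Record the resolved outcome and shift it into the history."""
        self.pattern_table[self.history].append(int(taken))
        mask = (1 << self.k) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


# Usage: a strictly alternating branch (taken, not taken, ...), a
# pattern that a single per-address counter tends to handle poorly.
predictor = TwoLevelPredictor(k=2, l=3)
for i in range(20):
    predictor.update(i % 2 == 0)
next_prediction = predictor.predict()  # True: "taken" follows "T, N"
```

The sketch also makes the storage cost concrete: the second level has 2**K entries of L outcomes each, so increasing either parameter enlarges the table, which is the trade-off between accuracy and cost noted above.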
Therefore, in order to achieve adequate branch prediction accuracy at a reasonable cost for programs exhibiting a variety of diverse branch behaviors, a configurable two-level branch prediction mechanism is needed. In particular, a two-level branch prediction mechanism is needed that is dynamically configurable.