1. Technical Field
The present specification relates in general to a method and system for data processing and in particular to a processor and method for speculatively executing a branch instruction. Still more particularly, the present specification relates to a processor having a reduced branch prediction storage size and a method of branch prediction utilizing a compressed branch history.
2. Description of the Related Art
A state-of-the-art superscalar processor can comprise, for example, an instruction cache for storing instructions, an instruction buffer for temporarily storing instructions fetched from the instruction cache for execution, one or more execution units for executing sequential instructions, a branch processing unit (BPU) for executing branch instructions, a dispatch unit for dispatching sequential instructions from the instruction buffer to particular execution units, and a completion buffer for temporarily storing sequential instructions that have finished execution, but have not completed.
Branch instructions executed by the branch processing unit (BPU) of the superscalar processor can be classified as either conditional or unconditional branch instructions. Unconditional branch instructions are branch instructions that change the flow of program execution from a sequential execution path to a specified target execution path and that do not depend upon a condition supplied by the occurrence of an event. Thus, the branch in program flow specified by an unconditional branch instruction is always taken. In contrast, conditional branch instructions are branch instructions for which the indicated branch in program flow may be taken or not taken depending upon a condition within the processor, for example, the state of specified condition register bits or the value of a counter. Conditional branch instructions can be further classified as either resolved or unresolved, based upon whether or not the condition upon which the branch depends is available when the conditional branch instruction is evaluated by the branch processing unit (BPU). Because the condition upon which a resolved conditional branch instruction depends is known prior to execution, resolved conditional branch instructions can typically be executed and instructions within the target execution path fetched with little or no delay in the execution of sequential instructions. Unresolved conditional branches, on the other hand, can create significant performance penalties if fetching of sequential instructions is delayed until the condition upon which the branch depends becomes available and the branch is resolved.
Therefore, in order to minimize execution stalls, some processors speculatively execute unresolved branch instructions by predicting whether or not the indicated branch will be taken. Utilizing the result of the prediction, the fetcher is then able to speculatively fetch instructions within a target execution path prior to the resolution of the branch, thereby avoiding a stall in the execution pipeline in cases in which the branch is subsequently resolved as correctly predicted. Conventionally, prediction of unresolved conditional branch instructions has been accomplished utilizing static branch prediction, which predicts resolutions of branch instructions based upon criteria determined prior to program execution, or dynamic branch prediction, which predicts resolutions of branch instructions by reference to branch history accumulated on a per-address basis within a branch history table. While conventional static and dynamic branch prediction methodologies have reasonable prediction accuracies for some performance benchmarks, the severity of the performance penalty incurred upon misprediction in state-of-the-art processors having deep pipelines and high dispatch rates necessitates increased prediction accuracy.
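By way of illustration only, dynamic branch prediction by reference to per-address branch history can be sketched as follows. The sketch assumes details that the description above does not specify: a table of 2-bit saturating counters, a table index derived from the low-order bits of the branch address, and 4-byte instructions. The names BranchHistoryTable and simulate are hypothetical and are used here only for the example.

```python
class BranchHistoryTable:
    """Per-address dynamic predictor (illustrative sketch, not a specific design).

    Assumption: each entry is a 2-bit saturating counter. Values 0-1 predict
    "not taken" and values 2-3 predict "taken".
    """

    def __init__(self, num_entries=1024):
        self.num_entries = num_entries
        # Every counter starts in the "weakly not taken" state.
        self.counters = [1] * num_entries

    def _index(self, branch_address):
        # Assumption: 4-byte instructions, so the two low-order bits are dropped.
        return (branch_address >> 2) % self.num_entries

    def predict(self, branch_address):
        """Return True if the branch at this address is predicted taken."""
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        """Record the actual resolution of the branch once it is known."""
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def simulate(predictor, trace):
    """Return the fraction of correct predictions over (address, taken) pairs."""
    correct = 0
    for address, taken in trace:
        if predictor.predict(address) == taken:
            correct += 1
        predictor.update(address, taken)
    return correct / len(trace)
```

In this sketch, a loop-closing branch that is taken nine times and then falls through is mispredicted once per pass through the loop after the counter has warmed up, which illustrates why history accumulated on a per-address basis alone leaves room for improved accuracy.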
In response to the need for improved prediction accuracy, several two-level branch prediction methodologies have been proposed. For example, one two-level dynamic branch prediction scheme includes a first level of branch history that specifies the resolutions of the last K branch instructions and a second level of branch prediction storage that associates a resolution prediction with each (or selected ones) of the 2^K possible branch history patterns. Utilizing such two-level branch prediction schemes can result in high prediction accuracies (>95%) for selected performance benchmarks if the amount of branch history maintained at the first and second levels is large. However, the storage costs associated with such two-level branch prediction schemes, and in particular with the branch prediction storage, can be prohibitive.
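A two-level scheme of the kind described above, and the growth of its storage cost with K, can be sketched as follows. The sketch is illustrative only and rests on assumptions not stated above: a single global K-bit history register serves as the first level, and each of the 2^K second-level entries is a 2-bit saturating counter. The name TwoLevelPredictor and its methods are hypothetical.

```python
class TwoLevelPredictor:
    """Two-level dynamic predictor (illustrative sketch under stated assumptions)."""

    def __init__(self, history_bits):
        self.k = history_bits
        # First level: resolutions of the last K branches, newest in bit 0 (1 = taken).
        self.history = 0
        # Second level: one 2-bit counter for each of the 2^K history patterns.
        self.pattern_table = [1] * (1 << history_bits)

    def predict(self):
        """Predict the next branch from the entry selected by the current history."""
        return self.pattern_table[self.history] >= 2

    def update(self, taken):
        """Train the selected entry, then shift the resolution into the history."""
        counter = self.pattern_table[self.history]
        if taken:
            self.pattern_table[self.history] = min(3, counter + 1)
        else:
            self.pattern_table[self.history] = max(0, counter - 1)
        mask = (1 << self.k) - 1
        self.history = ((self.history << 1) | int(taken)) & mask

    def storage_bits(self):
        """Total state: K history bits plus 2 bits for each of the 2^K entries."""
        return self.k + 2 * (1 << self.k)
```

Under these assumptions, each additional bit of history doubles the number of second-level entries, so that a history of K = 12 requires 4096 entries and a history of K = 16 requires 65536 entries. This exponential growth is the storage cost referred to above.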
Therefore, in order to achieve reasonably high branch prediction accuracy at a reasonable cost, a two-level branch prediction mechanism is needed that reduces the size of the branch prediction storage.