In the MIPS instruction set, the most important conditional branch instruction includes a 64-bit comparison. Very often the comparison depends on results that were computed or loaded from memory by the instructions immediately preceding the branch. The direct approach bypasses (forwards) the computed or loaded results to the comparator, performs the 64-bit comparison, and uses the comparison result to select either the branch target address or the sequential instruction address. Unfortunately, the delay through this result, bypass, compare, and address-select path limits the maximum clock frequency and thus the maximum performance of the CPU.
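The direct approach described above can be sketched in software as three dependent steps. This is only an illustrative model, not the circuit of any particular CPU: the function and parameter names are hypothetical, the comparison is assumed to be an equality test (as in a branch-on-equal instruction), the sequential address is assumed to be the branch address plus 4, and the branch delay slot is ignored.

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits


def bypass(regfile_value, forwarded_value=None):
    """Step 1: prefer a result just computed or loaded by a preceding
    instruction over the (stale) register file value."""
    return regfile_value if forwarded_value is None else forwarded_value


def next_pc_direct(pc, rs_regfile, rt_regfile, offset_words,
                   rs_fwd=None, rt_fwd=None):
    """Hypothetical model of the direct approach: bypass, compare, select."""
    # Step 1: bypass the most recent operand values to the comparator.
    rs = bypass(rs_regfile, rs_fwd) & MASK64
    rt = bypass(rt_regfile, rt_fwd) & MASK64

    # Step 2: full 64-bit comparison (assumed here to be equality).
    taken = rs == rt

    # Step 3: select the branch target or the sequential address.
    target = (pc + 4 + (offset_words << 2)) & MASK64
    sequential = (pc + 4) & MASK64
    return target if taken else sequential
```

Each step consumes the output of the one before it, so in hardware the three delays add up within a single cycle; that serial dependency is what makes this path critical for the clock frequency.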
Many branch prediction methods have been proposed or implemented. In general, these prediction methods are designed for deep pipelines that have high misprediction penalties. A high misprediction penalty requires complex prediction logic to achieve the high prediction accuracy needed for acceptable performance. What is desired is an approach and architecture for a shallow pipeline, where the penalty for misprediction is low.