The disclosed subject matter relates generally to branch prediction in a computer system and, more particularly, to forwarding table updates to pending branch predictions.
Program instructions for a processor are typically stored in sequential, addressable locations within a memory. When these instructions are processed, they may be fetched from consecutive memory locations and stored in a cache commonly referred to as an instruction cache. The instructions may later be retrieved from the instruction cache and executed. Each time an instruction is fetched from memory, a pointer within the processor may be updated so that it contains the address of the next instruction in the sequence. This pointer may commonly be referred to as the next sequential instruction pointer. Sequential instruction fetching, updating of the next instruction pointer, and execution of sequential instructions may continue linearly until an instruction, commonly referred to as a branch instruction, is encountered and taken.
A branch instruction is an instruction that causes subsequent instructions to be fetched from one of at least two addresses: a sequential address identifying an instruction stream beginning with the instructions that directly follow the branch instruction; or an address referred to as a “target address,” which identifies an instruction stream beginning at an arbitrary location in memory. A branch instruction referred to as an “unconditional branch instruction” always branches to the target address, while a branch instruction referred to as a “conditional branch instruction” may select either the sequential address or the target address based on the outcome of a prior instruction.
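By way of illustration only, the address selection described above may be sketched as follows. The function name `next_fetch_address`, the fixed 4-byte instruction size, and the parameter names are assumptions introduced for this sketch and are not part of the disclosed subject matter.

```python
INSTRUCTION_SIZE = 4  # assumed fixed instruction width in bytes (illustrative only)


def next_fetch_address(pc, is_branch, target=None,
                       conditional=False, condition_met=False):
    """Return the address of the instruction to fetch after the one at `pc`.

    All names here are hypothetical; this is a sketch, not a processor model.
    """
    sequential = pc + INSTRUCTION_SIZE  # address directly following `pc`
    if not is_branch:
        return sequential
    if not conditional:
        # An unconditional branch always selects the target address.
        return target
    # A conditional branch selects the target only if the condition,
    # set by a prior instruction, is satisfied.
    return target if condition_met else sequential
```

In this sketch, a non-branch instruction at address 0x100 yields 0x104, while a conditional branch at the same address with a target of 0x200 yields 0x200 only when `condition_met` is true.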
To efficiently execute instructions, processors may implement a mechanism commonly referred to as a branch prediction mechanism. A branch prediction mechanism determines a predicted direction (“taken” or “not taken”) for an encountered branch instruction, allowing subsequent instruction fetching to continue along the predicted instruction stream indicated by the branch prediction. For example, if the branch prediction mechanism predicts that the branch instruction will be “taken,” then the next instruction fetched is located at the target address. If the branch prediction mechanism predicts that the branch instruction will be “not taken,” then the next instruction fetched is sequential to the branch instruction.
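As one hedged example of such a mechanism, the sketch below uses a table of two-bit saturating counters, a well-known prediction technique chosen here purely for illustration; the text above does not specify any particular predictor. The class name `TwoBitPredictor`, the table size, the indexing scheme, and the helper `predicted_fetch_address` are all assumptions.

```python
class TwoBitPredictor:
    """Illustrative two-bit saturating-counter predictor (assumed design)."""

    def __init__(self, table_size=1024):
        self.table_size = table_size
        # Counter values 0-1 predict "not taken"; 2-3 predict "taken".
        # Every entry starts at 1 ("weakly not taken").
        self.counters = [1] * table_size

    def _index(self, pc):
        # Assumed indexing: drop the low two address bits, then wrap.
        return (pc >> 2) % self.table_size

    def predict(self, pc):
        """Return True for a "taken" prediction, False for "not taken"."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def predicted_fetch_address(predictor, pc, target, instruction_size=4):
    """Pick the next fetch address along the predicted instruction stream."""
    return target if predictor.predict(pc) else pc + instruction_size
```

With this sketch, a branch at address 0x40 is initially predicted “not taken,” so fetching would continue at 0x44. After a single call to `update(0x40, True)`, the same branch is predicted “taken,” and fetching would continue at its target address.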
If the predicted instruction stream is correct, then the number of instructions executed per clock cycle is advantageously increased. However, if the predicted instruction stream is incorrect (i.e., one or more branch instructions are predicted incorrectly), then the instructions from the incorrectly predicted instruction stream are discarded from the instruction processing pipeline and the other instruction stream is fetched. Therefore, the number of instructions executed per clock cycle is decreased.
There is an incentive to construct accurate branch prediction schemes to avoid pipeline stalls and improve computer performance. Those skilled in the art will appreciate that the branch prediction mechanism is more effective when it has up-to-date information from which to make a decision regarding whether a branch instruction will be “taken” or “not taken.” Accordingly, it is useful to update the branch prediction mechanism with information regarding whether the prediction proved accurate as each branch instruction is retired. This up-to-date information may then be used to make future branch predictions more accurate. However, because the instruction stream is fetched well in advance, there may be numerous branch instructions still pending that used the now out-of-date information to make a prediction. Accordingly, these still-pending branch instructions carry out-of-date information and may be less accurately predicted than the current information would permit, which can disadvantageously cause the branch prediction scheme to operate less effectively. This can also cause the branches that were predicted with out-of-date information to update the predictors incorrectly, which can cause the predictors to frequently fail to “lock on” to branch outcome patterns that, in theory, the prediction algorithm should be able to predict with high accuracy.
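The effect of predicting with out-of-date information may be illustrated with a small simulation. This is a sketch under stated assumptions: a single two-bit saturating counter stands in for the predictor, and the hypothetical parameter `retire_delay` models how many later branches are predicted before an earlier branch retires and updates the counter. Neither the function name `count_mispredictions` nor this simplified timing model comes from the text above.

```python
from collections import deque


def count_mispredictions(outcomes, retire_delay):
    """Count mispredictions when counter updates lag predictions.

    `outcomes` is a sequence of booleans (True means "taken") for repeated
    executions of one branch. `retire_delay` is the assumed number of
    younger branches predicted before an older branch retires.
    """
    counter = 0          # two-bit counter, starting at "strongly not taken"
    pending = deque()    # outcomes of branches predicted but not yet retired
    mispredictions = 0
    for taken in outcomes:
        # The prediction uses whatever counter state is visible right now,
        # which may not yet reflect the pending branches.
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            mispredictions += 1
        pending.append(taken)
        # Retire the oldest branch once more than `retire_delay` are pending.
        if len(pending) > retire_delay:
            retired = pending.popleft()
            counter = min(3, counter + 1) if retired else max(0, counter - 1)
    return mispredictions
```

For a branch that is taken 20 times in a row, this sketch reports 2 mispredictions when `retire_delay` is 0 and 6 mispredictions when `retire_delay` is 4. The additional mispredictions come from pending branches that were predicted before the earlier outcomes reached the counter.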