Predicting branches of instructions to be fetched in a processor may increase the speed, efficiency and performance of pipelined and superpipelined processors. Some branch prediction units (BPUs) may require at least two clock cycles to generate a branch prediction and deliver a predicted branch target to an instruction fetch unit (IFU). An IFU which is to receive branch predictions for the lines that it fetches may be capable of fetching a line in each clock cycle. The number of instructions in such a line may be variable. In the absence of an available prediction from a BPU, an IFU may fetch a next sequential line on the assumption that there was no branch from the prior line. If such an assumption proves wrong, the next sequential line that was fetched and all instructions in it may be killed or flushed. The wasted fetch of an unneeded line may be called a bubble. Bubbles may decrease the efficiency of a processor.
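The way bubbles may arise can be illustrated with a minimal sketch. It is a simplified model for illustration only and makes several assumptions that are not part of the description above: lines are numbered with integers, the next sequential line is the current line plus one, every prediction is correct, and a prediction becomes usable `bpu_latency` cycles after the fetch of the line it covers. The names `simulate_fetch`, `taken_targets` and `bpu_latency` are hypothetical.

```python
def simulate_fetch(taken_targets, start, useful_lines, bpu_latency):
    """Model an IFU that fetches one line per cycle.

    taken_targets maps a line number to the target line of a taken
    branch in that line (hypothetical representation). Lines absent
    from the map are assumed to fall through to the next line.
    Returns (trace, bubbles), where trace lists one (line, kind)
    entry per cycle and kind is "useful" or "bubble".
    """
    trace = []
    bubbles = 0
    line = start
    fetched = 0
    while fetched < useful_lines:
        trace.append((line, "useful"))
        fetched += 1
        target = taken_targets.get(line)
        if target is None:
            # No taken branch: the sequential fetch is the right one.
            line += 1
            continue
        # Taken branch: until the prediction arrives, the IFU keeps
        # fetching sequential lines, which are then flushed.
        for i in range(1, bpu_latency):
            trace.append((line + i, "bubble"))
            bubbles += 1
        line = target
    return trace, bubbles


if __name__ == "__main__":
    branches = {2: 10, 11: 0}  # line 2 jumps to 10, line 11 jumps to 0
    for latency in (1, 2, 3):
        trace, bubbles = simulate_fetch(branches, 0, 6, latency)
        print(latency, len(trace), bubbles)
```

Under these assumptions, a one-cycle BPU produces no bubbles, while each additional cycle of prediction latency adds one wasted fetch per taken branch.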
Some BPUs may generate branch predictions in one cycle. In some BPUs 10, the period required to generate branch predictions may be two cycles or more, making the BPU's throughput, expressed in cycles per prediction, greater than 1. In some BPUs 10, improving throughput may require adding a port to a cache of a predictor. Adding such a port may increase the cost of a processor.
In some processors, a BPU and an IFU may share an instruction pointer such that the BPU may generate predictions only on the same address or line for which the IFU is then fetching an instruction.
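The combined effect of a multi-cycle prediction period and a shared instruction pointer can be sketched as follows. This is an illustrative model only, under assumptions not stated above: the BPU is taken to be single-ported and not pipelined, so it is busy for the whole prediction period, and it can begin a prediction only on the line the IFU is fetching in the cycle in which the BPU becomes free. The names `prediction_start_cycles` and `bpu_period` are hypothetical.

```python
def prediction_start_cycles(num_cycles, bpu_period):
    """Return the fetch cycles in which a new prediction can start.

    Assumes (hypothetically) a BPU that is occupied for bpu_period
    cycles per prediction and that shares the IFU's instruction
    pointer, so it can only predict the line fetched in the cycle
    in which it starts.
    """
    return [c for c in range(num_cycles) if c % bpu_period == 0]


def coverage(num_cycles, bpu_period):
    """Fraction of fetched lines that receive a prediction."""
    return len(prediction_start_cycles(num_cycles, bpu_period)) / num_cycles


if __name__ == "__main__":
    for period in (1, 2, 4):
        print(period, prediction_start_cycles(8, period), coverage(8, period))
```

In this model, an IFU that fetches one line per cycle has every line covered when the period is one cycle, but only every second line covered when the period is two cycles.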