Modern microprocessors are pipelined. That is, they operate on several instructions at the same time, within different blocks or pipeline stages of the microprocessor. Hennessy and Patterson define pipelining as “an implementation technique whereby multiple instructions are overlapped in execution” (Computer Architecture: A Quantitative Approach, 2nd edition, by John L. Hennessy and David A. Patterson, Morgan Kaufmann Publishers, San Francisco, Calif., 1996). They go on to provide the following excellent illustration of pipelining:
A pipeline is like an assembly line. In an automobile assembly line, there are many steps, each contributing something to the construction of the car. Each step operates in parallel with the other steps, though on a different car. In a computer pipeline, each step in the pipeline completes a part of an instruction. Like the assembly line, different steps are completing different parts of the different instructions in parallel. Each of these steps is called a pipe stage or a pipe segment. The stages are connected one to the next to form a pipe—instructions enter at one end, progress through the stages, and exit at the other end, just as cars would in an assembly line.
Synchronous microprocessors operate according to clock cycles. Typically, an instruction passes from one stage of the microprocessor pipeline to the next each clock cycle. In an automobile assembly line, if the workers in one stage of the line are left standing idle because they do not have a car to work on, then the production, or performance, of the line is diminished. Similarly, if a microprocessor stage is idle during a clock cycle because it does not have an instruction to operate on (a situation commonly referred to as a pipeline bubble), then the performance of the processor is diminished.
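The effect of a bubble can be illustrated with a minimal sketch of a toy pipeline in which every instruction advances exactly one stage per clock cycle. The function names (run_pipeline, count_bubbles), the four-stage depth, and the use of None to mark a cycle in which nothing was fetched are all assumptions made for illustration; they do not describe any particular microprocessor.

```python
def run_pipeline(fetch_stream, num_stages=4):
    """Return what the last stage holds in each clock cycle.

    fetch_stream lists what enters the first stage each cycle;
    None means no instruction was available that cycle.
    (Hypothetical toy model: one stage per cycle, no stalls.)
    """
    stages = [None] * num_stages
    last_stage_trace = []
    # Extra empty cycles at the end let the pipeline drain.
    for fetched in list(fetch_stream) + [None] * num_stages:
        # Every instruction moves down one stage; the new one enters stage 0.
        stages = [fetched] + stages[:-1]
        last_stage_trace.append(stages[-1])
    return last_stage_trace


def count_bubbles(last_stage_trace):
    """Count idle cycles of the last stage between its first and last busy cycle."""
    busy = [i for i, instr in enumerate(last_stage_trace) if instr is not None]
    if not busy:
        return 0
    window = last_stage_trace[busy[0]:busy[-1] + 1]
    return sum(1 for instr in window if instr is None)


if __name__ == "__main__":
    # One missing fetch in the third cycle travels down the pipe as a bubble.
    trace = run_pipeline(["i0", "i1", None, "i2"])
    print(trace)
    print("bubbles:", count_bubbles(trace))
```

In this model a single cycle without a fetched instruction reaches the final stage three cycles later as an idle cycle, which is the lost throughput the assembly-line analogy describes.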
One common means of avoiding bubbles in the pipeline is to place an instruction buffer, often arranged in a queue structure, between stages in the pipeline. An instruction buffer may provide elasticity for periods of time in which the instruction processing rates vary between the stages above and below the instruction buffer in the pipeline. For example, instruction buffering may be useful where execution stages of a pipeline (i.e., lower stages) require instructions to execute, but the instructions are not present in the instruction cache, which is in the upper portion of the pipeline. In this situation, the impact of the missing cache line may be reduced to the extent that the instruction buffer supplies instructions to the execution stages while the memory fetch is performed.
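The elasticity described above can be sketched with a small queue model. The function name idle_execute_cycles, the per-cycle supply list, and the assumption that the execution side consumes exactly one instruction per cycle are hypothetical choices made for illustration only. A supply of 0 stands in for cycles spent waiting on a missing cache line.

```python
from collections import deque


def idle_execute_cycles(fetch_per_cycle, capacity):
    """Count cycles in which the execution side finds the buffer empty.

    fetch_per_cycle[i] is how many instructions the upper stages can
    deliver in cycle i (0 models a cache miss). capacity is the number
    of buffer entries. Hypothetical model: an instruction written in a
    cycle may be consumed in that same cycle, and when the buffer is
    full the fetch side simply stalls.
    """
    buffer = deque()
    next_id = 0
    idle = 0
    for supplied in fetch_per_cycle:
        # Upper stages write as many instructions as fit.
        for _ in range(supplied):
            if len(buffer) < capacity:
                buffer.append(next_id)
                next_id += 1
        # Lower stages consume one instruction per cycle if one is present.
        if buffer:
            buffer.popleft()
        else:
            idle += 1
    return idle


if __name__ == "__main__":
    supply = [2, 2, 0, 0, 2]  # two miss cycles in the middle
    print("1-entry buffer:", idle_execute_cycles(supply, capacity=1))
    print("4-entry buffer:", idle_execute_cycles(supply, capacity=4))
```

Under these assumptions the deeper buffer banks the surplus fetched before the miss and drains it during the miss, so the execution side stays busy, whereas the one-entry buffer leaves it idle for both miss cycles.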
Another potential cause of pipeline bubbles is branch instructions. When a branch instruction is encountered, the processor must determine the target address of the branch instruction and begin fetching instructions at the target address rather than the next sequential address after the branch instruction. Furthermore, if the branch instruction is a conditional branch instruction (i.e., a branch that may be taken or not taken depending upon the presence or absence of a specified condition), the processor must decide whether the branch instruction will be taken, in addition to determining the target address. Because the pipeline stages that determine the target address and/or whether the branch instruction will be taken are typically well below the stages that fetch the instructions, bubbles may be created.
Although instruction buffering may reduce the number of bubbles, modern microprocessors also typically employ branch prediction mechanisms to predict the target address and/or whether the branch will be taken early in the pipeline to further reduce the problem. However, if the branch prediction turns out to be wrong, the instructions fetched as a result of the prediction, whether they were the next sequential instructions or the instructions at the target address, must not be executed by the processor or incorrect program execution will result.
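As an illustration of taken/not-taken prediction, the following minimal sketch uses a two-bit saturating counter, a widely known textbook scheme chosen here only as an example. The text above does not say which prediction mechanism is used, so the class TwoBitPredictor, its initial state, and the helper count_mispredictions are assumptions. Each misprediction counted corresponds to a point at which the instructions fetched along the predicted path would have to be discarded.

```python
class TwoBitPredictor:
    """Two-bit saturating counter (illustrative, not taken from the text).

    Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
    """

    def __init__(self):
        self.counter = 2  # assumed starting state: weakly taken

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        # Move one step toward the actual outcome, saturating at 0 and 3.
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_mispredictions(outcomes):
    """Count how often the prediction differs from the resolved outcome."""
    predictor = TwoBitPredictor()
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            # Instructions fetched along the predicted path must not execute.
            wrong += 1
        predictor.update(taken)
    return wrong


if __name__ == "__main__":
    # A loop branch: taken four times, then not taken, executed twice.
    loop_branch = ([True] * 4 + [False]) * 2
    print("mispredictions:", count_mispredictions(loop_branch))
```

With this pattern the predictor is wrong only at each loop exit, but every such miss still leaves wrong-path instructions in the upper pipeline stages that must be prevented from executing.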
Correcting for branch instruction mispredictions is one example of a situation in which instructions fetched into a microprocessor must be killed, i.e., not executed by the pipeline. However, situations may exist in which the need to kill an instruction is not determined until after the instruction has already been written into an instruction buffer. Therefore, an efficient solution is needed for killing an instruction even though it has already been written into an instruction buffer.
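To make the problem concrete, the following sketch shows one generic way an already-buffered instruction could be killed: each entry carries a valid flag that is cleared after the write, and the read side skips cleared entries. This is offered only as an illustration of the problem statement, not as the solution this document goes on to describe. The names Entry, InstructionBuffer, write, kill, and read_next are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    instr: str
    valid: bool = True  # cleared when the instruction is killed


class InstructionBuffer:
    """Toy queue whose entries can be invalidated after being written."""

    def __init__(self):
        self._queue = deque()

    def write(self, instr):
        # Return the entry so a later stage can refer back to it.
        entry = Entry(instr)
        self._queue.append(entry)
        return entry

    def kill(self, entry):
        # The entry stays in the queue; it is only marked as not to execute.
        entry.valid = False

    def read_next(self):
        # Hand the oldest valid instruction to the lower stages, if any.
        while self._queue:
            entry = self._queue.popleft()
            if entry.valid:
                return entry.instr
        return None


if __name__ == "__main__":
    buf = InstructionBuffer()
    buf.write("add")
    buf.write("branch")
    wrong_path = [buf.write("sub"), buf.write("mul")]
    # The misprediction is discovered only after the writes above.
    for entry in wrong_path:
        buf.kill(entry)
    print(buf.read_next(), buf.read_next(), buf.read_next())
```

In this sketch the killed entries still occupy buffer space until the read side reaches them, which hints at why efficiency is a concern.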