Modern microprocessors are pipelined microprocessors. That is, they operate on several instructions at the same time, within different blocks or pipeline stages of the microprocessor. Hennessy and Patterson define pipelining as “an implementation technique whereby multiple instructions are overlapped in execution” (Computer Architecture: A Quantitative Approach, 2nd edition, by John L. Hennessy and David A. Patterson, Morgan Kaufmann Publishers, San Francisco, Calif., 1996). They go on to provide the following excellent illustration of pipelining:
“A pipeline is like an assembly line. In an automobile assembly line, there are many steps, each contributing something to the construction of the car. Each step operates in parallel with the other steps, though on a different car. In a computer pipeline, each step in the pipeline completes a part of an instruction. Like the assembly line, different steps are completing different parts of the different instructions in parallel. Each of these steps is called a pipe stage or a pipe segment. The stages are connected one to the next to form a pipe—instructions enter at one end, progress through the stages, and exit at the other end, just as cars would in an assembly line.”
Synchronous microprocessors operate according to clock cycles. Typically, an instruction passes from one stage of the microprocessor pipeline to another each clock cycle. In an automobile assembly line, if the workers in one stage of the line are left standing idle because they do not have a car to work on, then the production, or performance, of the line is diminished. Similarly, if a microprocessor stage is idle during a clock cycle because it does not have an instruction to operate on—a situation commonly referred to as a pipeline bubble—then the performance of the processor is diminished.
Branch instructions are a potential cause of pipeline bubbles. When a branch instruction is encountered, the processor must determine the target address of the branch instruction and begin fetching instructions at the target address rather than at the next sequential address after the branch instruction. Furthermore, if the branch instruction is a conditional branch instruction (i.e., a branch that may be taken or not taken depending upon the presence or absence of a specified condition), the processor must decide whether the branch instruction will be taken, in addition to determining the target address. The pipeline stage that ultimately resolves the target address and/or branch outcome (i.e., whether the branch will be taken or not taken) is typically many stages below the stage that fetches the instructions. Until that stage resolves the branch, the fetch stage does not know which instructions to fetch next, so bubbles may be created.
To address this problem, modern microprocessors typically employ branch prediction mechanisms to predict the target address and branch outcome early in the pipeline. Microprocessor designers are continually striving to design branch predictors with greater prediction accuracy. However, branch predictors mispredict branch instruction outcomes a non-trivial percentage of the time. As alluded to above, the mispredictions must be detected and corrected in a subsequent stage of the pipeline below the branch prediction stage. The penalty associated with the misprediction is a function of the number of pipeline stages between the branch predictor and the branch misprediction correction stage. Therefore, what is needed is an apparatus and method for correcting conditional branch instruction mispredictions earlier in the pipeline.
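The dependence of the misprediction penalty on pipeline depth can be illustrated with a rough calculation. The following is a minimal sketch only, assuming that each misprediction costs one bubble cycle per pipeline stage between the branch predictor and the correction stage; the function name, the base cycles-per-instruction figure, and the workload numbers are hypothetical and are not taken from any particular microprocessor.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_stages):
    """Average cycles per instruction, including misprediction bubbles.

    Assumption (illustrative only): every mispredicted branch inserts
    `penalty_stages` bubble cycles, one per pipeline stage between the
    branch predictor and the stage that corrects the misprediction.
    """
    return base_cpi + branch_fraction * mispredict_rate * penalty_stages


# Hypothetical workload: 20% of instructions are branches and 10% of
# those branches are mispredicted.
late_correction = effective_cpi(1.0, 0.20, 0.10, penalty_stages=10)
early_correction = effective_cpi(1.0, 0.20, 0.10, penalty_stages=4)

print(f"correction 10 stages below predictor: CPI = {late_correction:.2f}")
print(f"correction 4 stages below predictor:  CPI = {early_correction:.2f}")
```

Under these assumed numbers, moving the correction stage from ten stages below the predictor to four stages below it lowers the average cost per instruction, which is the motivation for correcting mispredictions earlier in the pipeline.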
Furthermore, conditional branch instructions specify a branch condition which, if satisfied, instructs the microprocessor to branch to the branch target address; otherwise, the microprocessor continues to fetch the next sequential instruction. The microprocessor includes status flags that store the state of the microprocessor. The status flags are examined to determine whether the condition specified by the conditional branch instruction is satisfied. Thus, in order to finally determine whether a conditional branch instruction has been mispredicted, the microprocessor must examine the most current state of the status flags. Currently, however, the status flags are not examined until late in the pipeline to determine whether the branch condition is satisfied and whether the branch prediction was incorrect. Therefore, what is needed is an apparatus and method for generating the status flags earlier in the pipeline.
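The relationship between the status flags, the branch condition, and a misprediction can be sketched in software as follows. This is an illustrative model only: the flag set, the condition names, and the function names are assumptions chosen for the example and do not describe any particular instruction set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusFlags:
    """Hypothetical subset of status flags, for illustration only."""
    carry: bool = False
    zero: bool = False


# Hypothetical branch conditions, each a test on the current flags.
CONDITIONS = {
    "carry_set": lambda flags: flags.carry,
    "carry_clear": lambda flags: not flags.carry,
    "zero_set": lambda flags: flags.zero,
    "zero_clear": lambda flags: not flags.zero,
}


def branch_taken(condition, flags):
    """Return True if the branch condition is satisfied by the flags."""
    return CONDITIONS[condition](flags)


def mispredicted(predicted_taken, condition, flags):
    """Compare the prediction with the outcome implied by the flags."""
    return predicted_taken != branch_taken(condition, flags)


# The predictor guessed "not taken", but the most current flags show
# the carry flag set, so a branch-if-carry was mispredicted.
current_flags = StatusFlags(carry=True)
print(mispredicted(False, "carry_set", current_flags))
```

The sketch makes the ordering constraint visible: `mispredicted` cannot be evaluated until the most current flag values are available, so the earlier the flags are generated, the earlier the check can be made.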
Finally, the state of the status flags is typically affected by the results of instructions preceding the conditional branch instruction. For example, the condition may be whether the carry flag, which is one of the status flags, is set. The state of the carry flag may be determined by the most recent add instruction result, for example. However, the results of instructions that affect the status flags are currently generated in execution units located in lower pipeline stages of the microprocessor. Therefore, what is needed is an apparatus and method for generating instruction results earlier in the pipeline.
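As a concrete illustration of how an instruction result determines a status flag, the following sketch models an add instruction that produces a carry flag. The 32-bit operand width and the function name are assumptions made for the example; the text above does not specify an operand size.

```python
MASK_32 = 0xFFFFFFFF  # assumed 32-bit operand width, for illustration


def add_with_flags(a, b):
    """Add two unsigned 32-bit values; return (result, carry, zero).

    The carry flag is set when the true sum does not fit in 32 bits,
    and the zero flag is set when the truncated result is zero.
    """
    total = (a & MASK_32) + (b & MASK_32)
    result = total & MASK_32
    carry = total > MASK_32
    zero = result == 0
    return result, carry, zero


# A conditional branch on the carry flag depends on this add result.
result, carry, zero = add_with_flags(0xFFFFFFFF, 1)
print(hex(result), carry, zero)
```

Because the carry flag is a by-product of the sum, a conditional branch that tests it cannot be resolved before the add result itself is available.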