1. Field of the Invention
The present invention generally relates to the field of microprocessors. More specifically, the present invention relates to an apparatus and a method for performing branch predictions in a microprocessor.
2. Description of Related Art
State-of-the-art microprocessors often employ pipelining to enhance performance. Within a pipelined microprocessor, the functional units that carry out the different stages of instruction processing operate simultaneously on multiple instructions, achieving a degree of parallelism that increases performance over non-pipelined microprocessors. As an example, an instruction fetch unit, an instruction decode unit and an instruction execution unit of a pipelined microprocessor may operate simultaneously. During one clock cycle, the instruction execution unit executes a first instruction while the instruction decode unit decodes a second instruction and the fetch unit fetches a third instruction. During a next clock cycle, the execution unit executes the newly decoded instruction while the instruction decode unit decodes the newly fetched instruction and the fetch unit fetches yet another instruction. In this manner, neither the fetch unit nor the decode unit needs to wait for the execution unit to execute the last instruction before processing new instructions. In state-of-the-art microprocessors, the steps necessary to fetch and execute an instruction are subdivided into a larger number of stages to achieve a deeper degree of pipelining.
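The overlap described above can be illustrated with a minimal sketch. It assumes an idealized three-stage pipeline (fetch, decode, execute) in which every stage takes exactly one clock cycle; the function name `pipeline_schedule` and the instruction labels are hypothetical and serve only as an illustration.

```python
def pipeline_schedule(instructions, stages=("fetch", "decode", "execute")):
    """Return, for each clock cycle, the instruction occupying each stage.

    Assumes an ideal pipeline: one cycle per stage, no stalls, no branches.
    """
    n_cycles = len(instructions) + len(stages) - 1
    schedule = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(stages):
            # The instruction in a deeper stage entered the pipeline earlier.
            idx = cycle - depth
            row[stage] = instructions[idx] if 0 <= idx < len(instructions) else None
        schedule.append(row)
    return schedule


schedule = pipeline_schedule(["i1", "i2", "i3", "i4"])
for cycle, row in enumerate(schedule, start=1):
    print(cycle, row)
```

In the third cycle of this sketch, all three units are busy at once: the execute stage holds the first instruction, the decode stage the second, and the fetch stage the third.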
A pipelined microprocessor operates most efficiently when instructions are executed in the sequence in which the instructions appear in memory. However, such is typically not the case. Rather, computer programs typically include a large number of branch instructions which, upon execution, may cause instructions to be executed in a sequence other than the sequence set forth in memory. More specifically, when a branch instruction is executed, execution either continues with the next sequential instruction from memory or jumps to an instruction specified at a "branch target" address. The branch specified by the instruction is said to be "Taken" if execution jumps and "Not Taken" if execution continues with the next sequential instruction from memory.
Branch instructions are either unconditional or conditional. An unconditional branch is Taken every time the instruction is executed. A conditional branch instruction is Taken or Not Taken depending upon resolution of a condition such as a logic statement. Instructions to be executed following a conditional branch are not known with certainty until the condition upon which the branch depends has been resolved. However, rather than wait until the condition is resolved, state-of-the-art microprocessors perform branch prediction, whereby the microprocessor tries to determine whether the branch will be Taken or Not Taken and, if Taken, to predict or otherwise determine the target address for the branch. If the branch is predicted to be Not Taken, the microprocessor fetches and speculatively executes the next instruction in memory. If the branch is predicted to be Taken, the microprocessor fetches and speculatively executes the instruction found at the predicted branch target address. The instructions executed following the branch prediction are "speculative" because the microprocessor does not yet know whether the prediction will be correct or not. Accordingly, any operations caused to be performed by the speculative instructions cannot be fully completed. For example, if a memory write operation is performed speculatively, the write operation cannot be forwarded to external memory until all previous branch conditions are resolved; otherwise, the instruction may improperly alter the contents of the memory based on a mispredicted branch.
If the branch prediction is ultimately determined to be correct, the speculatively executed instructions are retired or otherwise committed. In the memory write example, once the write is retired, the write operation is forwarded to external memory. If the branch prediction is ultimately found to be incorrect, then any speculatively executed instructions following the mispredicted branch are typically flushed from the system. In the memory write example, the write is not forwarded to external memory but rather is discarded.
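The commit-or-flush behavior of a speculative memory write can be sketched as follows. This is a simplified illustration, not a description of any particular microprocessor: it assumes a single outstanding branch, and the class name `SpeculativeStoreBuffer` and its methods are hypothetical.

```python
class SpeculativeStoreBuffer:
    """Holds speculative writes until the preceding branch is resolved.

    Simplifying assumption: only one unresolved branch exists at a time.
    """

    def __init__(self):
        self.memory = {}    # stands in for external memory
        self.pending = []   # speculative writes not yet forwarded

    def write(self, address, value):
        # A speculative write is buffered, not forwarded to memory.
        self.pending.append((address, value))

    def resolve_branch(self, prediction_correct):
        if prediction_correct:
            # Retire: forward the buffered writes to external memory.
            for address, value in self.pending:
                self.memory[address] = value
        # In either case the buffer is emptied; on a misprediction the
        # buffered writes are simply discarded (flushed).
        self.pending.clear()


buf = SpeculativeStoreBuffer()
buf.write(0x10, 7)
buf.resolve_branch(prediction_correct=False)   # flushed, memory unchanged
buf.write(0x10, 9)
buf.resolve_branch(prediction_correct=True)    # retired, memory updated
print(buf.memory)
```

Keeping the speculative writes in a separate buffer is what allows a mispredicted path to be undone without altering the contents of memory.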
To expedite branch prediction, some state-of-the-art microprocessors include a branch prediction table (BPT) which provides a cache of the most recently predicted branches along with corresponding prediction information, such as a brief history of previous executions and/or predictions for that branch and the success thereof. In one embodiment of a pipelined microprocessor that has a branch prediction circuit incorporating a BPT, an Instruction Pointer Generator (IPG) generates an instruction pointer specifying a new instruction and drives it to the BPT and to an Instruction Cache (IC), which stores instructions. In many implementations of pipelined microprocessors, the BPT is accessed using a tag value provided within the instruction pointer and corresponding to tag values employed by the IC for identifying cache lines therein.
The branch prediction circuit may also include a Target Address Cache (TAC) which stores the addresses of target instructions to which execution is redirected if the BPT predicts that a branch is taken. The IPG also drives the instruction pointer to the TAC. If a branch is determined to be taken, the TAC drives the address of the target instruction (the target address) to the IPG. The target address driven by the TAC to the IPG is decoded and then used as a regular instruction pointer to fetch the corresponding target instruction from the IC. In this way, a fetch unit, which may include the IC, restarts fetching instructions from the new target address.
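The interplay of the BPT and the TAC may be sketched in software as follows. Several assumptions are made that the description above does not state: the prediction history is modeled as a two-bit saturating counter (one common scheme, chosen here only for illustration), the full instruction pointer is used as the lookup key instead of a tag, and the names `BranchPredictor`, `predict` and `update` are hypothetical.

```python
class BranchPredictor:
    """Toy model of a BPT paired with a TAC (illustrative assumptions only)."""

    def __init__(self):
        self.bpt = {}   # instruction pointer -> 2-bit counter (0..3), assumed scheme
        self.tac = {}   # instruction pointer -> cached target address

    def predict(self, ip):
        """Return (taken, target) for the instruction pointer driven by the IPG."""
        counter = self.bpt.get(ip)
        if counter is not None and counter >= 2 and ip in self.tac:
            # Predicted taken: the TAC supplies the target address.
            return True, self.tac[ip]
        return False, None

    def update(self, ip, taken, target=None):
        """Record the resolved outcome of the branch at ip."""
        counter = self.bpt.get(ip, 1)   # assumed start state: weakly not taken
        self.bpt[ip] = min(counter + 1, 3) if taken else max(counter - 1, 0)
        if taken and target is not None:
            self.tac[ip] = target


predictor = BranchPredictor()
print(predictor.predict(0x40))           # unknown branch: predicted not taken
predictor.update(0x40, taken=True, target=0x80)
print(predictor.predict(0x40))           # now predicted taken, with a target
```

In this sketch, a branch that has not been seen before is predicted Not Taken, and the target address is returned only when both the prediction and the cached target are available.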
However, for microprocessors operating at higher speeds, "pipeline bubbles" may occur because the speed of the microprocessor does not allow a target instruction to be fetched in the pipeline stage immediately following the issuance of the instruction pointer, given the propagation time of the target address through the IPG. In the example described herein, the instruction pointer is generated in the first stage, and the "branch taken" determination is not available until the second stage because of the time it takes to determine whether a branch is taken. A "pipeline bubble" is defined as a pipeline stage during which the result of the operation of a functional unit of the pipeline is later discarded (flushed) by the microprocessor as not being meaningful.
A "pipeline bubble," for example, may occur if the IPG does not receive information from the BPT indicating that a branch is taken in time, that is, before the beginning of the second pipeline stage, to redirect the instruction stream to the target instruction in the second pipeline stage. In conventional microprocessors, the IPG generates a new instruction pointer in the second pipeline stage by incrementing the previous instruction pointer. The instruction pointer generated by the IPG in the second pipeline stage is meaningless for the sequence of instructions the microprocessor executes, as the correct instruction pointer should point to the target instruction and not to the instruction that follows the instruction whose pointer was generated in the first stage. A "pipeline bubble" thus occurs at the instruction pointer generation stage in the second pipeline stage. In the third pipeline stage, the bubble propagates to the IC, as the instruction fetched from the IC in the third pipeline stage is most likely not the target instruction. Only in a fourth pipeline stage is a target instruction fetched from the IC made available to a pipeline execution circuit.
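The sequence of instruction pointers issued by the IPG around a taken branch can be sketched as below. The sketch models only the pointer generation stage, assumes that instruction addresses advance by one unit per instruction, and uses a hypothetical parameter `bubble_cycles` for the number of cycles that elapse before the target address reaches the IPG; none of these names come from the description above.

```python
def issue_sequence(start_ip, branch_ip, target_ip, cycles, bubble_cycles=1):
    """Return a list of (instruction_pointer, is_bubble), one entry per cycle.

    Assumptions: addresses advance by 1, and the branch at branch_ip is
    predicted taken. With bubble_cycles=1 the IPG increments the pointer
    once more before the redirect to target_ip takes effect.
    """
    sequence = []
    ip = start_ip
    pending = None   # cycles remaining until the target address reaches the IPG
    for _ in range(cycles):
        if pending == 0:
            ip = target_ip          # redirect arrives: issue the target address
            pending = None
        is_bubble = pending is not None
        sequence.append((ip, is_bubble))
        if ip == branch_ip and not is_bubble:
            pending = bubble_cycles  # "taken" is known only after this cycle
        elif pending is not None:
            pending -= 1
        ip += 1                      # default behavior: increment the pointer
    return sequence


# Conventional behavior: one wasted pointer (address 12) after the branch at 11.
print(issue_sequence(10, 11, 20, cycles=5, bubble_cycles=1))
# Desired behavior: the target address follows the branch immediately.
print(issue_sequence(10, 11, 20, cycles=5, bubble_cycles=0))
```

The entry flagged as a bubble corresponds to the incremented instruction pointer that is later discarded because it does not point to the target instruction.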
The introduction of a one-cycle "pipeline bubble" into the pipeline causes a one-cycle penalty for every branch correctly predicted taken. Hence, it is desirable to provide a method and apparatus for providing a target instruction to an execution pipeline circuit without incurring this one-cycle penalty when a branch is correctly predicted taken.
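The cost of the penalty can be estimated with simple arithmetic. The sketch below assumes an otherwise ideal pipeline that completes one instruction per cycle; the function name `total_cycles` and the instruction and branch counts are hypothetical figures chosen only to illustrate the calculation.

```python
def total_cycles(instruction_count, taken_branches, penalty_cycles=1):
    """Cycles needed when each correctly predicted taken branch adds a penalty.

    Assumes one instruction per cycle apart from the branch penalty.
    """
    return instruction_count + taken_branches * penalty_cycles


# Hypothetical workload: 1000 instructions, 150 branches predicted taken.
with_bubble = total_cycles(1000, 150, penalty_cycles=1)
without_bubble = total_cycles(1000, 150, penalty_cycles=0)
print(with_bubble, without_bubble)
```

Under these assumed figures, removing the bubble saves one cycle per taken branch, so the saving grows in proportion to the number of branches correctly predicted taken.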