Modern microprocessors include instruction pipelines in order to increase program execution speeds. Instruction pipelines typically include a number of units, each unit operating in cooperation with other units in the pipeline. One exemplary pipeline, found in, for example, Intel's Pentium® Pro microprocessor, includes an instruction fetch unit (IFU), an instruction decode unit (ID), an allocation unit (ALLOC), an instruction execution unit (EX) and a write back unit (WB). The instruction fetch unit fetches program instructions, the instruction decode unit translates the instructions into micro-ops, the allocation unit assigns a sequence number to each micro-op, the execution unit executes the micro-ops, and the write back unit retires instructions.
In this exemplary pipeline arrangement, the instruction fetch unit fetches instructions, while the other units operate on previously fetched instructions. In order for a pipelined microprocessor to operate efficiently, the instruction fetch unit continually provides the pipeline with a stream of instructions.
Certain types of instructions may cause the instruction fetch unit to stall until a unit further downstream in the pipeline fully resolves the instruction. For example, it is not known whether a conditional branch instruction will be taken or not taken until the branch condition is fully resolved. Accordingly, after the instruction fetch unit fetches such a conditional branch instruction, the instruction fetch unit does not know whether the next required instruction is the next sequential program instruction, or the instruction at the branch target address of the conditional branch instruction. If the instruction fetch unit were required to wait until the branch condition is fully resolved, i.e., until after the instruction is executed, the instruction fetch unit would stall. Accordingly, modern microprocessor instruction pipelines include prediction circuitry for predicting whether or not such a branch instruction will be taken.
In Intel's Pentium® Pro microprocessor, for example, the instruction pipeline includes prediction circuitry, i.e., a branch target buffer (BTB), that predicts whether a branch instruction will be taken or not taken based on the history of the branch instruction. Exemplary embodiments of the branch target buffer are described in detail in U.S. Pat. Nos. 5,574,871 to Hoyt et al., 5,577,217 to Hoyt et al., and 5,584,001 to Hoyt et al.
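By way of illustration only, history-based prediction may be sketched with a generic two-bit saturating counter per branch address. This is a common textbook scheme and is not asserted to be the mechanism of the branch target buffer described in the cited patents; the class name, the method names, and the initial counter value are assumptions made for this sketch.

```python
class TwoBitPredictor:
    """Generic two-bit saturating counter predictor (illustrative assumption)."""

    def __init__(self):
        # Maps a branch address to a counter in the range 0..3.
        self.counters = {}

    def predict(self, branch_address):
        # Counter values 2 and 3 predict "taken".
        # Branches not seen before start at 1 (weakly not taken), an assumed default.
        return self.counters.get(branch_address, 1) >= 2

    def update(self, branch_address, taken):
        # Move the counter toward the resolved outcome, saturating at 0 and 3.
        counter = self.counters.get(branch_address, 1)
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
        self.counters[branch_address] = counter


predictor = TwoBitPredictor()
first_guess = predictor.predict(0x400)      # no history yet for this branch
for outcome in (True, True, True, False):   # resolved outcomes of the branch
    predictor.update(0x400, outcome)
later_guess = predictor.predict(0x400)      # history now favors "taken"
```

In this sketch a single not-taken outcome after several taken outcomes does not flip the prediction, which is the usual reason for using two bits of history rather than one.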
Certain types of branch instructions are associated with program calls to subroutines. A typical program calls a subroutine by issuing a CALL instruction, explicitly citing the address of the subroutine to which the program should branch. The subroutine then typically ends with a RETURN FROM SUBROUTINE instruction, which causes the program to branch back to the program that made the call. This return address is not explicitly cited. However, when the CALL instruction associated with the RETURN FROM SUBROUTINE instruction is executed, the address of the next sequential instruction (relative to the CALL instruction) is pushed onto a branch prediction stack, i.e., a real return stack buffer (RRSB). When the RETURN FROM SUBROUTINE instruction is retired, the RRSB is "popped" (e.g., removing the top entry from the stack, or invalidating the entry from the stack and incrementing a pointer), thereby providing the processor, and the instruction fetch unit in particular, with the appropriate return address.
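The push and pop behavior described above may be sketched as follows. This is a minimal model under stated assumptions, not the circuit of any particular processor: the class name, the fixed depth of 16, the policy of dropping the oldest entry on overflow, and the call_length parameter are all hypothetical.

```python
class ReturnStackBuffer:
    """Minimal model of a return stack buffer (depth is an assumed parameter)."""

    def __init__(self, depth=16):
        self.depth = depth
        self.entries = []

    def on_call(self, call_address, call_length):
        # Push the address of the next sequential instruction after the CALL.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumed overflow policy: drop the oldest entry
        self.entries.append(call_address + call_length)

    def on_return(self):
        # Pop the most recent return address; None if the stack is empty.
        return self.entries.pop() if self.entries else None


rrsb = ReturnStackBuffer()
rrsb.on_call(0x1000, 5)    # outer CALL at 0x1000, assumed to be 5 bytes long
rrsb.on_call(0x2000, 5)    # nested CALL at 0x2000
inner = rrsb.on_return()   # return address for the nested CALL
outer = rrsb.on_return()   # return address for the outer CALL
```

Because the structure is a stack, nested subroutine calls unwind in the reverse of the order in which they were made, so each RETURN FROM SUBROUTINE instruction receives the address following its own CALL instruction.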
Because an instruction pipeline is typically several instructions deep, it is possible that the instruction fetch unit will need the return address in order to fetch the next instruction before the CALL instruction is actually executed and retired, particularly in the case of a short subroutine. For example, if a subroutine is only 10 instructions long, but the instruction pipeline is 20 instructions deep, the instruction fetch unit will need the return address before the CALL instruction even gets to the execution unit and write back unit. Accordingly, a processor may include a "speculative" return stack buffer (SRSB). In the Intel Pentium® Pro processor, for example, the instruction decode unit maintains a speculative return stack buffer into which a return address for each CALL instruction the instruction decode unit detects is pushed. This return stack buffer is considered "speculative" because it may include return addresses for CALL instructions that are actually never executed. For example, if the CALL instruction is within a program path following a conditional branch instruction for which the branch target buffer predicts the branch direction, i.e., taken or not taken, the CALL instruction may never actually be executed if the branch direction of the conditional branch instruction was incorrectly predicted. A detailed description of an exemplary real return stack buffer and speculative return stack buffer is provided in U.S. Pat. No. 5,604,877 to Hoyt et al.
In such a system, if a conditional branch instruction is incorrectly predicted, the pipeline must be restarted at the instruction fetch unit. That is, the instructions in the instruction pipeline following the branch instruction must be flushed, including intervening CALL and RETURN FROM SUBROUTINE instructions. Additionally, the instruction fetch unit must begin fetching instructions at the proper instruction address. Thus, the entries in the speculative return stack buffer are incorrect, and are marked invalid.
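The speculative behavior described above, including invalidation on a branch misprediction, may be sketched as follows. The class and method names are hypothetical, and the per-entry valid flag is one assumed way of marking entries invalid.

```python
class SpeculativeReturnStack:
    """Minimal model of a speculative return stack buffer (illustrative only)."""

    def __init__(self):
        # Each entry is [return_address, valid].
        self.entries = []

    def on_decode_call(self, return_address):
        # Pushed when the decode unit detects a CALL, before the CALL executes.
        self.entries.append([return_address, True])

    def on_decode_return(self):
        # Pop the top entry; return its address only if it is still valid.
        if not self.entries:
            return None
        address, valid = self.entries.pop()
        return address if valid else None

    def on_branch_mispredict(self):
        # The decoded path was wrong, so every speculative entry is marked invalid.
        for entry in self.entries:
            entry[1] = False


srsb = SpeculativeReturnStack()
srsb.on_decode_call(0x1005)
predicted = srsb.on_decode_return()      # valid speculative return address
srsb.on_decode_call(0x2005)              # CALL decoded on a predicted path
srsb.on_branch_mispredict()              # the prediction turns out to be wrong
after_flush = srsb.on_decode_return()    # no valid prediction is available
```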
In the Intel Pentium® with MMX™ Technology microprocessor, the real return stack buffer and the speculative return stack buffer are maintained in the same structure. Thus, if the entries in the speculative return stack buffer are marked invalid, units in the instruction pipeline may instead use the entries in the real return stack buffer.
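The fallback from speculative entries to real entries may be sketched as follows. The layout shown, two lists and a single valid flag held in one object, is an assumption chosen for brevity and is not asserted to match the actual hardware organization; all names are hypothetical.

```python
class CombinedReturnStack:
    """Real and speculative return addresses kept in one structure (assumed layout)."""

    def __init__(self):
        self.real = []                 # return addresses of executed CALLs
        self.speculative = []          # return addresses of decoded CALLs
        self.speculative_valid = True

    def decode_call(self, return_address):
        self.speculative.append(return_address)

    def execute_call(self, return_address):
        self.real.append(return_address)

    def branch_mispredict(self):
        # Speculative entries can no longer be trusted.
        self.speculative_valid = False

    def predict_return(self):
        # Prefer the speculative top of stack; fall back to the real one.
        if self.speculative_valid and self.speculative:
            return self.speculative[-1]
        return self.real[-1] if self.real else None


stack = CombinedReturnStack()
stack.decode_call(0x1005)
stack.execute_call(0x1005)       # first CALL is decoded and then executed
stack.decode_call(0x3005)        # second CALL is decoded on a predicted path only
before = stack.predict_return()  # speculative entry is used
stack.branch_mispredict()
after = stack.predict_return()   # real entry is used instead
```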
The above-described systems work reasonably well in an instruction pipeline having a single pipeline restart point and a single instruction source (i.e., the instruction fetch unit). However, in a system that includes multiple restart points and multiple instruction sources, modifications are needed to provide the instruction fetch unit with current and reasonably accurate information.