1. Field of the Invention
This invention is related to the field of superscalar microprocessors and, more particularly, to reorder buffers having future files.
2. Description of the Relevant Art
Superscalar microprocessors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term "clock cycle" refers to an interval of time accorded to various stages of an instruction processing pipeline within the microprocessor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term "instruction processing pipeline" is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Although the pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
In order to increase performance, superscalar microprocessors often employ out of order execution. The instructions within a program are ordered, such that a first instruction is intended to be executed before a second instruction, etc. When the instructions are executed in the order specified, the intended functionality of the program is realized. However, instructions may be executed in any order as long as the original functionality is maintained. For example, a second instruction which does not depend upon a first instruction may be executed prior to the first instruction, even if the first instruction is prior to the second instruction in program order. A second instruction depends upon a first instruction if a result produced by the first instruction is employed as an operand of the second instruction. The second instruction is said to have a dependency upon the first instruction.
Another hazard of out of order execution occurs when two instructions update the same destination storage location. If the instruction which is second in the original program sequence executes first, then that instruction must not update the destination until the first instruction has executed. Often, superscalar microprocessors employ a reorder buffer in order to correctly handle dependency checking and multiple updates to a destination, among other things. Instructions are stored into the reorder buffer in program order, typically as the instructions are dispatched to execution units (perhaps being stored in reservation stations associated therewith). The results of the instructions are stored into the destinations from the reorder buffer in program order. However, results may be provided to the reorder buffer in any order. The reorder buffer stores each result with the instruction which generated the result until that instruction is selected for storing its result into the destination.
A reorder buffer is configured to store a finite number of instructions, defining a maximum number of instructions which may be concurrently outstanding within the superscalar microprocessor. Generally speaking, out of order execution occurs more frequently as the finite number is increased. For example, the execution of an instruction which is foremost within the reorder buffer in program order may be delayed. Instructions subsequently dispatched into the reorder buffer which are not dependent upon the delayed instruction may execute and store results in the buffer. Out of order execution may continue until the reorder buffer becomes full, at which point dispatch is suspended until instructions are deleted from the reorder buffer. Therefore, a larger number of storage locations within the reorder buffer generally leads to increased performance by allowing more instructions to be outstanding before instruction dispatch (and out of order execution) stalls.
Unfortunately, larger reorder buffers complicate dependency checking. One or more source operands of an instruction to be dispatched may be destination operands of outstanding instructions within the reorder buffer. As used herein, a source operand of an instruction is a value to be operated upon by the instruction in order to produce a result. Conversely, a destination operand is the result of the instruction. Source and destination operands of an instruction are generally referred to as operand information. An instruction specifies the location storing the source operands and the location in which to store the destination operand. An operand may be stored in a register (a "register operand") or a memory location (a "memory operand"). As used herein, a register is a storage location included within the microprocessor which is used to store instruction results. Registers may be specified as source or destination storage locations for an instruction.
The locations from which to retrieve source operands for an instruction to be dispatched are compared to the locations designated for storing destination operands of instructions stored within the reorder buffer. If a dependency is detected and the corresponding instruction has executed, the result stored in the reorder buffer may be forwarded for use by the dispatching instruction. If the instruction has not yet executed, a tag identifying the instruction may be forwarded such that the result may be provided when the instruction is executed. This tag is known as a "reorder buffer tag."
When the number of instructions storable in the reorder buffer is large, the number of comparisons for performing dependency checking is also large. Generally speaking, the total number of comparisons which must be provided for is the number of possible operands of an instruction multiplied by the number of instructions which may be concurrently dispatched, further multiplied by the number of instructions which may be stored in the reorder buffer.
One particular implementation of a reorder buffer employs a future file to reduce dependency checking complexity. The future file replaces the large block of comparators and prioritization logic ordinarily employed by reorder buffers for dependency checking. The future file includes a number of storage locations, each corresponding to a particular register implemented within the microprocessor. These storage locations are designed to store predicted future states of the microprocessor's registers for use by instructions that are executing out of order. A storage location can store either the reorder buffer tag or instruction result, if the instruction has executed, of the last instruction in program order to update the register that corresponds to that particular storage location. The reorder buffer provides the value (either reorder buffer tag or instruction result) to be stored in the storage location corresponding to a register when the register is used as a source operand for another instruction.
While future files improve the efficiency of dependency checking, they too have drawbacks. In particular, traditional implementations of reorder buffers utilizing future files suffer from slow recovery times for branch mispredictions. This is because the entire future file must be recovered after every branch misprediction.
Furthermore, while future files reduce the number of comparators used (relative to reorder buffers without future files), they still require a large number of comparators to implement dependency checking. Thus, a reorder buffer having a future file using even fewer comparators and capable of faster branch misprediction recovery is desired.