1. Field of the Invention
This invention is related to the field of microprocessors and, more particularly, to dependency checking mechanisms within microprocessors.
2. Description of the Relevant Art
Superscalar microprocessors achieve high performance by executing multiple instructions per clock cycle and by choosing the shortest possible clock cycle consistent with the design. As used herein, the term "clock cycle" refers to an interval of time accorded to various stages of an instruction processing pipeline within the microprocessor. Storage devices (e.g. registers and arrays) capture their values according to the clock cycle. For example, a storage device may capture a value according to a rising or falling edge of a clock signal defining the clock cycle. The storage device then stores the value until the subsequent rising or falling edge of the clock signal, respectively. The term "instruction processing pipeline" is used herein to refer to the logic circuits employed to process instructions in a pipelined fashion. Although the pipeline may be divided into any number of stages at which portions of instruction processing are performed, instruction processing generally comprises fetching the instruction, decoding the instruction, executing the instruction, and storing the execution results in the destination identified by the instruction.
In order to increase performance, superscalar microprocessors often employ out of order execution. The instructions within a program are ordered, such that a first instruction is intended to be executed before a second instruction, etc. When the instructions are executed in the order specified, the intended functionality of the program is realized. However, instructions may be executed in any order as long as the original functionality is maintained. For example, a second instruction which does not depend upon a first instruction may be executed prior to the first instruction, even if the first instruction is prior to the second instruction in program order. A second instruction depends upon a first instruction if a result produced by the first instruction is employed as an operand of the second instruction. The second instruction is said to have a dependency upon the first instruction.
Another hazard of out of order execution occurs when two instructions update the same destination storage location. If the instruction which is second in the original program sequence executes first, then that instruction must not update the destination until the first instruction has executed. Often, superscalar microprocessors employ a reorder buffer in order to correctly handle multiple updates to a destination, among other things. Instructions are stored into the reorder buffer in program order, typically as the instructions are dispatched to execution units (perhaps being stored in reservation stations associated therewith). The results of the instructions are stored into the destinations from the reorder buffer in program order. However, results may be provided to the reorder buffer in any order. The reorder buffer stores each result with the instruction which generated the result until that instruction is selected for storing its result into the destination.
A reorder buffer is configured to store a finite number of instructions, defining a maximum number of instructions which may be concurrently outstanding within the superscalar microprocessor. Generally speaking, out of order execution occurs more frequently as the finite number is increased. For example, the execution of an instruction which is foremost within the reorder buffer in program order may be delayed. Instructions subsequently dispatched into the reorder buffer which are not dependent upon the delayed instruction may execute and store results in the buffer. Out of order execution may continue until the reorder buffer becomes full, at which point dispatch is suspended until instructions are deleted from the reorder buffer. Therefore, a larger number of storage locations within the reorder buffer generally leads to increased performance by allowing more instructions to be outstanding before instruction dispatch (and out of order execution) stalls.
Unfortunately, larger reorder buffers complicate dependency checking. One or more source operands of an instruction to be dispatched may be destination operands of outstanding instructions within the reorder buffer. As used herein, a source operand of an instruction is a value to be operated upon by the instruction in order to produce a result. Conversely, a destination operand is the result of the instruction. An instruction specifies the location storing the source operands and the location in which to store the destination operand. An operand may be stored in a register (a "register operand") or a memory location (a "memory operand"). As used herein, a register is a storage location included within the microprocessor which is used to store instruction results. Registers may be specified as source or destination storage locations for an instruction. The registers may be defined by the microprocessor architecture or specified by the particular microprocessor implementation.
The locations from which to retrieve source operands for an instruction to be dispatched are compared to the locations designated for storing destination operands of instructions stored within the reorder buffer. If a dependency is detected and the corresponding instruction has executed, the result stored in the reorder buffer may be forwarded for use by the dispatching instruction. If the instruction has not yet executed, a tag identifying the instruction may be forwarded such that the result may be provided when the instruction is executed.
When the number of instructions storable in the reorder buffer is large, the number of comparisons for performing dependency checking is also large. Generally speaking, the total number of comparisons which must be provided for is the number of possible operands of an instruction multiplied by the number of instructions which may be concurrently dispatched, further multiplied by the number of instructions which may be stored in the reorder buffer. As the number of instructions increases, the amount of circuitry employed to perform the comparisons increases dramatically. The large amount of circuitry may undesirably increase the total amount of circuitry employed by the microprocessor. Additionally, more than one destination operand within the reorder buffer may be stored within the storage location indicated for a source operand. Circuitry is therefore employed to detect the last of the destination operands indicated by the comparisons, in order to correctly detect the dependency (i.e. the instruction which stores a result into a storage location used for a source operand and which is nearest to the dispatching instruction in program order is the instruction upon which the dispatching instruction depends). It is therefore desirable to reduce the complexity of dependency checking for reorder buffers.