1. Field of the Invention
The present invention pertains to the field of computer systems. More particularly, this invention relates to selection of instructions ready for dispatch in a processor that performs out-of-order instruction execution.
2. Background
Typical prior computer processors implement in-order instruction execution pipelines. An in-order processor usually fetches an instruction stream from a memory, and executes each instruction in the instruction stream according to a sequential program order. Such in-order instruction execution ensures that data dependencies among the instructions are strictly observed.
A processor may perform out-of-order instruction execution to increase instruction execution performance. Such a processor executes ready instructions in the instruction stream ahead of earlier instructions in the program order that are not ready. A ready instruction is typically an instruction having fully assembled source data and available execution resources. Typically, the source data or operands for an instruction comprise the contents of one or more internal processor registers, or immediate source data, or a combination thereof.
Such out-of-order execution improves processor performance because the instruction execution pipeline of the processor does not stall while awaiting source data or execution resources for a non-ready instruction. For example, an instruction in the instruction stream may require source data from a processor register, wherein the processor register is loaded by a pending external memory fetch operation. Such an instruction awaiting the results of the external memory fetch does not stall the execution of later instructions in the instruction stream that are ready to execute.
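By way of illustration only, the difference between in-order and out-of-order dispatch described above may be sketched as follows. The sketch assumes a simplified model that is not taken from the description: at most one instruction is dispatched per cycle, and each instruction is characterized solely by the cycle at which it becomes ready. The function names and the example values are hypothetical.

```python
def in_order_dispatch(ready_at):
    """Dispatch strictly in program order; a non-ready instruction stalls all later ones."""
    cycle = -1
    dispatched = []
    for ready_cycle in ready_at:
        # The next instruction cannot dispatch before it is ready,
        # nor before the cycle following the previous dispatch.
        cycle = max(cycle + 1, ready_cycle)
        dispatched.append(cycle)
    return dispatched


def out_of_order_dispatch(ready_at):
    """Each cycle, dispatch the oldest pending instruction that is ready."""
    pending = dict(enumerate(ready_at))
    dispatched = [None] * len(ready_at)
    cycle = 0
    while pending:
        ready = [i for i, ready_cycle in pending.items() if ready_cycle <= cycle]
        if ready:
            oldest = min(ready)
            dispatched[oldest] = cycle
            del pending[oldest]
        cycle += 1
    return dispatched


# Hypothetical stream: instruction 1 awaits an external memory fetch until cycle 5.
ready_at = [0, 5, 0, 0]
in_order = in_order_dispatch(ready_at)
out_of_order = out_of_order_dispatch(ready_at)
```

In this model the in-order pipeline holds instructions 2 and 3 behind the stalled instruction 1, whereas the out-of-order pipeline dispatches them in cycles 1 and 2 while instruction 1 waits.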
A processor that performs out-of-order instruction execution typically buffers pending instructions in a dispatch buffer while determining whether the pending instructions are ready for dispatch. Typically, a pending instruction is ready for dispatch if the execution resources and source data required by the pending instruction are available.
Pending instructions in the dispatch buffer are often data dependent on previously dispatched instructions. A pending instruction is data dependent on a dispatched instruction if the pending instruction requires the execution results of the dispatched instruction as source data. As a consequence, a pending data dependent instruction must usually wait for the execution results of one or more dispatched instructions before becoming ready for dispatch.
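The ready determination described above may be sketched, under stated assumptions, as follows. The class name, field names, and execution unit labels are hypothetical and chosen only for illustration. A source value of None stands for source data that has not yet been produced, for example by a previously dispatched instruction.

```python
from dataclasses import dataclass, field


@dataclass
class PendingInstruction:
    name: str
    unit: str  # execution resource required (hypothetical label)
    # Maps each source tag to its value; None means the data is not yet available.
    sources: dict = field(default_factory=dict)

    def is_ready(self, free_units):
        """Ready when all source data is present and the required unit is free."""
        has_data = all(value is not None for value in self.sources.values())
        return has_data and self.unit in free_units


def select_ready(dispatch_buffer, free_units):
    """Return the pending instructions that are ready for dispatch, in buffer order."""
    return [instr for instr in dispatch_buffer if instr.is_ready(free_units)]


dispatch_buffer = [
    PendingInstruction("load_use", "alu", {"r1": None}),  # data dependent, still waiting
    PendingInstruction("add", "alu", {"r2": 5, "r3": 7}),
    PendingInstruction("fmul", "fpu", {"r4": 2.0}),
]
ready_names = [i.name for i in select_ready(dispatch_buffer, free_units={"alu"})]
```

With only the hypothetical "alu" unit free, only the "add" instruction is selected: "load_use" lacks source data, and "fmul" lacks an available execution resource.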
Prior processors that dispatch instructions out-of-order commonly employ content addressable memories to link the execution results from dispatched instructions to the pending instructions that are data dependent on the execution results. For example, such a prior processor may employ a content addressable memory to match one or more source data tags for a pending instruction to result tags of the execution results. The result tags are typically detected by the content addressable memory during write back of the execution results to a result buffer or internal processor registers.
Such a prior content addressable memory scheme typically generates write enable signals if the source data tags match the result tags. The write enable signals are usually employed to write the execution results into the dispatch buffer during write back. Thereafter, the ready instructions are typically selected from among the pending instructions, and the ready instructions are scheduled for dispatch.
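A minimal software sketch of such a tag matching scheme follows. It is an illustrative model only, not a description of any particular prior processor: the entry layout, the function names, and the tag values are assumptions. A hardware content addressable memory performs all comparisons in parallel, whereas the loops below perform them sequentially.

```python
def cam_match(entries, result_tag):
    """Compare the result tag against every source tag.

    Returns one write enable bit per source slot of each dispatch buffer entry.
    """
    return [[tag == result_tag for tag in entry["src_tags"]] for entry in entries]


def write_back(entries, result_tag, result_value):
    """Write the execution result into every source slot whose tag matches."""
    enables = cam_match(entries, result_tag)
    for entry, row in zip(entries, enables):
        for slot, enable in enumerate(row):
            if enable:
                entry["src_data"][slot] = result_value
    return enables


# Hypothetical dispatch buffer: each entry holds two source tags and their data.
entries = [
    {"src_tags": [3, 7], "src_data": [None, 10]},
    {"src_tags": [4, 3], "src_data": [20, None]},
    {"src_tags": [5, 6], "src_data": [None, None]},
]
enables = write_back(entries, result_tag=3, result_value=99)
```

After the write back of the result tagged 3, the first two entries hold complete source data and may be considered in a subsequent ready selection, while the third entry continues to wait.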
Unfortunately, in such prior processors, the selection of ready instructions from the pending instructions and the scheduling of the ready instructions begin only after the execution results for data dependent instructions are written into the dispatch buffer. Moreover, such ready determination and instruction scheduling functions usually require several processor clock cycles for completion. As a consequence, a multicycle latency typically occurs between the time that the execution results are available and the time that the execution results are dispatched along with the data dependent instructions. Such a latency prevents back-to-back dispatching of data dependent instructions, and thereby reduces overall instruction throughput in such a processor.
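The effect of such a latency on a chain of data dependent instructions may be estimated with a simple model, offered here only as a sketch. The model assumes that each instruction in the chain depends on the result of the one before it, that every instruction takes the same number of execution cycles, and that a fixed number of scheduling cycles elapses between each result becoming available and the dispatch of its dependent instruction. The function name and the cycle counts are illustrative and do not come from the description.

```python
def chain_cycles(n_instructions, exec_cycles, schedule_cycles):
    """Total cycles to execute a chain of serially dependent instructions.

    Each of the n instructions executes for exec_cycles, and each of the
    n - 1 hand-offs between them costs schedule_cycles.
    """
    if n_instructions <= 0:
        return 0
    return n_instructions * exec_cycles + (n_instructions - 1) * schedule_cycles


# Hypothetical chain of four single-cycle instructions.
back_to_back = chain_cycles(4, exec_cycles=1, schedule_cycles=0)
with_latency = chain_cycles(4, exec_cycles=1, schedule_cycles=2)
```

Under these assumed values, back-to-back dispatch completes the chain in 4 cycles, whereas a two-cycle scheduling latency extends it to 10 cycles.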