Typical prior computer systems implement in-order execution pipelines. An in-order execution processor usually fetches an instruction stream from a memory and executes each instruction in the instruction stream according to a sequential program order. Accordingly, an instruction in an in-order execution processor generally begins execution after completion of execution of prior instructions in the program order.
Hence, a delay in completion of execution of an earlier instruction in the program order usually delays the beginning of execution of a later instruction. Such delay in completion has several causes. For example, the earlier instruction may need to wait for an operand to be fetched from an external storage device. The resulting delay in the beginning of execution of the later instruction similarly delays the beginning of execution of even later instructions in the program order. As a consequence, the instruction execution performance of such in-order execution processors is generally limited.
A processor may incorporate an out-of-order execution pipeline to increase the instruction execution performance. Out-of-order execution processors typically execute instructions according to the availability of the resources required to execute each instruction. Such a processor typically executes a later instruction in the sequential program order, for which all of the required resources are available, ahead of an earlier instruction that is still awaiting availability of its required resources. The instruction execution performance of such out-of-order execution processors is enhanced over that of in-order processors because a delay in the completion of earlier instructions in the program order typically does not delay the beginning of execution of later instructions in the program order.
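The difference between the two execution models can be illustrated with a minimal timing sketch. The model is an assumption made for illustration only: each instruction is described by a hypothetical pair (cycle at which its operands become available, number of execution cycles), the instructions are mutually independent, and the out-of-order case has unlimited execution resources.

```python
from typing import List, Tuple

# Hypothetical model: (operand_ready_cycle, exec_cycles) per instruction.
Instr = Tuple[int, int]


def in_order_start_times(instrs: List[Instr]) -> List[int]:
    """Each instruction starts only after the previous one has completed."""
    starts = []
    prev_finish = 0
    for ready, cycles in instrs:
        start = max(ready, prev_finish)
        starts.append(start)
        prev_finish = start + cycles
    return starts


def out_of_order_start_times(instrs: List[Instr]) -> List[int]:
    """Assumes independent instructions and unlimited execution resources,
    so each instruction starts as soon as its operands are available."""
    return [ready for ready, _ in instrs]


# The first instruction waits 10 cycles for an operand; the others do not.
stream = [(10, 1), (0, 1), (0, 1)]
in_order = in_order_start_times(stream)          # [10, 11, 12]
out_of_order = out_of_order_start_times(stream)  # [10, 0, 0]
```

In this sketch the operand delay of the first instruction postpones every later instruction in the in-order case, while the later instructions in the out-of-order case start immediately.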
An out-of-order execution processor typically buffers the fetched instructions, determines whether each buffered instruction has the resources required for execution available, and then dispatches each instruction for execution as the resources required for execution are available. The resources required to execute an instruction usually include operands of the instruction and hardware execution resources. Instructions having such required resources available are said to be ready for execution.
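The readiness determination described above can be sketched as follows. The names `BufferedInstr` and `select_ready`, and the rule that each execution resource accepts at most one instruction per cycle, are assumptions made for illustration rather than features of any particular processor.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class BufferedInstr:
    name: str
    sources: FrozenSet[str]  # operands that must be available
    unit: str                # execution resource required by the opcode


def select_ready(buffer: Iterable[BufferedInstr],
                 available_operands: Set[str],
                 free_units: Set[str]) -> List[str]:
    """Return the names of instructions that can be dispatched this cycle.

    An instruction is ready when all of its operands are available and its
    execution resource is free. Assumption: one instruction per unit per cycle.
    """
    free = set(free_units)
    ready = []
    for ins in buffer:
        if ins.sources <= available_operands and ins.unit in free:
            ready.append(ins.name)
            free.discard(ins.unit)
    return ready


buffer = [
    BufferedInstr("I1", frozenset({"A"}), "mul"),  # waits for operand A
    BufferedInstr("I2", frozenset(), "add"),       # ready
    BufferedInstr("I3", frozenset(), "add"),       # operands ready, unit taken
]
dispatched = select_ready(buffer, available_operands=set(),
                          free_units={"add", "mul"})
```

Here only `I2` is dispatched: `I1` lacks an operand, and `I3` has its operands but loses the single adder to `I2` in this cycle.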
Typically, the operands required to execute an instruction are available as immediate data in the instruction stream or as an execution result of another instruction. For example, in an instruction stream comprising instructions 101 and 102, wherein
    Instruction 101: A=5+3; and
    Instruction 102: B=A*9,
instruction 101 has both operands available in the instruction stream because instruction 101 includes constant values 5 and 3 as operands.
On the other hand, the operand `A` for instruction 102 is the execution result of instruction 101. An instruction, such as 102, which requires the execution result of another instruction (101 in the above example) for an operand is referred to as a data-dependent instruction. An instruction such as 101 that provides an operand for a data-dependent instruction (102 in the above example) is referred to as a source instruction.
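The relationship between a source instruction and a data-dependent instruction can be found mechanically. The following is a minimal sketch that uses the two example instructions above; the textual instruction format and the helper names `parse` and `find_dependencies` are assumptions made for illustration.

```python
import re
from typing import Dict, List, Set, Tuple


def parse(text: str) -> Tuple[str, Set[str]]:
    """Split 'DEST=EXPR' into the destination and the named source operands.
    Numeric constants are immediate data and are therefore not returned."""
    dest, expr = text.split("=")
    sources = set(re.findall(r"[A-Za-z_]\w*", expr))
    return dest.strip(), sources


def find_dependencies(program: Dict[int, str]) -> List[Tuple[int, int, str]]:
    """Return (source_instruction, dependent_instruction, operand) triples."""
    last_writer: Dict[str, int] = {}
    deps = []
    for num in sorted(program):  # walk in sequential program order
        dest, sources = parse(program[num])
        for operand in sorted(sources):
            if operand in last_writer:
                deps.append((last_writer[operand], num, operand))
        last_writer[dest] = num
    return deps


program = {101: "A=5+3", 102: "B=A*9"}
deps = find_dependencies(program)  # [(101, 102, 'A')]
```

The single triple states that instruction 102 depends on instruction 101 through operand `A`, while instruction 101 has no named source operands because 5 and 3 are immediate values.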
In an out-of-order execution processor, a latency typically exists between the time the execution result of a source instruction is generated and the time a corresponding data-dependent instruction begins execution. Such latency typically includes time for providing the execution result to the data-dependent instruction, time for determining whether the data-dependent instruction is ready for execution, and time for dispatching the data-dependent instruction to the appropriate execution resource. Unfortunately, such latencies usually decrease the instruction throughput performance of such an out-of-order execution processor.
The resources required for executing an instruction include an execution resource specified by an operation code of the instruction. For example, an integer addition instruction requires an execution resource that performs integer addition operations.
The required execution resource for an instruction is often not available to accept the instruction for execution at the time the required operands are available. For example, a non-pipelined execution resource is generally not available to accept additional instructions while executing another instruction.
In addition, a latency typically exists between the time an execution resource becomes available and the time an instruction requiring that execution resource begins execution. Such latency usually includes time for determining the availability of the execution resource, time for determining whether the instruction is ready for execution, and time for dispatching the instruction to the execution resource. Unfortunately, such latency causes the execution resource to remain idle after becoming available. Such idle time reduces the instruction throughput performance of the processor because the execution resources are not completely utilized.
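The effect of such idle time on utilization can be estimated with a simple calculation. The sketch below assumes a non-pipelined execution resource that runs a sequence of instructions back to back, with a fixed dispatch latency between the completion of one instruction and the start of the next; the function name and the example cycle counts are hypothetical.

```python
def unit_utilization(exec_cycles: int, dispatch_latency: int,
                     n_instructions: int) -> float:
    """Fraction of cycles in which the execution resource is busy.

    Assumptions: every instruction takes exec_cycles, and the resource sits
    idle for dispatch_latency cycles between consecutive instructions.
    """
    if n_instructions <= 0:
        return 0.0
    busy = exec_cycles * n_instructions
    idle = dispatch_latency * (n_instructions - 1)  # gaps between instructions
    return busy / (busy + idle)


# Four 3-cycle instructions separated by 1 idle cycle each:
# 12 busy cycles out of 15 total.
utilization = unit_utilization(exec_cycles=3, dispatch_latency=1,
                               n_instructions=4)
```

Under these assumptions, reducing the dispatch latency to zero is the only way to bring the utilization of the resource to 1.0.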
In addition, such an out-of-order execution processor typically resolves conflicts among instructions for hardware resources other than execution resources in order to determine whether an instruction is ready for execution. Such conflicts arise, for example, if more than one instruction requires a resource that cannot process multiple instructions at the same time.
For example, an out-of-order execution processor usually includes a write-back port shared by multiple execution resources. The execution resources generally provide execution results to the buffered data-dependent instructions using the write-back port. Such a write-back port typically permits transfer of only one execution result during each clock cycle. Unfortunately, a conflict usually arises for the write-back port when more than one execution resource generates an execution result for a data-dependent instruction at the same time.
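The write-back conflict can be illustrated with a small arbitration sketch. The first-come, first-served policy, the function name `arbitrate_write_back`, and the execution resource names are assumptions made for illustration; the text above states only that one execution result can be transferred per clock cycle.

```python
from collections import deque
from typing import Dict, List, Tuple


def arbitrate_write_back(
        results_by_cycle: Dict[int, List[str]]) -> List[Tuple[int, str]]:
    """Grant the shared write-back port to one execution result per cycle.

    results_by_cycle maps a cycle number to the execution resources that
    produce a result in that cycle. Results that lose arbitration wait in a
    FIFO queue (an assumed policy) and are granted in later cycles.
    """
    grants: List[Tuple[int, str]] = []
    if not results_by_cycle:
        return grants
    pending = deque()
    cycle = min(results_by_cycle)
    last = max(results_by_cycle)
    while cycle <= last or pending:
        pending.extend(results_by_cycle.get(cycle, []))
        if pending:
            grants.append((cycle, pending.popleft()))
        cycle += 1
    return grants


# Two resources finish in cycle 0, so one of them must wait a cycle,
# which in turn delays the result produced in cycle 1.
grants = arbitrate_write_back({0: ["alu", "mul"], 1: ["load"]})
```

In this example the grants are `alu` in cycle 0, `mul` in cycle 1, and `load` in cycle 2, so a single conflict delays two of the three results by one cycle each.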