1. Field of the Invention
This invention relates to processors, and in particular to out-of-order execution control in a processor having multiple execution units.
2. Description of the Related Art
General-purpose computers execute programs which are typically represented in executable form as ordered sequences of machine instructions. Human-readable representations of a program are converted to sequences of machine instructions for a desired target architecture, e.g., to object code for a processor conforming to the x86 processor architecture, in a process known as compilation. Typically, computer programs are designed, coded, and compiled with a simplifying assumption that the resulting object code will be executed in sequential order. However, despite this assumption, modern processor design techniques seek to exploit opportunities for concurrent execution of machine instructions, i.e., instruction parallelism.
To maximize computational throughput, superscalar techniques can be used to map instruction parallelism to multiple execution units. In contrast, pipelining techniques involve the exploitation of instruction parallelism within stages of a single functional unit or execution path. Superscalar techniques, which are known in the art of superscalar processor design, include out-of-order instruction issue, out-of-order instruction completion, and speculative execution of instructions.
Out-of-order instruction issue involves the issuance of instructions to execution units with little regard for the actual order of instructions in executing code. A superscalar processor which exploits out-of-order issue need only be constrained by dependencies between the output (results) of a given instruction and the inputs (operands) of subsequent instructions in formulating its instruction dispatch sequence. Out-of-order completion, on the other hand, is a technique which allows a given instruction to complete (e.g., store its result) prior to the completion of an instruction which precedes it in the program sequence. Finally, speculative execution involves the execution of an instruction sequence based on predicted outcomes (e.g., of a branch). Speculative execution (i.e., execution under the assumption that branches are correctly predicted) allows a processor to execute instructions without waiting for branch conditions to actually be evaluated. Assuming that branches are predicted correctly more often than not, and assuming that a reasonably efficient method of undoing the results of an incorrect prediction is available, the instruction parallelism (i.e., the number of instructions available for parallel execution) will typically be increased by speculative execution (see Johnson, Superscalar Processor Design, Prentice-Hall, Inc., New Jersey, 1991, pp. 63-77 for analysis).
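The dependency constraint on out-of-order issue described above can be illustrated with a minimal sketch. The names used here (Instr, issue_out_of_order, num_units, latency) are hypothetical and chosen only for illustration. The sketch assumes that each register is written at most once and is written before it is read, so that only dependencies between results and later operands need to be tracked; it is not a description of any particular processor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str          # label used in the printed schedule
    dest: str          # register written by the instruction
    srcs: tuple        # registers read by the instruction
    latency: int = 1   # cycles until the result is usable


def issue_out_of_order(program, num_units=2):
    """Return one list of instruction names per cycle.

    An instruction may issue ahead of older instructions as soon as all of
    its operands are available and an execution unit is free.
    """
    produced = {i.dest for i in program}  # registers written by the program
    ready_at = {}                         # register -> first cycle it is usable
    pending = list(program)               # kept in program order
    schedule = []
    cycle = 0
    while pending:
        issued = []
        for instr in pending:
            if len(issued) == num_units:
                break
            # Registers never written by the program hold initial values.
            operands_ready = all(
                src not in produced or ready_at.get(src, float("inf")) <= cycle
                for src in instr.srcs
            )
            if operands_ready:
                issued.append(instr)
        # Results become visible only after selection, so two instructions
        # issued in the same cycle cannot feed each other.
        for instr in issued:
            pending.remove(instr)
            ready_at[instr.dest] = cycle + instr.latency
        schedule.append([i.name for i in issued])
        cycle += 1
    return schedule


program = [
    Instr("load", "r1", ("r0",), latency=3),
    Instr("add", "r2", ("r1",)),        # depends on load
    Instr("mul", "r3", ("r4", "r5")),   # independent of load and add
    Instr("sub", "r6", ("r3", "r4")),   # depends on mul
]
schedule = issue_out_of_order(program, num_units=2)
print(schedule)  # [['load', 'mul'], ['sub'], [], ['add']]
```

In this example the mul and sub instructions issue ahead of the older add instruction, which must wait for the three-cycle load to produce r1.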
Executing instructions out of sequential order, i.e., issuing and completing instructions out of sequential order, can increase a superscalar processor's performance by allowing the superscalar processor to keep multiple execution units operating in parallel, thereby improving throughput. Accordingly, a scheduler for a superscalar processor can improve overall performance by determining which instructions can be executed out of order and providing, or dispatching, those instructions to appropriate execution units. A scheduler for a superscalar processor must also handle interrupts and traps. Many processor architectures, including the x86 processor architecture, require that an architectural state be known just before or after an instruction generates an error, interrupt, or trap. This presents a difficulty when instructions are executed out of sequential order. Therefore, the scheduler must be able to undo instructions and reconstruct the system's state as if instructions had executed in sequential order.
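The requirement that state be recoverable as if instructions had executed sequentially can be illustrated with a minimal sketch of one well-known approach, a reorder buffer, in which results are held aside and committed to architectural state only in program order. This section does not itself prescribe that mechanism; the class and method names (ReorderBuffer, dispatch, complete, retire) are hypothetical and chosen only for illustration.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self, registers):
        self.arch = dict(registers)  # committed (architectural) register state
        self.entries = deque()       # in-flight instructions, in program order

    def dispatch(self, name, dest):
        """Allocate an entry at the tail, in program order."""
        entry = {"name": name, "dest": dest, "value": None,
                 "done": False, "fault": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, fault=False):
        """Record a result; completion may occur in any order."""
        entry["value"] = value
        entry["fault"] = fault
        entry["done"] = True

    def retire(self):
        """Commit finished instructions from the head, in program order.

        Returns the name of a faulting instruction, or None. On a fault,
        all younger results are discarded without ever reaching self.arch.
        """
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["fault"]:
                self.entries.clear()
                return head["name"]
            self.arch[head["dest"]] = head["value"]
        return None


rob = ReorderBuffer({"r1": 0, "r2": 0, "r3": 0})
i1 = rob.dispatch("i1", "r1")
i2 = rob.dispatch("i2", "r2")
i3 = rob.dispatch("i3", "r3")

rob.complete(i3, 30)           # youngest instruction finishes first
rob.retire()                   # nothing commits: i1 is still outstanding
rob.complete(i1, 10)
rob.retire()                   # i1 commits; i2 blocks i3
rob.complete(i2, fault=True)   # i2 traps
faulting = rob.retire()

print(faulting, rob.arch)  # i2 {'r1': 10, 'r2': 0, 'r3': 0}
```

Although i3 completed before i2 trapped, its result never reaches the committed state, so the state seen at the trap is the one that sequential execution up to i2 would have produced.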
Architectural designs for exploiting the instruction parallelism associated with each of these techniques have been proposed in a variety of articles and texts. For a discussion, see Johnson, pp. 127-146 (out-of-order issue), pp. 103-126 (out-of-order completion and dependency), and pp. 87-102 (branch misprediction recovery).