1. Field of the Invention
This invention is related to the field of processors and, more particularly, to instruction scheduling mechanisms within processors.
2. Description of the Related Art
Superscalar processors attempt to achieve high performance by issuing and executing multiple instructions per clock cycle and by employing the highest possible clock frequency consistent with the design. One method for increasing the number of instructions executed per clock cycle is out of order execution. In out of order execution, instructions may be executed in a different order than that specified in the program sequence (or “program order”). Certain instructions near each other in a program sequence may have dependencies which prohibit their concurrent execution, while subsequent instructions in the program sequence may not have dependencies on the previous instructions. Accordingly, out of order execution may increase performance of the superscalar processor by increasing the number of instructions executed concurrently (on the average). Another method related to out of order execution is speculative execution, in which instructions are executed subsequent to other instructions which may cause program execution to proceed down a different path than the path containing the speculative instructions. For example, instructions may be speculative if the instructions are subsequent to a particular instruction which may cause an exception. Instructions are also speculative if the instructions are subsequent to a predicted conditional branch instruction which has not yet been executed. Similarly, instructions may be out of order or speculatively scheduled, issued, etc.
Unfortunately, scheduling instructions for out of order or speculative execution presents additional hardware complexities for the processor. The term “scheduling” generally refers to selecting an instruction for execution. Typically, the processor attempts to schedule instructions as rapidly as possible to maximize the average instruction execution rate (e.g. by executing instructions out of order to deal with dependencies and hardware availability for various instruction types). These complexities may limit the clock frequency at which the processor may operate. In particular, the dependencies between instructions must be respected by the scheduling hardware. Generally, as used herein, the term “dependency” refers to a relationship between a first instruction and a subsequent second instruction in program order which requires the execution of the first instruction prior to the execution of the second instruction. A variety of dependencies may be defined. For example, a source operand dependency occurs if a source operand of the second instruction is a destination operand of the first instruction.
Generally, instructions may have one or more source operands and one or more destination operands. The source operands are input values to be manipulated according to the instruction definition to produce one or more results (which are the destination operands). Source and destination operands may be memory operands stored in a memory location external to the processor, or may be register operands stored in register storage locations included within the processor. The instruction set architecture employed by the processor defines a number of architected registers. These registers are defined to exist by the instruction set architecture, and instructions may be coded to use the architected registers as source and destination operands. An instruction specifies a particular register as a source or destination operand via a register number (or register address) in an operand field of the instruction. The register number uniquely identifies the selected register among the architected registers. A source operand is identified by a source register number and a destination operand is identified by a destination register number.
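By way of illustration only, the source operand dependency described above can be sketched in software as a comparison of register numbers. The following Python fragment is a minimal sketch and not a description of any particular hardware: the `Instruction` record, the function name `source_operand_dependencies`, and the sample register numbers are all hypothetical, and the sketch assumes that each source operand depends only on the most recent earlier instruction that wrote the same register.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instruction:
    """Hypothetical instruction record holding register numbers only."""
    name: str
    sources: Tuple[int, ...]       # source register numbers
    destinations: Tuple[int, ...]  # destination register numbers


def source_operand_dependencies(program: List[Instruction]) -> Dict[int, List[int]]:
    """Map each instruction index to the earlier instructions it depends on.

    Assumption: a source operand depends on the most recent earlier
    instruction (in program order) that has the same register number
    as a destination operand.
    """
    last_writer: Dict[int, int] = {}   # register number -> index of latest writer
    deps: Dict[int, List[int]] = {}
    for index, insn in enumerate(program):
        producers = {last_writer[r] for r in insn.sources if r in last_writer}
        deps[index] = sorted(producers)
        # Record this instruction as the latest writer of its destinations.
        for r in insn.destinations:
            last_writer[r] = index
    return deps


# Illustrative program order; register numbers are made up for the example.
program = [
    Instruction("add r3 <- r1, r2", (1, 2), (3,)),
    Instruction("sub r4 <- r3, r1", (3, 1), (4,)),  # reads r3 written by the add
    Instruction("mul r5 <- r6, r7", (6, 7), (5,)),  # no dependency on the others
]
deps = source_operand_dependencies(program)
```

In this example the second instruction depends on the first, while the third has no source operand dependency and could, in principle, be scheduled out of order with respect to the other two.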
In addition to operand dependencies, one or more types of ordering dependencies may be enforced by a processor. Ordering dependencies may be used, for example, to simplify the hardware employed or to generate correct program execution. By forcing certain instructions to be executed in order with respect to certain other instructions, hardware for handling consequences of the out of order execution of the instructions may be omitted. For example, instructions which update special registers containing general processor operating state may affect the execution of a variety of subsequent instructions which do not explicitly access the special registers. Generally, ordering dependencies may vary from microarchitecture to microarchitecture.
While the scheduling mechanism respects dependencies, it is desirable to be as aggressive as possible in scheduling instructions out of order and/or speculatively in an attempt to maximize the performance gain realized. However, the more aggressive the scheduling mechanism (i.e. the fewer conditions which may prevent a particular instruction from being scheduled), the more likely the occurrence of an incorrectly executed instruction becomes. The recovery mechanism for incorrectly executed instructions has generally been to purge the incorrectly executed instruction and all subsequent instructions from the processor pipeline and to refetch the incorrectly executed instruction (and subsequent instructions). Often, the purging and refetching are delayed from the discovery of incorrect execution for hardware simplicity (e.g. until the incorrectly executed instruction is the oldest instruction in flight). The average number of instructions actually executed per clock cycle is decreased due to the incorrect execution and the subsequent purging events. For aggressive scheduling mechanisms which encounter incorrect execution more frequently, the performance degradation attributable to these recovery mechanisms may be substantial. Accordingly, a mechanism for recovering from incorrect speculative execution which preserves performance gains made possible by aggressive speculative or out of order scheduling is desired.
The problems outlined above are in large part solved by a scheduler as described herein. The scheduler issues instruction operations for execution, but also retains the instruction operations. If a particular instruction operation is subsequently found to be incorrectly executed, the particular instruction operation may be reissued from the scheduler. Advantageously, the penalty for incorrect scheduling of instruction operations may be reduced as compared to purging the particular instruction operation and younger instruction operations from the pipeline and refetching the particular instruction operation. Performance may be increased due to the reduced penalty for incorrect execution. Furthermore, the scheduler may employ a more aggressive scheduling mechanism since the penalty for incorrect execution is reduced.
Additionally, the scheduler may maintain the dependency indications for each instruction operation which has been issued. If the particular instruction operation is reissued, the instruction operations which are dependent on the particular instruction operation (directly or indirectly) may be identified via the dependency indications. The scheduler reissues the dependent instruction operations as well. Instruction operations which are subsequent to the particular instruction operation in program order but which are not dependent on the particular instruction operation are not reissued. Accordingly, the penalty for incorrect scheduling of instruction operations may be further decreased over the purging of the particular instruction and all younger instruction operations and refetching the particular instruction operation. Performance may thus be further increased.
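The selective reissue described above may be illustrated, purely as a software analogy, by walking the retained dependency indications in program order. The Python fragment below is a minimal sketch under stated assumptions: the function name `operations_to_reissue`, the dictionary representation of the dependency indications, and the operation names are hypothetical, and it is assumed that an instruction operation depends only on instruction operations that precede it in program order.

```python
from typing import Dict, List, Set


def operations_to_reissue(
    depends_on: Dict[str, Set[str]],
    incorrect: str,
    program_order: List[str],
) -> List[str]:
    """Return the incorrectly executed operation plus its direct and
    indirect dependents, in program order.

    Assumption: producers always precede consumers in program order, so a
    single pass in program order finds indirect dependents as well.
    """
    reissue: Set[str] = {incorrect}
    for op in program_order:
        # An operation is reissued if any operation it depends on is reissued.
        if depends_on.get(op, set()) & reissue:
            reissue.add(op)
    return [op for op in program_order if op in reissue]


# Hypothetical instruction operations and dependency indications.
program_order = ["load", "add", "mul", "sub", "store"]
depends_on = {
    "add": {"load"},        # direct dependent of the load
    "mul": set(),           # independent of the load
    "sub": {"add", "mul"},  # indirect dependent of the load (through add)
    "store": {"sub"},       # indirect dependent of the load (through sub)
}
reissued = operations_to_reissue(depends_on, "load", program_order)
```

Here the "mul" operation is subsequent to the "load" in program order but does not depend on it, so it is left out of the reissued set, whereas a purge-and-refetch recovery would discard it along with the others.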
Broadly speaking, a scheduler is contemplated. The scheduler comprises an instruction buffer configured to store a first instruction operation, an issue pick circuit coupled to the instruction buffer, and a control circuit. The issue pick circuit is configured to select the first instruction operation for issue from the instruction buffer. Coupled to the issue pick circuit, the control circuit is configured to maintain a first execution state of the first instruction operation. The control circuit is configured to change the first execution state to an executing state responsive to the issue pick circuit selecting the first instruction operation for issue. Additionally, the control circuit is configured to change the first execution state to a not executed state responsive to a first signal indicating that the first instruction operation is incorrectly executed.
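The execution state transitions described above may be modeled, as a rough software analogy only, by a small state machine. The following Python fragment is a sketch under stated assumptions and is not the circuit itself: the names `ExecState`, `Entry`, `Scheduler`, `pick_and_issue`, and `signal_incorrect` are hypothetical, only the two states named above are modeled, and an oldest-first pick policy is assumed for simplicity.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class ExecState(Enum):
    NOT_EXECUTED = auto()
    EXECUTING = auto()


@dataclass
class Entry:
    """One instruction buffer entry with its execution state."""
    op: str
    state: ExecState = ExecState.NOT_EXECUTED


@dataclass
class Scheduler:
    buffer: List[Entry] = field(default_factory=list)

    def pick_and_issue(self) -> Optional[str]:
        """Select the oldest not-executed entry and mark it executing.

        The entry is retained in the buffer after issue (assumed
        oldest-first policy; a real pick circuit may differ).
        """
        for entry in self.buffer:
            if entry.state is ExecState.NOT_EXECUTED:
                entry.state = ExecState.EXECUTING
                return entry.op
        return None

    def signal_incorrect(self, op: str) -> None:
        """Model the first signal: return the operation to not executed."""
        for entry in self.buffer:
            if entry.op == op:
                entry.state = ExecState.NOT_EXECUTED


sched = Scheduler([Entry("load r1")])
first_issue = sched.pick_and_issue()   # operation issues and is retained
idle_pick = sched.pick_and_issue()     # nothing eligible while it is executing
sched.signal_incorrect("load r1")      # incorrect execution is reported
second_issue = sched.pick_and_issue()  # same retained operation is eligible again
```

Because the entry remains in the buffer after issue, returning its state to not executed is sufficient to make it eligible for selection again, without any refetch.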
Additionally, a processor is contemplated comprising a scheduler and an execution unit. The scheduler is configured to store a first instruction operation and to issue the first instruction operation for execution. The scheduler is configured to maintain a first execution state corresponding to the first instruction operation, and is configured to change the first execution state to an executing state responsive to issuing the first instruction operation. Coupled to the scheduler to receive the first instruction operation in response to an issuance thereof by the scheduler, the execution unit is configured to execute the first instruction operation. The scheduler is configured to change the first execution state to a not executed state responsive to a first signal indicating that the first instruction operation is incorrectly executed. Still further, a computer system is contemplated including the processor and an input/output (I/O) device configured to communicate between the computer system and another computer system to which the I/O device is couplable.
Furthermore, a method is contemplated. A first instruction operation is issued from a scheduler to an execution unit. The first instruction operation is retained in the scheduler subsequent to the issuing. A first signal is received that the first instruction operation executed incorrectly. The first instruction operation is reissued responsive to receiving the first signal.
Moreover, a processor is contemplated. The processor comprises a scheduler and an execution unit. The scheduler is configured to store a first instruction operation and to issue the first instruction operation for execution. The scheduler is configured to retain the first instruction operation subsequent to issuing, and is coupled to receive a first signal indicating that the first instruction operation is incorrectly executed. In response to the first signal, the scheduler is configured to reissue the first instruction operation. Coupled to the scheduler to receive the first instruction operation in response to an issuance thereof by the scheduler, the execution unit is configured to execute the first instruction operation.