It is a goal of microprocessor design to execute as many instructions per unit time as possible. This goal is furthered by processing multiple instructions from an instruction sequence in parallel, and out of order when doing so improves speed. Processors having this capability are known in the art as superscalar microprocessors.
Instruction sequences typically contain instructions, produced by a programmer, specifying operations to be performed on operands, and specifying an architected logical register into which to store a result. The operands themselves may be specified as being the contents of particular architected logical registers or constants. Inherent to the instruction sequence are data and control dependencies. For example, later instructions may depend on data produced by an earlier instruction. Therefore, to execute the instruction sequence out of order, the data and control dependencies must be managed. Otherwise, incorrect results will be produced.
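By way of illustration only, the data dependency described above (a later instruction reading a register written by an earlier instruction) can be sketched as follows. The names `Instr` and `raw_dependencies` and the register names are hypothetical and are chosen for this sketch; they do not appear in the disclosure.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction: one destination register, several sources."""
    dest: str
    srcs: Tuple[str, ...]


def raw_dependencies(program: List[Instr]) -> List[Tuple[int, int]]:
    """Return (producer, consumer) index pairs where the consumer reads a
    register last written by the producer earlier in program order."""
    last_writer: Dict[str, int] = {}
    deps: List[Tuple[int, int]] = []
    for i, ins in enumerate(program):
        for src in ins.srcs:
            if src in last_writer:
                deps.append((last_writer[src], i))
        # Record this instruction as the most recent writer of its destination.
        last_writer[ins.dest] = i
    return deps


program = [
    Instr("r1", ("r2", "r3")),  # r1 <- r2 op r3
    Instr("r4", ("r1", "r5")),  # reads r1, so it depends on instruction 0
    Instr("r6", ("r7", "r8")),  # independent of both earlier instructions
]
print(raw_dependencies(program))  # [(0, 1)]
```

In this sketch, instruction 2 has no dependency on instructions 0 or 1 and could therefore be executed ahead of them, whereas instruction 1 must wait for the result of instruction 0.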
In the case of branch instructions in the instruction sequence, or other instructions which cause exceptions or interrupts, the processor must be able to determine the correct "machine state" of the superscalar processor at a given place in the instruction sequence. The machine state is the state of all registers defined to software programmers, which includes the architected logical registers. The following example illustrates an exception in a program sequence. Assume that an instruction, situated later in the program sequence than the instruction producing the exception, was executed first and wrote data to a register. The data written would not be valid, because the machine state of the processor must be determined based only on instructions prior to the exception in the program sequence. Thus, the data produced by all instructions subsequent to the exception must be disregarded, and the state of all architected logical registers must be determined.
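The exception example above can be sketched, under simplifying assumptions, as follows. The function `precise_state`, its parameters, and the register values are hypothetical and serve only to illustrate which results contribute to the machine state; the sketch is not a description of any particular processor.

```python
from typing import Dict, List, Tuple


def precise_state(
    initial_regs: Dict[str, int],
    executed: List[Tuple[int, str, int]],
    fault_index: int,
) -> Dict[str, int]:
    """Return the architected register state at an exception.

    `executed` lists (program_index, dest, value) in the order in which the
    instructions happened to execute. Only results from instructions that
    precede `fault_index` in program order are kept, applied in program order.
    """
    regs = dict(initial_regs)
    for idx, dest, value in sorted(executed):  # sort restores program order
        if idx < fault_index:
            regs[dest] = value
    return regs


initial = {"r1": 0, "r2": 0}
# Instruction 2 executed first and wrote r2; instruction 1 raised the exception.
executed = [(2, "r2", 99), (0, "r1", 5)]
state = precise_state(initial, executed, fault_index=1)
print(state)  # {'r1': 5, 'r2': 0}
```

The value 99 written by instruction 2 is disregarded because instruction 2 follows the excepting instruction in the program sequence.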
In prior art microprocessors, the integrity of data and control flow of an instruction sequence was maintained during out of order execution of instructions by completing the instructions in the program order. Results from instructions were temporarily stored in rename registers, and the results were subsequently written back into fixed architected logical registers in the program order. Thus, the fixed architected logical registers always contained the correct machine state of the processor.
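A minimal sketch of this prior art scheme, assuming a simplified cycle-by-cycle model, is given below. The function `in_order_writeback`, the `finish_cycle` list, and the modeling of rename registers as a dictionary keyed by program index are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def in_order_writeback(
    program: List[Tuple[str, int]],
    finish_cycle: List[int],
    arch_regs: Dict[str, int],
) -> Tuple[Dict[str, int], List[Tuple[int, int]]]:
    """Hold out-of-order results in rename registers and write them back to
    the architected registers strictly in program order.

    program[i] is (dest, value); finish_cycle[i] is the cycle in which the
    result of instruction i reaches its rename register. Assumes a non-empty
    program. Returns the final registers and a (cycle, index) completion trace.
    """
    regs = dict(arch_regs)
    rename: Dict[int, Tuple[str, int]] = {}  # program index -> pending result
    next_to_complete = 0
    trace: List[Tuple[int, int]] = []
    for cycle in range(max(finish_cycle) + 1):
        # Results that finish executing this cycle land in rename registers.
        for i, done in enumerate(finish_cycle):
            if done == cycle:
                rename[i] = program[i]
        # Complete only the oldest outstanding instruction, then the next, etc.
        while next_to_complete in rename:
            dest, value = rename.pop(next_to_complete)
            regs[dest] = value
            trace.append((cycle, next_to_complete))
            next_to_complete += 1
    return regs, trace


program = [("r1", 10), ("r2", 20), ("r1", 30)]
finish_cycle = [3, 1, 2]  # instructions 1 and 2 finish before instruction 0
regs, trace = in_order_writeback(program, finish_cycle, {"r1": 0, "r2": 0})
print(regs)   # {'r1': 30, 'r2': 20}
print(trace)  # [(3, 0), (3, 1), (3, 2)]
```

Although instructions 1 and 2 finish executing in cycles 1 and 2, none of their results reach the architected registers until instruction 0 completes in cycle 3, so the architected registers reflect the program order at every cycle.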
A consequence of having to complete instructions in the program order is that instructions that take multiple cycles to execute will cause a bottleneck to occur at the instruction completion stage. Subsequent instructions will not be able to complete or write data until the multiple-cycle instruction completes. Depending on the number of cycles that it takes for the multiple-cycle instruction to execute, the bottleneck can prevent the processor from fetching and processing new instructions, thus reducing the processor throughput.
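The completion bottleneck can be illustrated with a small calculation. The sketch below assumes a simplified model in which any number of instructions may complete in one cycle provided all earlier instructions have completed; the function name `completion_stalls` and the cycle counts are hypothetical.

```python
from typing import List


def completion_stalls(finish_cycle: List[int]) -> List[int]:
    """For each instruction, return the number of cycles it waits between
    finishing execution and being allowed to complete in program order."""
    stalls: List[int] = []
    prev_complete = 0
    for done in finish_cycle:
        # An instruction cannot complete before its predecessor has completed.
        complete = max(done, prev_complete)
        stalls.append(complete - done)
        prev_complete = complete
    return stalls


# A ten-cycle instruction followed by three short instructions.
print(completion_stalls([10, 1, 2, 3]))  # [0, 9, 8, 7]
```

In this example the three short instructions wait 9, 8, and 7 cycles, respectively, behind the multiple-cycle instruction, even though their results are available much earlier.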
In order to avoid the bottleneck problems associated with the completion of instructions in the program order, a distributed mechanism for handling control-flow dependencies has been disclosed in co-pending U.S. patent application Ser. No. 08/377,813, filed Jan. 25, 1995, now abandoned, which is hereby incorporated by reference. Data dependencies, which must also be managed in order to allow instruction results to be written out of order while maintaining the correct machine state of the processor, are addressed in this disclosure.