High performance processors currently used in data processing systems may be capable of superscalar operation and may have pipelined elements. A superscalar processor has multiple elements which operate in parallel to process multiple instructions in a single processing cycle. Pipelining involves processing instructions in stages, so that the pipelined stages may process a number of instructions concurrently.
In a typical first stage, referred to as an instruction fetch stage, an instruction is fetched from memory. Then, in a decode stage, the instruction is decoded into different control bits, which in general designate i) a type of functional unit for performing the operation specified by the instruction, ii) source operands for the operation and iii) destinations for results of operations. In a dispatch stage, the decoded instruction is dispatched per the control bits to a unit having an execution stage. This stage processes the operation as specified by the instruction. Executing an operation specified by an instruction includes accepting one or more operands and producing one or more results.
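The decode and dispatch stages described above can be illustrated with a minimal sketch. The textual instruction format, the opcode table, and the names `UNIT_FOR_OPCODE`, `Decoded`, `decode` and `dispatch` are hypothetical, chosen only for illustration; a real processor decodes binary encodings in hardware.

```python
from dataclasses import dataclass

# Hypothetical mapping from opcode to the type of functional unit.
UNIT_FOR_OPCODE = {
    "add": "fixed_point",
    "fadd": "floating_point",
    "lw": "load_store",
}


@dataclass(frozen=True)
class Decoded:
    """Control information produced by the decode stage."""
    unit: str        # i) type of functional unit
    sources: tuple   # ii) source operands
    dest: str        # iii) destination for the result


def decode(text):
    """Decode an instruction written as 'opcode dest, src1, src2'."""
    opcode, operands = text.split(maxsplit=1)
    dest, *sources = [o.strip() for o in operands.split(",")]
    return Decoded(UNIT_FOR_OPCODE[opcode], tuple(sources), dest)


def dispatch(decoded, queues):
    """Route a decoded instruction to the queue of its functional unit."""
    queues[decoded.unit].append(decoded)
```

For example, decoding `"add r3, r1, r2"` in this sketch yields a fixed-point operation with sources `r1` and `r2` and destination `r3`, which `dispatch` then places on the fixed-point queue.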
While instructions may originally be prepared for processing in some programmed, logical sequence, it should be understood that for concurrent processing of multiple instructions, the instructions may be processed, in some respects, in a different sequence. However, since instructions are not totally independent of one another, complications arise. That is, the processing of one instruction may depend on a result from another instruction. For example, the processing of an instruction which follows a branch instruction will depend on the branch path chosen by the branch instruction. In another example, the processing of an instruction which reads the contents of some memory element in the processing system may depend on the result of some preceding instruction which writes to that memory element.
As these examples suggest, if one instruction is dependent on a first instruction and the instructions are to be processed concurrently, or the dependent instruction is to be processed before the first instruction, an assumption must be made regarding the result produced by the first instruction. The state of the processor, as defined at least in part by the content of registers the processor uses for execution of instructions, may change from cycle to cycle. If an assumption used for processing an instruction proves to be incorrect then, of course, the result produced by the processing of the instruction will almost certainly be incorrect, and the processor state must recover to a state with known correct results up to the instruction for which the assumption is made. (Herein, an instruction for which an assumption has been made is referred to as an "interruptible instruction", and the determination that an assumption is incorrect, triggering the need for the processor state to recover to a prior state, is referred to as an "interruption" or an "interrupt point".) In addition to incorrect assumptions, there are other causes of such interruptions requiring recovery of the processor state. Such an interruption is generally caused by an unusual condition arising in connection with instruction execution, by an error, or by a signal external to the processor.
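The recovery requirement described above can be illustrated with a minimal sketch. It uses register checkpointing, which is only one possible recovery mechanism and is not asserted to be the mechanism of any particular processor. The class `SpeculativeState` and its method names are hypothetical.

```python
class SpeculativeState:
    """Register state with a checkpoint per interruptible instruction."""

    def __init__(self, regs):
        self.regs = dict(regs)
        self.checkpoints = {}  # tag -> saved registers, oldest first

    def mark_interruptible(self, tag):
        """Save the state as it stands before the assumption is made."""
        self.checkpoints[tag] = dict(self.regs)

    def write(self, reg, value):
        """Deposit a (possibly speculative) result in a register."""
        self.regs[reg] = value

    def resolve(self, tag, assumption_correct):
        """Resolve an assumption; on an interruption, recover the state."""
        tags = list(self.checkpoints)
        position = tags.index(tag)
        saved = self.checkpoints.pop(tag)
        if not assumption_correct:
            # Interruption: restore the known-correct state and discard
            # checkpoints taken by younger interruptible instructions.
            self.regs = saved
            for younger in tags[position + 1:]:
                del self.checkpoints[younger]
```

In this sketch, results written after `mark_interruptible` survive only if `resolve` is later called with `assumption_correct=True`; otherwise the registers return to their checkpointed contents.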
A "completion" stage deals with program order issues that arise from concurrent execution, wherein multiple, concurrently executed instructions may deposit results in a single register. It also handles issues arising from instructions subsequent to an interrupted instruction depositing results in their destination registers. In connection with the prior art completion stage, information about instructions is typically kept in dispatch order in a completion buffer (also known as a reorder buffer) so that the instructions can be reordered in dispatch order after having been executed out of order. Also, the results of instruction execution are temporarily held in one or more backup buffers (also known as rename buffers). Once it is known that such execution results are to be committed, they are written back to architected registers in accordance with the sequence of instructions in the completion buffer. After execution results corresponding to completion buffer entries have been written back to architected registers from rename buffers, the completion and rename buffer entries are released so that they can be filled with information for subsequently dispatched instructions.
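The interaction of the completion buffer and rename buffers described above can be illustrated with a minimal sketch. The entry layout, the class `CompletionBuffer` and its method names are hypothetical, and the result held in each entry stands in for a separate rename buffer.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One completion buffer entry with its rename buffer contents."""
    tag: int                      # position in dispatch order
    dest: str                     # architected destination register
    result: Optional[int] = None  # held until written back
    executed: bool = False


class CompletionBuffer:
    def __init__(self, size):
        self.size = size
        self.entries = deque()  # kept in dispatch order
        self.arch_regs = {}     # architected registers
        self.next_tag = 0

    def dispatch(self, dest):
        """Allocate an entry; return its tag, or None if the buffer is full."""
        if len(self.entries) >= self.size:
            return None
        entry = Entry(self.next_tag, dest)
        self.next_tag += 1
        self.entries.append(entry)
        return entry.tag

    def finish(self, tag, value):
        """Record an execution result, possibly out of dispatch order."""
        for entry in self.entries:
            if entry.tag == tag:
                entry.result = value
                entry.executed = True
                return
        raise KeyError(tag)

    def complete(self):
        """Write back and release executed entries from the oldest onward."""
        released = []
        while self.entries and self.entries[0].executed:
            entry = self.entries.popleft()
            self.arch_regs[entry.dest] = entry.result
            released.append(entry.tag)
        return released
```

In this sketch, a younger instruction that has finished executing still occupies its entry until every older entry has executed, and `dispatch` returns `None` while the buffer remains full.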
Certain timing aspects of the above described dispatching, completion and writeback are shown in FIG. 1 for two exemplary instructions. In the example, the two instructions are fetched, decoded and dispatched in corresponding cycles. It is assumed in the example that the second instruction is not dependent on the first instruction for its source operand, and that the second instruction executes more quickly than the first. Nevertheless, completion of the second instruction requires execution and completion of the first. So, FIG. 1 shows completion and writing back of the second instruction being held up until execution and completion of the first. Instruction 2 executes in cycle 3, but a number of cycles are spent before the first instruction is executed. This could be due to waiting for data access, branch resolution, etc. Then, in cycle 6 of the example, instruction 1 completes. In response, instruction 2 completes. In the example, instruction 2 is shown completing in cycle 7, although it is understood that it might be possible, in a machine that has the capability to complete numerous instructions in a single cycle, for instruction 2 to complete in cycle 6 in response to the completion in that same cycle of instruction 1. The instructions are shown writing back results of execution in cycles 7 and 8 respectively.
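The timing of the example can be reproduced with a minimal sketch. The sketch assumes that instruction 1 finishes execution in cycle 5, that an instruction can complete no earlier than the cycle after it finishes execution, and that writeback occurs in the cycle after completion. These assumptions are consistent with the cycles stated above but are not taken from FIG. 1 itself. The function name `completion_schedule` and its parameters are hypothetical.

```python
def completion_schedule(exec_done, same_cycle_completion=False):
    """Return (completion cycle, writeback cycle) per instruction.

    exec_done lists, in dispatch order, the cycle in which each
    instruction finishes execution. Completion is in dispatch order.
    """
    schedule = []
    prev_complete = 0
    for done in exec_done:
        earliest = done + 1  # assumed: complete the cycle after execution
        if same_cycle_completion:
            in_order = prev_complete      # may share the older one's cycle
        else:
            in_order = prev_complete + 1  # one completion per cycle
        complete = max(earliest, in_order)
        schedule.append((complete, complete + 1))
        prev_complete = complete
    return schedule
```

Under these assumptions, `completion_schedule([5, 3])` gives completion in cycles 6 and 7 and writeback in cycles 7 and 8, matching the example. With `same_cycle_completion=True`, instruction 2 completes in cycle 6 together with instruction 1.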
As the example illustrates, the existing method and apparatus for dispatching, completing and writing back instructions may delay release of completion and rename buffer entries for a second instruction, due to waiting for execution and completion of a preceding instruction. This may, in turn, delay dispatching of subsequent instructions. Therefore, a need exists for improving the method and apparatus for processing instructions.