High-performance processors used in data processing systems today may be capable of "superscalar" operation and may have "pipelined" elements. A superscalar processor has multiple elements that operate in parallel to process multiple instructions in a single processing cycle. Pipelining involves processing instructions in stages, so that the pipelined stages may process a number of instructions concurrently. In a typical first stage, referred to as an "instruction fetch" stage, an instruction is fetched from memory. Then, in a "dispatch" stage, the instruction is decoded into different control bits, which designate a type of functional unit for performing the operation specified by the instruction, the source operands for the operation, and the destination registers for the results of the operation. The decoded instruction is dispatched to an issue queue, where instructions wait for data and an available execution unit. Next, in the "issue" stage, an instruction in the issue queue is issued to a unit having an "execution stage." The execution stage processes the operation as specified by the instruction. Executing an operation specified by an instruction includes accepting one or more operands and producing one or more results.
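The overlap of pipelined stages described above can be illustrated with a minimal sketch. It assumes an idealized pipeline in which one instruction enters per cycle, every stage takes exactly one cycle, and nothing stalls; real processors do not behave this simply. The function names `pipeline_schedule` and `total_cycles` are hypothetical and chosen only for this illustration.

```python
# Stage names follow the description above; the one-cycle-per-stage timing
# is an assumption made only for this sketch.
STAGES = ["fetch", "dispatch", "issue", "execute", "complete"]


def pipeline_schedule(num_instructions, stages=STAGES):
    """Map each cycle to the (instruction index, stage) pairs active in it."""
    schedule = {}
    for i in range(num_instructions):
        for s, stage in enumerate(stages):
            # Instruction i enters the first stage at cycle i and advances
            # one stage per cycle.
            schedule.setdefault(i + s, []).append((i, stage))
    return schedule


def total_cycles(num_instructions, stages=STAGES):
    """Cycles needed to drain the idealized pipeline."""
    if num_instructions == 0:
        return 0
    return num_instructions + len(stages) - 1


if __name__ == "__main__":
    for cycle, active in sorted(pipeline_schedule(3).items()):
        print(cycle, active)
```

In this idealized model, three instructions occupy three different stages during the third cycle, which is the concurrency that pipelining is intended to provide.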
A "completion" stage addresses program order issues that arise from concurrent instruction execution, wherein multiple, concurrently executed instructions may deposit results in a single register. The completion stage also handles issues that arise when instructions dispatched after an interrupted instruction deposit results in the same destination registers. In the completion stage, an instruction waits for the point at which there is no longer a possibility of an interrupt before storing a data value, so that depositing a result will not violate program order. At this point, the instruction is considered "complete." It should be noted that two types of buffers may be provided: buffers that store execution results before the results are deposited into the destination register, and buffers that back up register contents at specified checkpoints so that, in the event of an interrupt, the register content can be reverted to its pre-checkpoint value. Either type of buffer may be utilized in a particular implementation. At completion, the results in the execution buffer and the holding buffer are deposited into the destination register, and the back-up buffer is released.
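The first type of buffer described above, which holds execution results until completion, can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any particular processor: the class name `CompletionBuffer`, its methods, and the use of increasing integer tags to represent dispatch order are all hypothetical.

```python
from collections import deque


class CompletionBuffer:
    """Toy model: results wait in a buffer and reach registers in dispatch order."""

    def __init__(self, num_regs):
        self.regs = [0] * num_regs
        self.order = deque()   # tags of in-flight instructions, oldest first
        self.pending = {}      # tag -> [destination register, result, finished?]

    def dispatch(self, tag, dest_reg):
        # Assumption: tags are integers that increase in dispatch order.
        self.order.append(tag)
        self.pending[tag] = [dest_reg, None, False]

    def finish(self, tag, result):
        # Execution may finish in any order; the result is only buffered here.
        entry = self.pending[tag]
        entry[1] = result
        entry[2] = True

    def complete_ready(self):
        """Deposit results of the oldest finished instructions, in order."""
        completed = []
        while self.order and self.pending[self.order[0]][2]:
            tag = self.order.popleft()
            dest_reg, result, _ = self.pending.pop(tag)
            self.regs[dest_reg] = result
            completed.append(tag)
        return completed

    def interrupt(self, tag):
        """Discard the interrupted instruction and everything dispatched after it."""
        while self.order and self.order[-1] >= tag:
            self.pending.pop(self.order.pop())
```

Because results are deposited only from the head of the queue, a younger instruction that finishes early cannot overwrite a register before an older instruction targeting the same register has completed.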
Many state-of-the-art superscalar central processing units that implement an in-order dispatch, out-of-order execution, and in-order completion microarchitecture employ register renaming schemes to allow instructions that have an output dependence or "anti-dependence" to execute in an order different from the dispatch order. Thus, an instruction that is younger in dispatch order may execute earlier than an older instruction. Additionally, in some circumstances, more useful instructions can be processed per timing cycle.
State-of-the-art register renaming schemes typically implement a double pointer look-up in the register operand read-access path. In this implementation, the register is first accessed to obtain a pointer into a future file. The future file location must then be accessed to retrieve the value. The register operand read-access operation is in a critical timing path that limits the operational speed of the processor. The double pointer look-up lengthens the time required to perform a register operand access, and hence prevents the processor from achieving the highest possible operating frequency. This disadvantage is magnified in processors with a small number of pipeline stages.
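The double pointer look-up can be pictured with a small software model. The sketch below is only an illustration under stated assumptions: the class name `RenamedRegisterFile`, the `accesses` counter, and the simplistic slot allocation (no slot is ever recycled) are hypothetical and do not reflect how any specific processor allocates future file entries.

```python
class RenamedRegisterFile:
    """Toy model of a renaming scheme in which each register holds a pointer."""

    def __init__(self, num_regs, future_file_size):
        # Each architected register initially points at its own slot.
        self.pointer = list(range(num_regs))
        self.future_file = [0] * future_file_size
        self.next_free = num_regs
        self.accesses = 0  # counts storage accesses made by reads

    def write(self, reg, value):
        # Assumption for this sketch: slots are handed out once and never reused.
        if self.next_free >= len(self.future_file):
            raise RuntimeError("future file is full in this simplified model")
        slot = self.next_free
        self.next_free += 1
        self.future_file[slot] = value
        self.pointer[reg] = slot

    def read(self, reg):
        slot = self.pointer[reg]        # first access: fetch the pointer
        self.accesses += 1
        value = self.future_file[slot]  # second access: fetch the value
        self.accesses += 1
        return value
```

Every operand read in this model costs two dependent accesses, which corresponds to the lengthened read-access path described above.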
An alternative approach to the register renaming scheme described above is a history buffer scheme. In the history buffer scheme, the latest modification to an architected register is always kept in the architected register itself, rather than in a future file as the renaming scheme requires. Therefore, the double pointer look-up issue associated with the register renaming scheme is eliminated, and the history buffer scheme is well suited for processors with a small number of pipeline stages.
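A history buffer scheme can be sketched in the same style. The following is a minimal illustration, assuming that the previous register content is saved when a register is overwritten, that saved entries are released at completion, and that integer tags increase in dispatch order. The class name `HistoryBufferRegisterFile` and its methods are hypothetical, and the sketch assumes each result is available at the time of the write.

```python
class HistoryBufferRegisterFile:
    """Toy model: registers hold the latest value; old values go to a history buffer."""

    def __init__(self, num_regs):
        self.regs = [0] * num_regs
        self.history = []  # (tag, register, previous value), oldest first

    def write(self, tag, reg, value):
        # Back up the value being overwritten, then keep the latest value
        # directly in the architected register.
        self.history.append((tag, reg, self.regs[reg]))
        self.regs[reg] = value

    def read(self, reg):
        # A single access: no pointer has to be followed.
        return self.regs[reg]

    def complete(self, tag):
        # The instruction can no longer be interrupted; release its back-ups.
        self.history = [entry for entry in self.history if entry[0] != tag]

    def interrupt(self, tag):
        # Undo the interrupted instruction and all younger ones, youngest first.
        while self.history and self.history[-1][0] >= tag:
            _, reg, previous = self.history.pop()
            self.regs[reg] = previous
```

A read in this model touches only the architected register, which is why the scheme avoids the second look-up, while the history buffer supplies the values needed to restore register state after an interrupt.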
Therefore, a need exists for a data processing system and method that ensure that instructions are executed correctly and efficiently. A need also exists for a method for storing, in a history buffer, result data produced by older dispatched instructions targeting a particular register, as opposed to the latest dispatched instruction, given the possibility that the result data may be produced after an unpredictable length of delay.