In the continuing development of faster and more powerful computer systems, a significant microprocessor innovation has been the reduced instruction set computer (RISC) processor. Advances in the field of RISC processors have led to the development of superscalar processors. Superscalar processors, as their name implies, perform functions not commonly found in traditional scalar microprocessors. Included in these functions is the ability to execute instructions out of order with respect to the program order. Although the instructions execute out of order, the results of the executions appear to have occurred in program order, so that proper data coherency is maintained.
In a superscalar processor, certain instructions may depend on the execution of another instruction by a unit, but not require the same unit for execution. For example, a floating point store instruction often depends on a previous floating point arithmetic instruction to provide the data to be stored.
The store instruction itself does not require the floating point arithmetic unit in order to be executed; it requires only the data produced by the previous floating point arithmetic instruction. The dependency arises because the source register for the store instruction is the same as the target register for the floating point arithmetic instruction. Because the store instruction depends on the arithmetic instruction, the store instruction is held until the arithmetic instruction has been completed. This creates a delay, or bubble, in the floating point execution pipeline. This delay can last multiple cycles.
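The length of such a bubble can be illustrated with a minimal sketch. The instruction names, the register name, the three-cycle latency, and the function `store_stall_cycles` are assumptions chosen for illustration and do not describe any particular processor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    name: str
    target: Optional[str] = None  # register written by the instruction, if any
    source: Optional[str] = None  # register read by the instruction, if any


def store_stall_cycles(arith: Instr, store: Instr, arith_issue_cycle: int,
                       arith_latency: int, store_ready_cycle: int) -> int:
    """Cycles the store is held waiting for the arithmetic result (toy model)."""
    if store.source != arith.target:
        return 0  # no dependency, so the store is not held
    result_cycle = arith_issue_cycle + arith_latency
    return max(0, result_cycle - store_ready_cycle)


# Hypothetical instruction pair: the store reads the register the add writes.
fadd = Instr("fadd", target="f1")
stfd = Instr("stfd", source="f1")
stall = store_stall_cycles(fadd, stfd, arith_issue_cycle=0,
                           arith_latency=3, store_ready_cycle=1)
independent = store_stall_cycles(fadd, Instr("stfd", source="f9"),
                                 arith_issue_cycle=0, arith_latency=3,
                                 store_ready_cycle=1)
```

Under these assumed numbers the dependent store is held for two cycles, while a store reading an unrelated register is not held at all.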
In order to address this problem and increase the speed of floating point operation, some conventional systems forward the store instruction by way of the arithmetic instruction. Typically, this is accomplished by identifying the store instruction with a unique tag. The tag is then appended to the arithmetic instruction on which the store instruction depends. The store instruction is thereby forwarded, or folded, into the arithmetic instruction.
Once the store has been forwarded through the arithmetic instruction, the store instruction is removed from the floating point instruction queue. When execution of the arithmetic instruction is completed, the floating point unit immediately processes the store instruction. Thus, the system writes the result to the floating point register and signals the data cache to access the data for the store. Consequently, the store need not be executed as a separate instruction.
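The tagging and folding steps described above can be sketched as follows. This is a simplified software model rather than a description of actual hardware; the class `FpInstr`, the functions `fold_store` and `complete`, the tag value, and the list standing in for signals to the data cache are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FpInstr:
    op: str   # "arith" or "store"
    reg: str  # target register (arith) or source register (store)
    folded_store_tag: Optional[int] = None  # tag of a store folded into this instruction


def fold_store(queue: List[FpInstr], arith: FpInstr, store: FpInstr, tag: int) -> bool:
    """Append the store's tag to the arithmetic instruction and dequeue the store."""
    if store.op != "store" or arith.op != "arith" or store.reg != arith.reg:
        return False  # the store does not depend on this arithmetic instruction
    arith.folded_store_tag = tag
    queue[:] = [entry for entry in queue if entry is not store]
    return True


def complete(arith: FpInstr, registers: Dict[str, float],
             cache_requests: List[int], result: float) -> None:
    """On completion, write the register and, if a store was folded, signal the cache."""
    registers[arith.reg] = result
    if arith.folded_store_tag is not None:
        cache_requests.append(arith.folded_store_tag)


fadd = FpInstr("arith", "f1")
stfd = FpInstr("store", "f1")
queue = [stfd, FpInstr("arith", "f2")]
registers: Dict[str, float] = {}
cache_requests: List[int] = []

folded = fold_store(queue, fadd, stfd, tag=7)
complete(fadd, registers, cache_requests, result=2.5)
```

In this model the store never executes on its own: its tag travels with the arithmetic instruction, and the single completion step both updates the register and issues the request for the store.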
This conventional method can forward a store instruction. However, conventional systems are only capable of forwarding a store instruction from the bottom (oldest) entry in the floating point instruction queue into the floating point arithmetic instruction that is in the first stage of the execution unit's pipeline. Thus, the floating point arithmetic instruction on which the store instruction depends must also immediately precede the store instruction for forwarding to occur. Where the floating point arithmetic instruction does not immediately precede the store instruction, for example because the floating point arithmetic instruction is already in the second stage of the pipeline, the store instruction will not be forwarded.
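The restriction can be expressed as a simple predicate. The sketch below is illustrative only; the function name `can_forward_conventionally` and the tuple encoding of instructions are assumptions, with index 0 of `queue` standing for the bottom (oldest) entry and index 0 of `pipeline` standing for the first stage.

```python
from typing import List, Optional, Tuple

Instr = Tuple[str, str]  # (operation, register), e.g. ("arith", "f1")


def can_forward_conventionally(queue: List[Instr],
                               pipeline: List[Optional[Instr]]) -> bool:
    """True only if the bottom queue entry is a store whose source register is
    the target of an arithmetic instruction in the first pipeline stage."""
    if not queue or not pipeline:
        return False
    store = queue[0]           # only the bottom (oldest) entry is considered
    first_stage = pipeline[0]  # only the first pipeline stage is considered
    if store[0] != "store" or first_stage is None:
        return False
    return first_stage[0] == "arith" and first_stage[1] == store[1]


# The arithmetic instruction immediately precedes the store: forwarding is possible.
adjacent = can_forward_conventionally([("store", "f1")],
                                      [("arith", "f1"), None])

# The arithmetic instruction has already advanced to the second stage: no forwarding.
separated = can_forward_conventionally([("store", "f1")],
                                       [("arith", "f2"), ("arith", "f1")])
```

The second case shows the lost opportunity: the dependency is the same, but the position of the arithmetic instruction in the pipeline rules out forwarding.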
If the store instruction is forwarded into the floating point arithmetic instruction, the store instruction is removed from the floating point instruction queue. Another instruction can then replace the store. However, the replacement can occur no sooner than the next cycle, which is the cycle in which the store would otherwise have been placed in the first stage of the pipeline. Thus, removal of the store instruction leaves the first stage of the pipeline, that is, the execution stage behind the instruction into which the store instruction was forwarded, empty during that cycle. This creates a delay in the floating point unit. The speed of the floating point unit is thereby reduced.
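The empty stage can be seen in a cycle-by-cycle trace. The following toy model assumes one instruction enters the first stage per cycle and that a folded store simply vacates its slot; the function `first_stage_trace` and the instruction encoding are hypothetical and greatly simplify real queue timing.

```python
from typing import List, Optional, Tuple

Instr = Tuple[str, str]  # (operation, register)


def first_stage_trace(program: List[Instr]) -> List[Optional[Instr]]:
    """Return the occupant of the first pipeline stage in each cycle.

    None marks a cycle in which the first stage is empty."""
    pending = list(program)
    trace: List[Optional[Instr]] = []
    while pending:
        instr = pending.pop(0)
        trace.append(instr)  # instr occupies the first stage this cycle
        nxt = pending[0] if pending else None
        if (instr[0] == "arith" and nxt is not None
                and nxt[0] == "store" and nxt[1] == instr[1]):
            pending.pop(0)      # store folded into instr and removed from the queue
            trace.append(None)  # the slot the store would have used stays empty
    return trace


trace = first_stage_trace([("arith", "f1"), ("store", "f1"), ("arith", "f2")])
```

In the resulting trace the second cycle is empty: the store has been folded away, but the following arithmetic instruction does not reach the first stage until the third cycle.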
Accordingly, what is needed is a system and method for instruction forwarding with an increased probability of forwarding. In addition, the method and system should reduce delays due to removal of the forwarded instruction from the execution unit's instruction queue. The present invention addresses such a need.