Technical Field
The present invention relates generally to information processing and, in particular, to low latency execution of floating-point record form instructions.
Description of the Related Art
Floating-point record form instructions are slow compared to regular floating-point instructions because, in addition to the regular computation, exception bits from all previous instructions must be recorded in a condition register.
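The dependence on previous instructions can be illustrated with a minimal Python sketch. It is a toy model and not the processor's implementation: it assumes that exception flags are sticky and that a record form instruction copies four summary bits (FX, FEX, VX, OX) into a condition register field, as in the Power ISA. The names StickyFlags and run are hypothetical, and the enabled-exception bit FEX is simply left clear on the assumption that all exception enables are off.

```python
from dataclasses import dataclass


@dataclass
class StickyFlags:
    """Toy model of the sticky FP exception summary bits (assumed layout)."""
    fx: bool = False   # any-exception summary, sticky
    fex: bool = False  # enabled-exception summary (enables assumed off here)
    vx: bool = False   # invalid-operation summary, sticky
    ox: bool = False   # overflow, sticky

    def accumulate(self, invalid=False, overflow=False):
        # Flags only ever go from clear to set; an instruction never clears them.
        if invalid or overflow:
            self.fx = True
        self.vx = self.vx or invalid
        self.ox = self.ox or overflow

    def cr_field(self):
        # Snapshot that a record form instruction would write to the CR field.
        return (self.fx, self.fex, self.vx, self.ox)


def run(program):
    """program: list of (is_record_form, invalid, overflow) in program order.

    Returns one CR-field snapshot per record form instruction.
    """
    flags = StickyFlags()
    snapshots = []
    for is_record_form, invalid, overflow in program:
        flags.accumulate(invalid, overflow)
        if is_record_form:
            snapshots.append(flags.cr_field())
    return snapshots
```

In this sketch, a record form instruction that raises no exception of its own still reports an overflow raised by an older instruction, which is why the condition register value cannot be finalized before all older instructions are accounted for.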
In today's processors (including, for example, but not limited to any of the POWER6/POWER7®/POWER8® processors), record form floating-point (FP) instructions are not issued until all older instructions have completed. The drawback of that approach is the late availability of the result register, which is likely to hold off the execution of subsequent dependent instructions.
Moreover, it is to be noted that there are conflicting requirements for record form FP instructions. For example, for optimum performance, the result in the target register FRT is needed as soon as possible, and is thus best obtained by using out-of-order execution. In contrast, for a correct Condition Register (CR) value, all older instructions must be waited for, and record form FP instructions are thus often executed in-order.
To that end, we note that the following processors have in-order execution capability: POWER6; POWER7®; and POWER8®. These processors have a wait capability that can be applied to the execution of an instruction such that the instruction is not executed until all older instructions have completed. Accordingly, the result is available late, leading to record form instruction processing being slow in such processors.
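The cost of waiting can be made concrete with a small Python sketch. It is a deliberately simplified timing model under stated assumptions, not a description of any actual POWER pipeline: the function name result_ready_cycle, its parameters, and the cycle counts used below are all hypothetical and chosen only for illustration.

```python
def result_ready_cycle(older_completion_cycles, operands_ready_cycle,
                       latency, wait_for_older):
    """Return the cycle at which the result register becomes available.

    older_completion_cycles: completion cycles of all older instructions
    operands_ready_cycle:    cycle at which the source operands are ready
    latency:                 execution latency of the instruction in cycles
    wait_for_older:          True models the wait capability described above
    """
    issue_cycle = operands_ready_cycle
    if wait_for_older and older_completion_cycles:
        # Issue is held until the slowest older instruction has completed.
        issue_cycle = max(issue_cycle, max(older_completion_cycles))
    return issue_cycle + latency
```

With hypothetical older instructions completing at cycles 10 and 40, operands ready at cycle 5, and a latency of 6 cycles, the model gives cycle 46 when waiting and cycle 11 when issuing as soon as the operands are ready. Any dependent instruction inherits that delay.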
As another approach for processing record form instructions, a compiler could avoid emitting record form instructions and instead use a special instruction (mcrfs) in their place. However, such an approach is not without deficiencies. For example, such an approach can unnecessarily hold off execution of younger operations. Also, only programs that can be recompiled will benefit, whereas existing code will still be slow.
Hence, there is a need for a solution for executing record form FP instructions that allows fast execution while still maintaining a correct CR result.