1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and apparatus for processing data. Still more particularly, the present invention relates to a method, apparatus, and computer instructions for optimizing execution of instructions.
2. Description of Related Art
Modern computer processors are often able to process many instructions simultaneously, a property known as being superscalar. One way in which processors achieve this property is by pipelining the execution of instructions. In this process, machine instructions are processed in a series of stages, each of which performs some part of the processing, much like an assembly line. The effect of pipelining is that successive instructions can be started down the pipeline before previous instructions are completed.
However, many modern computer processors are not able to pipeline very expensive machine instructions that require more complex circuitry. On many processors, these expensive instructions typically are handled as special cases, which tie up machine resources for many cycles without allowing other instructions to be processed.
In most cases, the mathematical functions performed by these unpipelined instructions can be calculated or approximated using an expanded sequence of simple, pipelined mathematical instructions. For example, the floating point square root instruction can be calculated using the Newton iteration method, which can commonly be implemented with simpler pipelined floating point operations. Other examples of commonly unpipelined hardware instructions that have pipelined replacement sequences are floating point divide, floating point reciprocal square root, and floating point sine.
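As an illustration of such an expanded sequence, the following minimal sketch computes a square root using only multiplications and subtractions in its inner loop, by applying Newton iteration to the reciprocal square root. The function name `newton_sqrt`, the constant seed of 1.0, and the fixed count of eight iterations are assumptions made for this sketch; a real expansion would normally start from a hardware estimate and need far fewer steps. Only finite, non-negative inputs are handled.

```python
import math


def newton_sqrt(x: float, iterations: int = 8) -> float:
    """Approximate sqrt(x) for finite x >= 0 using multiply/subtract steps only."""
    if x < 0.0:
        raise ValueError("x must be non-negative")
    if x == 0.0:
        return 0.0

    # Range reduction: x = m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(x)
    if e % 2:
        # Make the exponent even so that it can be halved exactly;
        # m then lies in [0.5, 2).
        m *= 2.0
        e -= 1

    # Crude seed for 1/sqrt(m) (assumption: hardware would supply an estimate).
    y = 1.0
    for _ in range(iterations):
        # Newton step for f(y) = 1/y**2 - m: no divide is needed.
        y = y * (1.5 - 0.5 * m * y * y)

    # sqrt(x) = (m * (1/sqrt(m))) * 2**(e/2)
    return math.ldexp(m * y, e // 2)
```

Each loop iteration consists of a handful of multiplies and one subtract, which are the kind of simple operations that a pipelined floating point unit can overlap with other work.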
In most cases, the unpipelined instruction will have a shorter latency to dependent instructions than an expanded sequence of pipelined instructions. If this were not the case, then the unpipelined instruction would not provide any value, since the expanded sequence of instructions would always be an improvement. Thus, the unpipelined instruction is a good choice when no other instructions can be executed in parallel. However, in cases where other operations can be executed in parallel with the operation, it is profitable to expand the unpipelined instruction into the expanded sequence of pipelined instructions.
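The tradeoff described above can be sketched with a toy cycle count for a single-issue machine. This is a simplified model for illustration only, not a description of any particular processor or compiler heuristic; the function names and the cycle figures in the usage note are hypothetical.

```python
def cycles_unpipelined(latency: int, independent_work: int) -> int:
    """Total cycles when the unpipelined instruction blocks all other work."""
    # Assumption: nothing else issues until the instruction completes.
    return latency + independent_work


def cycles_expanded(latency: int, instruction_count: int,
                    independent_work: int) -> int:
    """Total cycles when the operation is expanded into pipelined instructions."""
    # Assumption: independent instructions fill the stall slots of the
    # dependent chain, so the total is bounded below by both the chain
    # latency and the number of issue slots consumed.
    return max(latency, instruction_count + independent_work)


def expansion_is_profitable(unpipelined_latency: int, expanded_latency: int,
                            expanded_count: int, independent_work: int) -> bool:
    """Return True if the expanded sequence finishes sooner in this toy model."""
    return (cycles_expanded(expanded_latency, expanded_count, independent_work)
            < cycles_unpipelined(unpipelined_latency, independent_work))
```

With hypothetical figures of 30 cycles for the unpipelined instruction and an expanded sequence of 12 instructions with a 40-cycle latency, the model rejects expansion when there is no independent work (40 cycles versus 30) and favors it when 50 cycles of independent work are available (62 cycles versus 80).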
If unpipelined instructions are expanded everywhere in a program by an optimizing compiler, then it will benefit the program execution performance in cases where there was code to execute in parallel with the operation, and it will harm the performance in cases where there was no code to execute in parallel. The fundamental difficulty in generating the pipelined expanded sequences in an optimizing compiler is determining when the expansion of unpipelined instructions is profitable.
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for determining the profitability of expanding unpipelined instructions in code.