(1) Field of the Invention
The present invention relates to a compiler apparatus that translates a source program written in a high-level language into a program written in machine language, and particularly to a compiler apparatus that optimizes a loop.
(2) Description of the Related Art
Recently, application programs present in multimedia devices include many media processes requiring extremely fast processing. A great number of loops are included in such media processing, and these loops require a longer execution time as compared to other processes. As a result, the execution speed of these loops affects the execution speed of the media processing.
Therefore, in the compiler field, many loop optimization techniques exist for improving the execution speed of loops. “Optimization through loop invariant motion” is known as one of these technologies (for example, see Nakata, Ikuo, “Configuration and Optimization of Compilers,” Asakura Shoten, Sep. 15, 1999; pp. 242-243 and 284-287).
Optimization through loop invariant motion is an optimization method in which instructions unrelated to the execution of each loop iteration are moved outside of the loop, thereby reducing the number of instructions within the loop.
For example, consider the case where each iteration of the loop merely repeats the same definition of a register (for example, assigning the same value to the same register), so that the value of that register essentially does not change throughout the entire loop process. In this case, the defining instruction may be moved outside of the loop, so that the defined value is retained throughout the entire loop.
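The motion described above can be sketched on a toy intermediate representation. The following is a minimal illustration only, not the method of any particular compiler: the tuple format `(op, dest, *sources)` and the function name `hoist_constant_moves` are assumptions made for this example. It hoists only constant `mov` instructions, and only when the destination register is defined exactly once in the loop body and is not referred to before that definition.

```python
def hoist_constant_moves(loop_body):
    """Split a loop body into (preheader, new_body).

    Each instruction is a tuple (op, dest, *sources); this format is a
    hypothetical toy IR. Integer sources are constants, string sources
    are virtual registers.
    """
    # Count how many times each register is defined inside the loop.
    def_counts = {}
    for _op, dest, *_srcs in loop_body:
        def_counts[dest] = def_counts.get(dest, 0) + 1

    preheader, new_body = [], []
    seen_uses = set()  # registers referred to earlier in the loop body
    for instr in loop_body:
        op, dest, *srcs = instr
        is_const_mov = (op == "mov" and len(srcs) == 1
                        and isinstance(srcs[0], int))
        # Hoist only if this is the sole definition of dest in the loop
        # and no earlier instruction in the body reads dest.
        if is_const_mov and def_counts[dest] == 1 and dest not in seen_uses:
            preheader.append(instr)
        else:
            new_body.append(instr)
        seen_uses.update(s for s in srcs if isinstance(s, str))
    return preheader, new_body


loop_body = [
    ("mov", "vr20", 5000),         # same constant assigned every iteration
    ("add", "vr1", "vr0", "vr20"),
]
preheader, new_body = hoist_constant_moves(loop_body)
print(preheader)  # instructions to execute once, before the loop
print(new_body)   # instructions remaining inside the loop
```

After the motion, the loop body contains one fewer instruction, but the hoisted register must keep its value for as long as the loop runs.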
However, in order to retain a value defined by an instruction located outside of the loop throughout the entire loop process, a register must be occupied throughout the entire loop. For example, consider an intermediate program 10 such as that shown in FIG. 1 (a). Here, an instruction 10a (mov vr20, 5000) is an instruction that assigns a constant 5000 to a virtual register vr20, and an instruction 10b (add vr1, vr0, vr20) is an instruction that adds a value held in a virtual register vr0 to a value held in the virtual register vr20 and assigns the result to a virtual register vr1.
In other words, in the intermediate program 10, the virtual register vr20 is defined in the instruction 10a and referred to in the instruction 10b, so the virtual register vr20 must hold the value 5000 throughout the entire loop. Accordingly, when a value defined outside of the loop is only referred to within the loop, a register that holds that value throughout the entire loop is necessary.
FIG. 1 (b) shows an intermediate program 12 generated as a result of performing instruction scheduling on the intermediate program 10 shown in FIG. 1 (a), in the case where the target processor is a processor that can execute three instructions in parallel. FIG. 1 (b) also shows a live range chart 14, which shows the live ranges of the virtual registers used in the intermediate program 12. Here, “live range” indicates the range from where a virtual register is defined to where that virtual register is last referred to. As shown in the live range chart 14, the virtual register vr20 is live throughout the entire loop. From the live range chart 14, it can be seen that the maximum number of registers required by the intermediate program 12 is six.
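The way a live range chart yields a register count can be sketched as follows. This is an illustration under stated assumptions and does not reproduce FIG. 1 (b): the three-cycle schedule is hypothetical, as are the function names `live_ranges` and `max_pressure` and the tuple format `(op, dest, *sources)`. A register read before any definition is assumed live from cycle 0, a register listed in `live_through` is assumed live for the whole loop, both ends of a live range are counted as occupied, and redefinitions are ignored for simplicity.

```python
def live_ranges(cycles, live_through=()):
    """Return {register: (first_cycle, last_cycle)} for one loop body.

    cycles is a list of bundles; each bundle is a list of instructions
    issued in the same cycle, written as (op, dest, *sources).
    """
    last = len(cycles) - 1
    ranges = {reg: (0, last) for reg in live_through}
    for t, bundle in enumerate(cycles):
        # Reads in a cycle are handled before writes in that cycle.
        for _op, _dest, *srcs in bundle:
            for s in srcs:
                if isinstance(s, str):
                    start, end = ranges.get(s, (0, t))  # live-in if undefined
                    ranges[s] = (start, max(end, t))
        for _op, dest, *_srcs in bundle:
            if dest is not None and dest not in ranges:
                ranges[dest] = (t, t)
    return ranges


def max_pressure(cycles, live_through=()):
    """Largest number of registers live in any single cycle."""
    ranges = live_ranges(cycles, live_through)
    return max(
        sum(1 for start, end in ranges.values() if start <= t <= end)
        for t in range(len(cycles))
    )


# Hypothetical schedule; vr20 holds a value defined outside the loop.
cycles = [
    [("add", "vr1", "vr0", "vr20")],
    [("mul", "vr2", "vr1", "vr1")],
    [("sub", "vr3", "vr2", "vr1")],
]
print(live_ranges(cycles, live_through=("vr20",)))
print(max_pressure(cycles, live_through=("vr20",)))
```

In this sketch, marking vr20 as live throughout the loop raises the peak from three registers to four, because vr20 stays occupied in the cycles after its only reference.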
When there are multiple dependencies between definitions and references, such as the dependency between the instruction 10a and the instruction 10b, more registers are occupied throughout the entire loop, reducing the number of registers that can be used within the loop. Therefore, there is an increased chance that register spilling (temporarily transferring computational results to a memory due to insufficient registers) will occur. Even if register spilling does not occur, when a backup register is used within a certain function, code must be generated for saving/returning the values held by the backup register to/from a stack at the beginning and end of that function.
In other words, performing loop invariant motion makes it more likely that register spilling will occur within the loop, or that code for saving/returning values held in a backup register to/from a stack will be executed. Accordingly, there is a problem in that the same loop invariant motion that is meant to improve the performance of the loop may instead reduce performance, by causing register spilling or the execution of save/return code.