Software engineers typically develop computer software in a high-level programming language, such as the C language. Source code written in the high-level language is converted into object code by a compiler. The object code is code that is executable by a processor, such as a Central Processing Unit (CPU). Some compilers perform a so-called optimization process so as to generate object code having high execution efficiency (for example, short execution time and low memory usage). The optimization process includes combining two or more basic instructions, such as instructions for addition, subtraction, multiplication, division, load, and store, into one equivalent instruction, so as to reduce the number of instructions in the object code.
Some processors are able to execute Single Instruction Multiple Data (SIMD) instructions. When receiving a SIMD instruction, a processor performs the same type of operation on different data in parallel. For example, assume that data A1 and data A2 are stored in a SIMD register s1, and data B1 and data B2 are stored in a SIMD register s2. When receiving a SIMD instruction for s1+s2, the processor performs two additions, A1+B1 and A2+B2, in parallel. When generating object code to be executed by such a processor, a compiler may perform an optimization process by converting two or more instructions that specify the same operation type and are executable in parallel into one SIMD instruction.
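The s1+s2 example can be sketched in portable C as follows. This is only an illustrative model under stated assumptions: the type simd_reg, the two-lane width LANES, and the function names are hypothetical and do not correspond to any particular instruction set. On real hardware the loop in simd_add would be a single instruction.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 2 /* assumed width: two data elements per SIMD register */

/* Hypothetical model of a SIMD register holding LANES values. */
typedef struct { double lane[LANES]; } simd_reg;

/* Before optimization: two independent scalar add instructions. */
void add_scalar(const double a[LANES], const double b[LANES], double out[LANES]) {
    out[0] = a[0] + b[0]; /* A1 + B1 */
    out[1] = a[1] + b[1]; /* A2 + B2 */
}

/* After optimization: one call models the single SIMD instruction s1+s2. */
simd_reg simd_add(simd_reg s1, simd_reg s2) {
    simd_reg r;
    for (size_t i = 0; i < LANES; i++) {
        r.lane[i] = s1.lane[i] + s2.lane[i];
    }
    return r;
}

/* Usage example: the SIMD form yields the same results as the scalar form. */
void example_simd_add(void) {
    simd_reg s1 = {{1.0, 2.0}};   /* A1, A2 */
    simd_reg s2 = {{10.0, 20.0}}; /* B1, B2 */
    double scalar[LANES];
    add_scalar(s1.lane, s2.lane, scalar);
    simd_reg r = simd_add(s1, s2);
    assert(r.lane[0] == scalar[0]);
    assert(r.lane[1] == scalar[1]);
}
```

The point of the conversion is that the two scalar additions have the same operation type and do not depend on each other, so they may share one instruction.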
Further, some processors are able to execute Fused Multiply and Add, or Floating point Multiply and Add, (FMA) instructions. Assume now that there are data A, B, and C. When receiving an FMA instruction, a processor performs a multiplication and an addition, A×B+C. When generating object code to be executed by such a processor, a compiler may perform an optimization process by combining an instruction for a multiplication and an instruction for an addition that uses the result of the multiplication into one FMA instruction. Still further, some processors are able to execute SIMD-FMA instructions, which are a combination of SIMD and FMA. For example, assume that data A1 and data A2 are stored in a SIMD register s1, data B1 and data B2 are stored in a SIMD register s2, and data C1 and data C2 are stored in a SIMD register s3. When receiving a SIMD-FMA instruction for s1×s2+s3, the processor performs two operations, A1×B1+C1 and A2×B2+C2, in parallel.
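The FMA and SIMD-FMA operations described above may be modeled as in the following sketch. The names fma_model, simd_reg, and simd_fma are hypothetical, and the two-lane width is an assumption. The model computes A×B+C with ordinary arithmetic; a hardware FMA instruction (or the C99 library function fma in math.h) rounds the result only once, so its last bit may differ from the model for values that are not exactly representable.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 2 /* assumed width: two data elements per SIMD register */

/* Hypothetical model of a SIMD register holding LANES values. */
typedef struct { double lane[LANES]; } simd_reg;

/* Models one FMA instruction: a multiplication followed by an addition
 * that uses the product, expressed as a single operation A*B+C. */
double fma_model(double a, double b, double c) {
    return a * b + c;
}

/* Models one SIMD-FMA instruction s1*s2+s3: an FMA in every lane. */
simd_reg simd_fma(simd_reg s1, simd_reg s2, simd_reg s3) {
    simd_reg r;
    for (size_t i = 0; i < LANES; i++) {
        r.lane[i] = fma_model(s1.lane[i], s2.lane[i], s3.lane[i]);
    }
    return r;
}

/* Usage example with small, exactly representable values. */
void example_simd_fma(void) {
    simd_reg s1 = {{2.0, 3.0}}; /* A1, A2 */
    simd_reg s2 = {{4.0, 5.0}}; /* B1, B2 */
    simd_reg s3 = {{1.0, 1.0}}; /* C1, C2 */
    simd_reg r = simd_fma(s1, s2, s3);
    assert(r.lane[0] == 9.0);  /* A1*B1+C1 = 2*4+1 */
    assert(r.lane[1] == 16.0); /* A2*B2+C2 = 3*5+1 */
}
```

In this model, four basic instructions (two multiplications and two additions) are replaced by one SIMD-FMA instruction.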
To perform such an optimization process, a computer system has been proposed that uses a trace dependency tree representing dependency relations among a plurality of instructions. This computer system searches the trace dependency tree for two or more instructions that specify the same operation type and belong to the same level, and converts the instructions found into one SIMD instruction.
See, for example, International Publication Pamphlet No. WO 2006/007193.
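The idea of searching for instructions of the same operation type at the same level can be illustrated by the following minimal sketch. It is not the method of the cited publication; it is a simple greedy pairing written for illustration, and the types op_kind and instr, the field names, and the function pair_same_level are all hypothetical. The sketch assumes that the level of each instruction in the dependency tree has already been computed and that a SIMD instruction holds two operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation types of basic instructions. */
typedef enum { OP_ADD, OP_MUL, OP_LOAD } op_kind;

/* Hypothetical record for one instruction in a dependency tree. */
typedef struct {
    op_kind op;      /* operation type */
    int level;       /* depth of the instruction in the dependency tree */
    int packed_with; /* index of the paired instruction, or -1 if unpaired */
} instr;

/* Greedily pairs instructions that have the same operation type and the
 * same level; each pair stands for one two-lane SIMD instruction.
 * Returns the number of pairs formed. */
size_t pair_same_level(instr *ins, size_t n) {
    assert(ins != NULL || n == 0);
    size_t pairs = 0;
    for (size_t i = 0; i < n; i++) {
        ins[i].packed_with = -1;
    }
    for (size_t i = 0; i < n; i++) {
        if (ins[i].packed_with != -1) continue; /* already paired */
        for (size_t j = i + 1; j < n; j++) {
            if (ins[j].packed_with == -1 &&
                ins[j].op == ins[i].op &&
                ins[j].level == ins[i].level) {
                ins[i].packed_with = (int)j;
                ins[j].packed_with = (int)i;
                pairs++;
                break;
            }
        }
    }
    return pairs;
}
```

For example, given three additions and one multiplication at level 1 and one multiplication at level 2, the function forms one pair of additions and leaves the remaining instructions unpaired. A greedy choice takes the first match it finds, which is not necessarily the pairing with the highest execution efficiency.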
A dependency tree that represents dependency relations among the instructions included in code prior to optimization may be a large-scale tree that includes a variety of basic instructions for addition, subtraction, multiplication, division, load, store, and the like. Searching such a dependency tree for combinations of two or more instructions that are convertible into another kind of instruction, such as a SIMD instruction, may need a large amount of computation. Therefore, it may take a long time to perform an optimization process. For example, in the case where a dependency tree has many instructions that specify the same operation type at the same level, there are many candidate combinations of instructions to be converted into SIMD instructions, and therefore a large amount of computation is needed to find a conversion pattern that achieves high execution efficiency.
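The growth in the number of candidates can be made concrete with a small calculation. The sketch below assumes two-lane SIMD instructions and n mutually independent instructions of the same operation type at the same level, and it ignores operand order within a pair; the function names are hypothetical. Under these assumptions there are n(n-1)/2 candidate pairs, and the number of ways to divide all n instructions into pairs is (n-1)×(n-3)×...×3×1 for even n.

```c
#include <assert.h>

/* Number of distinct pairs that can be chosen from n instructions. */
unsigned long long candidate_pairs(unsigned n) {
    return (unsigned long long)n * (n - 1) / 2;
}

/* Number of ways to divide n instructions (n even) completely into pairs:
 * (n-1) * (n-3) * ... * 3 * 1. */
unsigned long long full_pairings(unsigned n) {
    assert(n % 2 == 0);
    unsigned long long ways = 1;
    for (unsigned k = n; k > 1; k -= 2) {
        ways *= (k - 1); /* the first unpaired instruction has k-1 partners */
    }
    return ways;
}
```

With 4 such instructions there are 6 candidate pairs and 3 complete pairings; with 10 there are 45 candidate pairs and 945 complete pairings. An exhaustive comparison of conversion patterns therefore becomes expensive quickly as the number of same-level instructions grows.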