1. Field of the Invention
The present invention relates to a code conversion method and apparatus for converting the content of assembly language code with which a computer system performs given instructions.
This application claims the priority of the Japanese Patent Application No. 2002-167446 filed on Jun. 7, 2002, the entirety of which is incorporated by reference herein.
2. Description of the Related Art
For a processor such as a CPU or DSP to execute a source program written in a high-level programming language such as the C language or the like, the code of the source program has to be converted into an executable form that enables the processor to perform the instructions specified by the source program.
FIG. 1 shows a flow of operations made in the conventional conversion of a source program written in a high-level programming language into an executable form that enables a processor to perform the instructions specified by the source program.
In the conventional code conversion method, a source program written in a high-level programming language such as the C language or the like is first converted by a compiler into an assembly language code that can be understood by an object processor (in step S50-1). In the assembly language code, each line states one instruction whose operand is a program code directly representing an address of a data register in the processor hardware. Next, the assembly language code is converted by an assembler into a machine language that can be understood by the object processor (in step S50-2). Finally, a plurality of programs stated in the machine language are combined together and converted into a code executable by the object processor (in step S50-3).
Note that for many of the recent processors such as CPUs, DSPs and the like, the instruction set includes both instructions taking a fixed-point value as an operand and instructions taking a floating-point value as an operand. A floating-point instruction is normally performed by a floating-point unit in the processor. The floating-point unit performs such instructions by referencing operands stored in floating-point registers, which are provided separately from the fixed-point registers, and assigns the resulting floating-point value to a floating-point register.
The floating-point format is defined in the IEEE 754 “Floating-point Standard”. As defined in IEEE 754, the floating-point formats include a single-precision format and a double-precision one. The floating-point unit can normally perform a single-precision operation, in which a single-precision floating-point value is used as an operand, and a double-precision operation, in which a double-precision floating-point value is used as an operand. In the floating-point register file, each address is set in units of a single-precision floating-point word. Therefore, in a single-precision operation, the value stored in one floating-point register is manipulated as a one-word operand, while in a double-precision operation, the values stored in two floating-point registers are combined and manipulated as a one-word operand.
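The size relation described above can be checked with a short sketch (Python is used here purely for illustration and is not part of the described system): an IEEE 754 single-precision value occupies one 32-bit word, and a double-precision value occupies two such words.

```python
import struct

# An IEEE 754 single-precision value is one 32-bit word; a
# double-precision value is 64 bits, i.e. two such words, which is why
# a double-precision operand occupies two floating-point registers.
assert struct.calcsize(">f") == 4   # single precision: 4 bytes
assert struct.calcsize(">d") == 8   # double precision: 8 bytes

# Forcing a double-precision value through the single-precision format
# discards the extra mantissa bits:
x = 0.1
x_single = struct.unpack(">f", struct.pack(">f", x))[0]
print(x == x_single)  # False: single precision cannot hold 0.1 exactly
```

This also shows why the two precisions cannot simply share one register slot: the double-precision value does not fit in a single 32-bit register word.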
For the floating-point unit in a data processor, there is normally provided an arithmetic operation library capable of performing arithmetic operation instructions and also of calculating a trigonometric function, a logarithmic function, etc. Generally, the double-precision operation is used when the arithmetic operation library is used.
Note here that although the data in two registers are combined together and manipulated as one one-word operand of a double-precision floating-point instruction, only the address of one of the registers is assigned as the operand. Therefore, when the compiler compiles a double-precision floating-point instruction, it will assign only an even-number register address to an operand of the double-precision floating-point instruction, with the odd-number register address always being left unused. By assigning the registers in this way, two register areas are assigned to a double-precision floating-point operand.
FIG. 2 shows an example of an assembly language code statement including double-precision floating-point instructions, and FIG. 3 shows the use of the floating-point registers and memory when the assembly language code stated as shown in FIG. 2 is executed.
In the assembly language code shown in FIG. 2, a double-precision floating-point value stored in memories MEM[0] and MEM[1] is loaded into floating-point registers FR0 and FR1, respectively, under a double-precision load instruction LD.D (in step S51-1). Next, an arithmetic operation library for an SIN function is called under a call instruction CALL SIN, and the double-precision SIN function value calculated on the basis of the double-precision floating-point value stored in the floating-point register FR0 is stored into floating-point registers FR2 and FR3 (in step S51-2). Next, the double-precision floating-point value stored in the floating-point registers FR2 and FR3 is stored into memories MEM[2] and MEM[3], respectively, under a double-precision store instruction ST.D.
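The register and memory traffic of these steps can be modeled in a few lines (a toy sketch, not the actual processor: registers and memory cells are modeled as 32-bit words, and the `split`/`join` helper names are hypothetical):

```python
import math
import struct

def split(d):
    """Split a double into two 32-bit register words (hypothetical helper)."""
    return struct.unpack(">II", struct.pack(">d", d))

def join(hi, lo):
    """Recombine two 32-bit register words into a double (hypothetical helper)."""
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]

fr = [0] * 16                 # floating-point registers, one 32-bit word each
mem = [0] * 4                 # memory cells, one 32-bit word each
mem[0], mem[1] = split(0.5)   # a double-precision operand in MEM[0], MEM[1]

# LD.D: load the double into the even/odd register pair FR0, FR1 (step S51-1)
fr[0], fr[1] = mem[0], mem[1]

# CALL SIN: the library reads FR0/FR1 and leaves the double-precision
# result in the pair FR2, FR3 (step S51-2)
fr[2], fr[3] = split(math.sin(join(fr[0], fr[1])))

# ST.D: store the double-precision result into MEM[2], MEM[3]
mem[2], mem[3] = fr[2], fr[3]

print(join(mem[2], mem[3]))   # sin(0.5)
```

Note how every double-precision value occupies an even/odd register pair while only the even address (FR0, FR2) is ever named as the operand.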
By assigning only an even-number register address to the operand as above, two successive register areas can be assigned to the operand of a double-precision floating-point instruction.
Also, the processor normally references and assigns both single-precision and double-precision floating-point instruction operands with the use of the same floating-point registers. On this account, when the compiler compiles a single-precision floating-point instruction, just as when it compiles a double-precision one, it has to assign only an even-number register address as an operand so that no register conflict arises with the double-precision floating-point instructions.
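The even-only convention can be sketched as a toy allocator (hypothetical code, for illustration only; real compilers perform far more elaborate register allocation):

```python
def assign_even_registers(operands, num_registers=16):
    """Toy model of the conventional assignment: every floating-point
    operand, whether single-precision ("S") or double-precision ("D"),
    receives an even register address, so that a double-precision
    operand can always occupy the even/odd register pair."""
    assignments = []
    next_even = 0
    for precision in operands:
        if next_even >= num_registers:
            raise RuntimeError("out of floating-point registers")
        assignments.append(next_even)   # always an even address
        # For "D" the odd partner holds the second word; for "S" it is
        # simply left unused, wasting half of the register file.
        next_even += 2
    return assignments

print(assign_even_registers(["S", "S", "S", "S"]))  # [0, 2, 4, 6]
```

Four single-precision operands thus consume eight register slots, with the four odd-numbered registers never touched.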
FIG. 4 shows an example of an assembly language code statement including double- and single-precision floating-point instructions, and FIG. 5 shows the use of the floating-point registers and memory when the assembly language code stated as shown in FIG. 4 is executed.
In the assembly language code shown in FIG. 4, a single-precision floating-point value stored in the memory MEM[0] is loaded into a floating-point register FR4 under a single-precision load instruction LD.S (in step S53-1). Next, a single-precision floating-point value stored in the memory MEM[1] is loaded into a floating-point register FR6 under the single-precision load instruction LD.S (in step S53-2). Then, the single-precision floating-point values stored in the floating-point registers FR4 and FR6 are multiplied under a single-precision multiply instruction MUL.S, and the result of the multiplication is stored into the floating-point register FR0 (in step S53-3). Next, an arithmetic operation library for an SIN function is called under a call instruction CALL SIN, and the double-precision SIN function value calculated on the basis of the single-precision floating-point value stored in the floating-point register FR0 is stored into floating-point registers FR2 and FR3 (in step S53-4). Then, the double-precision floating-point value stored in the floating-point registers FR2 and FR3 is stored into memories MEM[2] and MEM[3], respectively, under the double-precision store instruction ST.D.
That is, when the compiler compiles a floating-point instruction, it will generate an assembly language code having an operand to which only a floating-point register at an even-number address is assigned, whether the floating-point instruction is a single-precision one or a double-precision one.
However, in case only an even-number register address is assigned to an operand, then for example in a program including no double-precision floating-point instructions, or a program including a very small number of double-precision floating-point instructions, about half of all the floating-point registers will not be used; that is, the floating-point registers cannot be used efficiently. If the registers are not usable efficiently, the number of registers will possibly be insufficient for the instructions. When the registers are insufficient in number, data must be saved into the memory so that registers are released for other instructions, and after completion of those instructions, the saved data has to be returned to the registers, which requires more instructions than in case the registers are sufficient in number.
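The resulting shortage can be quantified with a small piece of arithmetic (illustrative only; `spill_count` is a hypothetical name, not part of any real toolchain):

```python
def spill_count(live_values, num_registers=16, even_only=True):
    """Number of live single-precision values that must be spilled to
    memory: under the even-only convention only half of the register
    file is usable for them (illustrative arithmetic, not a real
    register allocator)."""
    usable = num_registers // 2 if even_only else num_registers
    return max(0, live_values - usable)

# With 16 floating-point registers and 10 simultaneously live values:
print(spill_count(10, even_only=True))   # 2 values must be spilled
print(spill_count(10, even_only=False))  # 0: everything fits
```

Each spilled value costs extra store and reload instructions, which is exactly the overhead described above.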
Also, the recent processors such as a CPU and a DSP can perform the operations to be done under one instruction, such as execution of an operation, storage of the operation result, etc., in parallel in the hardware. Such a hardware configuration is generally called a “pipeline configuration”. FIG. 6 shows, for example, the data processing timing of a pipeline-configured processor (will be referred to as “pipeline processor” hereunder) which performs one instruction through seven stages, from instruction fetch (IF) to operation result storage (FWB), with one clock per stage. Such a pipeline processor can apparently perform one instruction with one clock.
However, even such a pipeline data processor cannot perform instructions with an improved efficiency when interlocking takes place between the instructions to be performed.
For example, an assembly language code as shown in FIG. 7 is assumed here, which is structured so as to perform an instruction (MUL.S FR0, FR2, FR4) for multiplying the single-precision floating-point values in the floating-point registers FR2 and FR4 by each other and storing the multiplication result into the floating-point register FR0, and then to perform an instruction (ADD.S FR6, FR0, FR10) for adding the single-precision floating-point values stored in the floating-point registers FR0 and FR10 and storing the addition result into the floating-point register FR6. Even when the data processor executes such an assembly language code, it has to wait for performance of the next instruction (ADD.S FR6, FR0, FR10) until the result of the operation under the instruction (MUL.S FR0, FR2, FR4) is written into the register, as shown in FIG. 8.
That is, in case there is a register-dependency relation between a plurality of instructions, no improvement in the efficiency of instruction performance can be attained in the pipeline processing unless a sufficient margin is provided between the instructions. In the seven-step pipeline data processor as shown in FIG. 6, interlocking will take place between the instructions unless more than five instructions lie between the instructions which are in the register-dependency relation with each other.
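The required separation can be estimated with a minimal model of such a pipeline (the stage numbers, the absence of result forwarding, and the assumption that a register written at the end of one cycle is readable from the next are all illustrative assumptions, and the exact threshold depends on them):

```python
def stall_cycles(gap, write_stage=7, read_stage=2):
    """Stall cycles between two register-dependent instructions in an
    in-order pipeline without result forwarding.  `gap` is the number
    of independent instructions scheduled between the two.  The writer
    deposits its result at the end of stage `write_stage` (FWB); the
    reader needs the value when it reaches stage `read_stage`."""
    value_ready = write_stage + 1          # first cycle the value is readable
    reader_reads = (gap + 1) + read_stage  # cycle the reader reaches its read stage
    return max(0, value_ready - reader_reads)

for gap in range(7):
    print(gap, stall_cycles(gap))   # the stall shrinks as the gap grows
```

Under these assumed stage positions, about five independent instructions must separate two dependent ones before the stall disappears; shifting the assumed read or write timing by one cycle reproduces the "more than five" figure stated above exactly.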
The above interlocking would probably take place less frequently if sufficiently many registers were available. However, the floating-point registers are used with low efficiency as described above. Therefore, the interlocking will take place with a high frequency.
In the assembly language code generated by the conventional compiler, since only an even-number register address is assigned as an operand in a floating-point instruction, the number of instructions increases and interlocking possibly takes place with a high frequency in the pipeline data processor. As a result, with an assembly language code generated by the conventional compiler, a program cannot be executed with a high efficiency.