This application is based on application No. 11-195717 filed in Japan, the content of which is hereby incorporated by reference.
1. Field of the Invention
The present invention relates to a compiling device translating a source program into an object program such as a machine language program or an assembler program, and in particular, to improvements achieved when the source program includes a section written in high-level programming language and a section written in assembly language.
2. Description of the Background Art
Recent developments in high-performance integrated microprocessors have led to such microprocessors being used in information processing devices performing multimedia processing such as communication, video processing and audio processing.
The design of programs used in multimedia processing has become increasingly unwieldy, so that problems regarding the development cost, maintenance and portability of such programs have multiplied in recent years. Consequently, there is a great demand for development environments using high-level programming languages, of which C and C++ have become particularly popular. However, a look at the current state of development environments for multimedia processing reveals that multimedia processing is often written in assembly language, which is close to machine language. This seems to go against the prevailing demand for development in high-level programming languages. Sections of the program that are frequently executed and require a short execution time are written in assembly language. Such sections form the bulk of the mass data processing functions occurring in multimedia processing.
There are several reasons for writing sections of the program in assembly language. First, the translation ability of the compiler is limited. In addition, microprocessors have some machine language instructions which achieve a plurality of functions. Such instructions are increasingly found as multimedia processing instructions, and cannot be efficiently written in high-level programming language, nor be efficiently translated into machine language instructions by the compiler.
Suppose that the microprocessor targeted by the compiler uses a load/store method (a method in which memory operands are accessed only by transfer instructions, and calculation instructions operate on registers rather than directly on memory), and that its division instruction div is as shown in FIG. 1D. In FIG. 1D, the division result is stored in a register rn, and the remainder in a register mdr. This means that the division instructions used in a majority of microprocessors can perform division and remainder calculation simultaneously using one instruction.
In contrast, when the high-level programming language C is used, and the division and remainder results of variables a and b are to be set respectively as variables c and d, this process can only be written as shown in FIG. 1A. As a result, many compilers generate separate div instructions for each of ‘/’ calculation and ‘%’ calculation, as shown in FIG. 1B, (1) and (2). Conventionally, execution of a div instruction requires a greater number of execution cycles than other instructions, so that ideally only one div instruction should be used to perform simultaneous division and remainder calculation, as shown in FIG. 1D. In FIGS. 1B and 1D, registers r2, r3, r4 and r5 are allocated to variables a, b, c and d respectively, and ‘mov rm, rn’ signifies that the value in register rm is to be transferred to register rn.
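The C-level description discussed above can be sketched as follows. FIG. 1A is not reproduced here, so this is only an assumed rendering of it, and the function names quot_rem_separate and quot_rem_div are hypothetical. The standard library function div from <stdlib.h> is shown purely for comparison: it returns both results from one call, but whether a given compiler maps it onto a single div instruction depends on the implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the C-level description: the quotient and
   the remainder are written as two separate operations. */
void quot_rem_separate(int a, int b, int *c, int *d)
{
    assert(b != 0);   /* division by zero is undefined in C */
    *c = a / b;       /* '/' calculation */
    *d = a % b;       /* '%' calculation */
}

/* Comparison only: the standard div() returns quotient and remainder
   together in a div_t structure. */
void quot_rem_div(int a, int b, int *c, int *d)
{
    assert(b != 0);
    div_t r = div(a, b);
    *c = r.quot;
    *d = r.rem;
}
```

Both helpers produce the same pair of results; the difference of interest is only in how many div instructions a compiler may emit for them.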
Consequently, in the case of most C compilers, a programmer uses asm statements as extended language specifications, so that program description written in C is mixed with description written in assembly language. The example program sections shown in FIG. 1 may be written as assembler statements following the keyword asm, as shown in FIG. 2. Here, operands of the asm statements that are written as C-level variables are replaced by the registers or memory locations that the compiler allocates to those variables. For example, if registers r2, r3, r4 and r5 are allocated to variables a, b, c and d respectively, as in FIG. 1, the output of the compiler is as shown in FIG. 2B. FIG. 2B has two more transfer instructions than the ideal situation of FIG. 1D but, since it has only one div instruction, requires less execution time than FIG. 1B. In addition, if conventional copy propagation optimization (this technique is described in reference 1, listed later in this specification) is applied to FIG. 2B, the program section shown there can be changed to one similar to FIG. 1D.
Furthermore, if it is desirable to insert instructions capable of performing both division and remainder calculation at a plurality of places in the program, a macro is defined as in FIG. 3A, and if this macro is used as shown in FIG. 3B, efficiency is increased. Furthermore, the descriptor is of the same kind as that used to call a function, so the program becomes easier to read. Note that FIG. 3B shows a situation in which a C language compiler replaces x, y, z and w with a, b, c and d, as in FIG. 2A. Replacement of variables by a C compiler in this way is known as macro generation. Descriptors that are macro-defined assembler statement sequences having a special function, such as dm in FIG. 3A, are known as ‘inline assembly routines’.
However, if inline assembly routines are used in a conventional compiler, a programmer needs to carry out first and second check operations (described below). As a result, programmers are somewhat reluctant to include inline assembly routines when programming.
The program includes a plurality of variables. When a value of a certain variable x is valid for an entire inline assembly routine, the first check operation involves thoroughly checking the object program generated by the compiler to determine whether the value of the variable x has been destroyed. A register r is allocated to the variable x by a process performed by the compiler, so that if the inline assembly routine defines the register r, the value of the register r will differ before and after the inline assembly routine.
Suppose, as shown in FIG. 3C, that a value for a variable a defined at definition point (1) is used at use point (2) and variable a has a live range which extends over the inline assembly routine dm. A register r1 is allocated to the variable a, and if the value of register r1 is changed during the inline assembly routine dm, a value of a which differs from that defined at definition point (1) will be used at use point (2) in FIG. 3C.
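The hazard described above can be imitated with a toy register file. The following is a minimal sketch, not the routine of FIG. 3A: the type Machine, the function dm_routine, and the assumption that the routine uses r1 as a scratch register are all hypothetical and serve only to show how a value defined at definition point (1) can differ at use point (2).

```c
#include <assert.h>

enum { NUM_REGS = 8 };

/* Toy register file; reg[n] models register rn. */
typedef struct { int reg[NUM_REGS]; } Machine;

/* Hypothetical model of an inline assembly routine: it reads r2 and r3,
   writes the quotient to r4 and the remainder to r5, and (by assumption)
   defines r1 as a scratch register along the way. */
void dm_routine(Machine *m)
{
    assert(m->reg[3] != 0);
    m->reg[1] = m->reg[2];              /* r1 is overwritten here */
    m->reg[4] = m->reg[1] / m->reg[3];
    m->reg[5] = m->reg[1] % m->reg[3];
}

/* Variable a is held in register a_reg while the routine runs.
   Returns the value of a observed at the use point. */
int value_of_a_after_dm(int a_reg, int a_value)
{
    assert(a_reg >= 0 && a_reg < NUM_REGS);
    Machine m = {{0}};
    m.reg[2] = 17;                      /* inputs of the routine */
    m.reg[3] = 5;
    m.reg[a_reg] = a_value;             /* definition point (1) */
    dm_routine(&m);                     /* live range of a spans the routine */
    return m.reg[a_reg];                /* use point (2) */
}
```

With a held in r1 the value read back is the scratch value left by the routine; with a held in a register the routine does not define, such as r6 in this sketch, the original value survives.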
In the second check operation, the programmer makes a careful check to determine whether values of parameters have been destroyed, after determining how such values are defined by the inline assembly routine.
In some parameters, values are transferred using specified registers r. The live ranges of such parameters may extend over the inline assembly routine. Here, if a register r is defined by the inline assembly routine, the value of the parameter will be different before and after the inline assembly routine. For example, if, as in FIG. 3D, a parameter p is referenced after the use point of the inline assembly routine dm, and the parameter p is passed to a register r0, it is clear that the value of the register r0 will be destroyed by the inline assembly routine dm, and the value used when the parameter p is referenced will be inaccurate.
These kinds of checks generally create a heavy workload for the programmer, and in an attempt to lighten this burden, many programmers write programs including various restrictions. For example, programmers only use inline assembly routines for functions including variables that can definitely be judged as being allocated to memory or a specified register prior to compiling (global variables and the like). In addition, registers used by parameters are not defined in the inline assembly routine.
Programmers would ideally like to express what were originally independent functions as an inline assembly routine, and use such routines to increase effectiveness, but the above-described restrictions hamper such efforts. As a rule, a programmer must use the first and second check operations to determine how the inline assembly routine has been defined. This means that it is difficult for the programmer to use the inline assembly routine with no knowledge of its detailed content, as is necessary when dealing with so-called black boxes (that is, programs or program sections whose operational code is confidential or otherwise unknown). Accordingly, it is difficult to increase the reusability of inline assembly routines by changing them into library routines.
A first object of the present invention is to provide a compiling device that translates a program without requiring the programmer to make checks when an inline assembly routine is used.
A second object of the present invention is to provide a compiling device that translates a program so as to enable inline assembly routines to be inserted as black boxes.
A third object of the present invention is to provide a compiling device enabling inline assembly routines to be changed to library routines, thereby increasing the reusability of the program.
As described above, the present invention is a compiling device that translates a program including statements using variables into an object instruction sequence. Assembler instructions defining values for resources are arranged in a section of the program. The compiling device includes a variable detecting unit for detecting variables whose live ranges overlap the section from variables having values defined in the statements, and a resource allocating unit for allocating to each variable detected by the variable detecting unit, a resource different from the resources having values defined in the assembler instructions.
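The two units named above can be sketched in a few lines of C. This is an illustrative sketch under stated assumptions, not the disclosed implementation: live ranges are modeled as inclusive ranges of statement indices, registers as bits of a mask, and the names Range, overlaps and pick_register are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Live range or program section as inclusive statement indices. */
typedef struct { int start, end; } Range;

/* Detection step: does a live range overlap the assembler section? */
bool overlaps(Range a, Range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/* Allocation step: return the lowest-numbered register usable for a
   variable, or -1 if none is free.
   asm_defined: bit n is set if rn is defined by the assembler instructions.
   in_use:      bit n is set if rn is already held by an interfering variable. */
int pick_register(Range var, Range asm_section,
                  unsigned asm_defined, unsigned in_use, int num_regs)
{
    assert(num_regs > 0 && num_regs <= 32);
    unsigned forbidden = in_use;
    if (overlaps(var, asm_section))
        forbidden |= asm_defined;   /* keep away from registers the section defines */
    for (int r = 0; r < num_regs; r++)
        if (!(forbidden & (1u << r)))
            return r;
    return -1;
}
```

A variable whose live range does not reach the section is free to use the registers the section defines, so the restriction costs registers only where it is needed.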
This means that the programmer is not obliged to make checks when an inline assembly routine is used. Since these conventional checks are no longer necessary, assembler statements and inline assembly routines can be written at arbitrary positions in the program. Using inline assembly routines actively in this way improves the execution speed and reusability of the program.
In this invention, the program may be embodied by a plurality of functions, the statements are included in the functions, and assembler instructions are included in at least one of the functions. Here, the compiling device further includes a register parameter replacing unit that, when formal parameters that should use registers exist, (1) generates substitution instructions for substituting temporary variables for the values of each formal parameter used in the functions, and inserts each of the generated substitution instructions at the start of a corresponding function, and (2) replaces all of the formal parameter values in the functions with the temporary variables indicated by the substitution instructions. Furthermore, the variable detecting unit detects temporary variables whose live ranges overlap the section; and the resource allocating unit allocates a register different from the registers whose values are defined in the assembler instructions to each detected temporary variable.
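At source level, the effect of the register parameter replacing unit can be pictured as follows. This is a hand-written illustration, assuming a hypothetical function scale whose parameters p and k are passed in registers; the names scale_original, scale_replaced, tmp_p and tmp_k do not come from the specification, and a real compiler would perform the replacement on its intermediate representation rather than on C text.

```c
/* Hypothetical function as the programmer wrote it. Both formal
   parameters are assumed to be passed in registers. */
int scale_original(int p, int k)
{
    int q = k * 2;
    /* an inline assembly routine would be expanded here */
    return p + q;
}

/* The same function after the replacement:
   (1) substitution instructions are inserted at the start of the function;
   (2) every later reference to p and k uses the temporaries instead. */
int scale_replaced(int p, int k)
{
    int tmp_p = p;   /* substitution instruction for p */
    int tmp_k = k;   /* substitution instruction for k */
    int q = tmp_k * 2;
    /* an inline assembly routine would be expanded here */
    return tmp_p + q;
}
```

After the replacement, the parameter registers are live only up to the substitution instructions, while the temporaries are ordinary variables that the resource allocating unit can place in registers the assembler instructions do not define.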
Here, a register that has been allocated to a parameter is no longer incorrectly updated in the assembly statements, and the same applies to variables whose live ranges span the assembler statements. This means that inline assembly routines can be freely defined, and used as a black box at an arbitrary position in the program. Using inline assembly routines actively in this way improves the execution speed and reusability of the program.
The variable detecting unit may also include a detecting unit for detecting assembler instructions from the program, a temporary variable generating unit for generating, when assembler instructions are detected, temporary variables already allocated to resources defined in the assembler instructions, a live range setting unit setting the live range of each generated temporary variable to be equal to the section where the assembler instructions are arranged; and a variable detecting unit detecting variables whose live range overlaps the live range set for the temporary variables. Here, the resource allocating unit allocates a resource different from the resources allocated to the temporary variables to each of the detected variables.
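The arrangement above, in which the registers defined by the assembler instructions are represented by already-allocated temporary variables, can be sketched as follows. The sketch assumes a simple greedy allocator over inclusive live ranges; it is not the method of reference 2 or reference 3, and the names Node, add_asm_temporaries and allocate are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_NODES = 16, NUM_REGS = 4, UNALLOCATED = -1 };

typedef struct {
    int start, end;   /* live range as inclusive statement indices */
    int reg;          /* allocated register, or UNALLOCATED */
} Node;

static bool interfere(const Node *a, const Node *b)
{
    return a->start <= b->end && b->start <= a->end;
}

/* For each register defined in the assembler section, append a temporary
   variable that is already allocated to that register and whose live range
   equals the section. Returns the new node count. */
int add_asm_temporaries(Node nodes[], int count,
                        int sec_start, int sec_end, unsigned defined_mask)
{
    for (int r = 0; r < NUM_REGS; r++)
        if (defined_mask & (1u << r)) {
            assert(count < MAX_NODES);
            nodes[count++] = (Node){ sec_start, sec_end, r };
        }
    return count;
}

/* Greedy allocation: each unallocated node receives the lowest register
   not held by any allocated node that interferes with it. Returns false
   if some node could not be given a register. */
bool allocate(Node nodes[], int count)
{
    bool ok = true;
    for (int i = 0; i < count; i++) {
        if (nodes[i].reg != UNALLOCATED)
            continue;                     /* pre-allocated temporary */
        unsigned taken = 0;
        for (int j = 0; j < count; j++)
            if (j != i && nodes[j].reg != UNALLOCATED &&
                interfere(&nodes[i], &nodes[j]))
                taken |= 1u << nodes[j].reg;
        for (int r = 0; r < NUM_REGS; r++)
            if (!(taken & (1u << r))) { nodes[i].reg = r; break; }
        if (nodes[i].reg == UNALLOCATED)
            ok = false;
    }
    return ok;
}
```

Because the temporaries are treated like any other allocated variable, no special case is needed in the allocator: the ordinary interference check is what keeps program variables out of the registers the section defines.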
Allocated temporary variables are generated from the assembler statements, and resource allocation is performed using a method that integrates these generated temporary variables. In addition, the resource allocation method disclosed in reference 2, which actively limits the generation of transfer instructions, and the resource allocation method disclosed in reference 3, which is an expansion method of the widely known graph coloring technique, may be applied.