1. Field of the Invention
The present invention relates to an optimized program code generator, a method for compiling a source text and a computer-readable medium for a processor capable of operating with a plurality of instruction sets.
2. Prior Art
The use of compilers and assemblers in data processing systems to generate assembly code from high level source code, or object code from assembly code, is well known. Current compilers employ a variety of optimization techniques in order to generate assembly code that executes program routines more efficiently, whereas hand assembly requires considerable skill on the part of the programmer.
On the other hand, some processors are capable of operating on the basis of any one of a plurality of instruction sets in a computer system. The “TX19” distributed by Toshiba Corporation is one example of a CPU of this type. In this case, programmers are responsible for selecting appropriate ones of the available instruction sets, e.g., by repeating the compiling process with different instruction sets.
It is important to switch the operation mode of the CPU so that it operates under an appropriate one of the plurality of instruction sets, from the viewpoint of both the size of the memory space occupied by the target program and the speed of execution of the code. Usually, a programmer has to select the instruction set to be used for each module or function, taking into consideration the purpose of the target program and the restrictions on the usage of functions of the CPU.
FIG. 31 is a block diagram showing a prior art programming language processor. First, the source text 1 of a target program is prepared by coding. Next, the source text 1 is input to a compiler 3, which outputs assembly code 5 through syntax analysis and subsequent processes. The assembly code 5 is then input to an assembler 7, which generates relocatable object code 9. Finally, the object code 9 is input to a linker 11, which generates executable code 13 with absolute addresses from the relocatable object code 9.
In the case that a plurality of instruction sets are available, the programming language processor generates a plurality of executable codes 13 with different instruction sets, and the executable codes are then run respectively in order to compare their speeds of execution.
While the selection of the most effective instruction set from among the available instruction sets has been conducted manually by each programmer on the basis of experience and examination, it is very difficult to make an effective selection because there are many technical factors to be taken into consideration and many combinations of the instruction sets. Conventional compilers make use of the default instruction set unless another is explicitly indicated.
The present invention has been made in order to solve the shortcomings as described above. It is an important object of the present invention to provide an improved optimized program code generator for optimizing a program code for a processor capable of operating on the basis of either one of a plurality of instruction sets in a computer system.
It is another associated object of the present invention to provide an improved method for compiling a source text of the target program for a processor capable of operating on the basis of one of a plurality of instruction sets in a computer system.
It is another associated object of the present invention to provide a computer-readable medium containing improved computer-readable program code for optimizing a program code for a processor capable of operating on the basis of either one of a plurality of instruction sets in a computer system.
In brief, the above and other objects and advantages of the present invention are provided by a new and improved optimized program code generator for optimizing a program code for a processor capable of operating on the basis of either one of a plurality of instruction sets in a computer system, the optimization being carried out by the steps of: (a) reading said program code of a target program to be optimized; (b) estimating the costs of executable instruction sequences respectively to be obtained by translating said program code on the basis of said plurality of instruction sets; and (c) determining an optimum one of said plurality of instruction sets for translating said program code by evaluating the costs as estimated under a predetermined criteria.
In accordance with a further aspect of the present invention, a computer-readable medium is provided which contains computer-readable program code for optimizing a program code for a processor capable of operating on the basis of either one of a plurality of instruction sets in a computer system, the optimization being carried out by the steps of: (a) reading said program code of a target program to be optimized; (b) estimating the costs of executable instruction sequences respectively to be obtained by translating said program code on the basis of said plurality of instruction sets; and (c) determining an optimum one of said plurality of instruction sets for translating said program code by evaluating the costs as estimated under a predetermined criteria.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of executable instruction sequences is carried out by the use of a cost estimation table.
Also, in accordance with a preferred embodiment of the present invention, said cost estimation table includes the cost of each code contained in said object code with respect to the plurality of instruction sets.
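By way of illustration only, table-driven cost estimation of this kind might be sketched as follows. The sketch sums the per-operation costs of a code sequence for each instruction set and selects the instruction set with the lowest total. The instruction set names (`isa32`, `isa16`), the operation names and all cost figures are hypothetical and are not taken from any actual processor.

```python
# Hypothetical cost estimation table: the cost of each operation under each
# instruction set. All names and numbers are invented for illustration.
COST_TABLE = {
    "load":  {"isa32": 4, "isa16": 2},
    "add":   {"isa32": 4, "isa16": 2},
    "mul":   {"isa32": 4, "isa16": 6},
    "store": {"isa32": 4, "isa16": 2},
}


def estimate_costs(operations, table=COST_TABLE):
    """Return {instruction_set: total cost} for a sequence of operations."""
    totals = {}
    for op in operations:
        for isa, cost in table[op].items():
            totals[isa] = totals.get(isa, 0) + cost
    return totals


def choose_instruction_set(operations, table=COST_TABLE):
    """Pick the instruction set with the lowest estimated total cost."""
    totals = estimate_costs(operations, table)
    return min(totals, key=totals.get)
```

In this sketch the cost is a single number per operation; it could equally stand for a code size in bytes or a count of clock cycles, depending on the criteria applied.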
Also, in accordance with a preferred embodiment of the present invention, said program code of the target program to be optimized is assembly code.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of executable instruction sequences is carried out by evaluating the object code corresponding to said program code.
Also, in accordance with a preferred embodiment of the present invention, said object code corresponding to said program code is relocatable.
Also, in accordance with a preferred embodiment of the present invention, said program code of the target program to be optimized is intermediate level language code as output from a parser.
Also, in accordance with a preferred embodiment of the present invention, said program code of the target program is written in Language C.
Also, in accordance with a preferred embodiment of the present invention, the optimized program code generator further comprises, between the step (a) and the step (b), a step (a2) of translating said program code of the target program written in Language C into assembly code by the use of said plurality of the instruction sets.
Also, in accordance with a preferred embodiment of the present invention, the optimized program code generator further comprises, between the step (a2) and the step (b), a step (a3) of detecting dependency of the respective instructions of the assembly code one from another.
Also, in accordance with a preferred embodiment of the present invention, an optimum one of said plurality of instruction sets for translating said program code is determined for each of a plurality of independent instruction sequences which consists only of instructions any one of which can be executed independent from any instruction of another independent instruction sequence.
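One possible way of forming such independent instruction sequences is sketched below, purely as an assumption for illustration. Each instruction is modelled as a hypothetical `(name, reads, writes)` tuple of register sets, two instructions are treated as dependent when one writes a register that the other reads or writes, and the independent sequences are taken to be the connected components of the resulting dependency relation. Dependencies through memory are ignored in this simplified model.

```python
def independent_sequences(instructions):
    """Group instructions into sequences with no dependency between groups.

    Each instruction is a hypothetical (name, reads, writes) tuple, where
    reads and writes are sets of register names. Grouping uses a small
    union-find structure over the instruction indices.
    """
    n = len(instructions)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group representative, compressing paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        _, reads_i, writes_i = instructions[i]
        for j in range(i + 1, n):
            _, reads_j, writes_j = instructions[j]
            # Dependent if either instruction writes a register the other
            # reads, or if both write the same register.
            if writes_i & (reads_j | writes_j) or writes_j & reads_i:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(instructions[i][0])
    return list(groups.values())
```

For example, given two loads into `r1` and `r2` followed by one instruction that reads `r1` and another that reads `r2`, the sketch yields two sequences, each of which could then be assigned its own instruction set.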
Also, in accordance with a preferred embodiment of the present invention, the optimized program code generator further comprises, between the step (c) and the step (d), a step (c2) of determining the order of execution of the independent instruction sequences taking into consideration overhead as introduced by inserting an instruction set switching instruction.
Also, in accordance with a preferred embodiment of the present invention, an optimum one of said plurality of instruction sets for translating said program code is determined for each function included in said program code.
Also, in accordance with a preferred embodiment of the present invention, an optimum one of said plurality of instruction sets for translating said program code is determined for each block of code included in said program code as separated by a branch instruction.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of executable instruction sequences is carried out by taking into consideration overhead as introduced by inserting an instruction set switching instruction for switching instruction sets.
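The effect of switching overhead on the selection can be illustrated with the following minimal sketch, which is an assumed formulation rather than the procedure of any particular embodiment. It takes the estimated cost of each block under each instruction set, charges a hypothetical fixed `switch_cost` whenever two consecutive blocks use different instruction sets, and finds the assignment with the lowest total by dynamic programming. A straight-line execution order of the blocks is assumed.

```python
def assign_instruction_sets(block_costs, switch_cost):
    """Choose an instruction set per block, minimizing total estimated cost.

    block_costs: list of {instruction_set: cost} dicts, one per block, in
                 execution order (a straight-line order is assumed).
    switch_cost: hypothetical fixed cost of one instruction set switching
                 instruction, charged whenever adjacent blocks differ.
    Returns (total_cost, [instruction_set chosen for each block]).
    """
    if not block_costs:
        return 0, []
    # best[isa] = (cost so far, choices so far) with the last block using isa.
    best = {isa: (cost, [isa]) for isa, cost in block_costs[0].items()}
    for costs in block_costs[1:]:
        new_best = {}
        for isa, cost in costs.items():
            candidates = [
                (prev_cost + (0 if prev_isa == isa else switch_cost) + cost,
                 path + [isa])
                for prev_isa, (prev_cost, path) in best.items()
            ]
            new_best[isa] = min(candidates, key=lambda c: c[0])
        best = new_best
    return min(best.values(), key=lambda c: c[0])
```

With zero overhead each block simply receives its individually cheapest instruction set; as the overhead grows, the result tends toward a single instruction set for runs of adjacent blocks.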
Also, in accordance with a preferred embodiment of the present invention, said predetermined criteria for determining an optimum one of said plurality of instruction sets is based upon comparison of the executable instruction sequences respectively to be obtained by translating said program code into said executable instruction sequences on the basis of different instruction sets in terms of the size of a memory space to be occupied by each of said executable instruction sequences.
Also, in accordance with a preferred embodiment of the present invention, said predetermined criteria for determining an optimum one of said plurality of instruction sets is based upon comparison of the executable instruction sequences respectively to be obtained by translating said program code into said executable instruction sequences on the basis of different instruction sets in terms of the number of clock cycles required for running each of said executable instruction sequences.
Also, in accordance with a preferred embodiment of the present invention, said predetermined criteria for determining an optimum one of said plurality of instruction sets is based upon comparison of the executable instruction sequences respectively to be obtained by translating said program code into said executable instruction sequences on the basis of different instruction sets in terms of the size of a memory space to be occupied by each of said executable instruction sequences and the number of clock cycles required for running each of said executable instruction sequences in combination with predetermined weights.
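As one possible reading of a criterion combining code size and clock cycles with predetermined weights, the following sketch uses a simple weighted sum. The linear form, the weight values and the example figures are assumptions made for illustration; the description above does not fix how the weights are applied.

```python
def weighted_cost(size_bytes, cycles, size_weight, cycle_weight):
    """Combine code size and cycle count into one score (lower is better)."""
    return size_weight * size_bytes + cycle_weight * cycles


def choose_by_weighted_cost(estimates, size_weight, cycle_weight):
    """Pick the instruction set with the lowest weighted score.

    estimates: {instruction_set: (size_bytes, cycles)}, hypothetical figures
               as they might be produced by a prior cost estimation step.
    """
    return min(
        estimates,
        key=lambda isa: weighted_cost(*estimates[isa], size_weight, cycle_weight),
    )
```

Setting one weight to zero reduces the criterion to a pure size comparison or a pure speed comparison, so the two single-factor criteria appear as special cases of the weighted one.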
Also, in accordance with a preferred embodiment of the present invention, said predetermined criteria for determining an optimum one of said plurality of instruction sets is based upon comparison of the executable instruction sequences respectively to be obtained by translating said program code into said executable instruction sequences on the basis of different instruction sets in terms of the size of a memory space to be required for running said executable instruction sequence.
Also, in accordance with a preferred embodiment of the present invention, the size of a memory space to be required for running said executable instruction sequence is evaluated as the sum of the static memory of said executable instruction sequence and the size of a memory space to be occupied by said executable instruction sequence.
Also, in accordance with a preferred embodiment of the present invention, the size of a memory space to be required for running said executable instruction sequence is evaluated as the sum of the static memory of said executable instruction sequence, an expected stack memory of said executable instruction sequence, and the size of a memory space to be occupied by said executable instruction sequence.
Also, in accordance with a preferred embodiment of the present invention, the size of a memory space to be required for running said executable instruction sequence is evaluated as the sum of the static memory of said executable instruction sequence, an expected stack memory of said executable instruction sequence, an expected heap memory of said executable instruction sequence, and the size of a memory space to be occupied by said executable instruction sequence.
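The three variants of the memory evaluation described above differ only in which terms enter the sum, which the following minimal sketch makes explicit. The class and field names are hypothetical, and the stack and heap terms default to zero so that the same structure covers all three variants.

```python
from dataclasses import dataclass


@dataclass
class MemoryEstimate:
    """Hypothetical per-instruction-set memory figures, in bytes."""
    code_size: int           # space occupied by the instruction sequence
    static_memory: int       # statically allocated data
    expected_stack: int = 0  # omitted (zero) in the simplest variant
    expected_heap: int = 0   # included only in the fullest variant

    def total(self):
        """Memory required for running the sequence: the sum of all terms."""
        return (self.code_size + self.static_memory
                + self.expected_stack + self.expected_heap)


def smallest_footprint(estimates):
    """Pick the instruction set whose total memory requirement is lowest."""
    return min(estimates, key=lambda isa: estimates[isa].total())
```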
Also, in accordance with a preferred embodiment of the present invention, the optimized program code generator further comprises, between the step (a) and the step (b), a step (a3) of inlining at least one function contained in said program code of the target program into the function which calls said at least one function.
Also, in accordance with a preferred embodiment of the present invention, said function inlined into the function which calls that function is a leaf function which does not call another function.
Also, in accordance with a preferred embodiment of the present invention, said function inlined into the function which calls that function is a leaf function which does not call another function and which is called from a number of different locations in said program code that does not exceed a predetermined number.
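A selection of inlining candidates along these lines might be sketched as follows. The sketch assumes that the restriction is an upper bound on the number of call sites, and it represents the program as a hypothetical mapping from each function to the list of functions it calls, with one list entry per call site. Neither the representation nor the function names come from the description above.

```python
def inline_candidates(call_graph, max_call_sites):
    """Return leaf functions called from at most max_call_sites locations.

    call_graph: {function: [callee, ...]} with one list entry per call site;
                every function of the program is assumed to appear as a key.
    """
    # Count how many call sites refer to each function.
    call_sites = {}
    for callees in call_graph.values():
        for callee in callees:
            call_sites[callee] = call_sites.get(callee, 0) + 1

    # A leaf function has an empty callee list; functions that are never
    # called are skipped because there is nothing to inline them into.
    return sorted(
        name for name, callees in call_graph.items()
        if not callees and 0 < call_sites.get(name, 0) <= max_call_sites
    )
```

Bounding the number of call sites limits the code growth caused by duplicating the function body at each call, which matters when the size of the memory space is part of the criteria.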
In accordance with a further aspect of the present invention, a method for compiling a source text of a target program for a processor capable of operating on the basis of one of a plurality of instruction sets in a computer system comprises the steps of: (a) reading said source text of the target program to be compiled into a memory; (b) estimating the costs of object code respectively to be obtained by compiling said source text of the target program on the basis of said plurality of instruction sets; (c) determining an optimum one of said plurality of instruction sets for translating said program code by evaluating the costs as calculated under a predetermined criteria; and (d) outputting the object code as compiled in accordance with the determination of the optimum instruction set.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of the object code is carried out by evaluating the respective instructions contained in the object code obtained by compiling said source text of the target program on the basis of each instruction set.
Also, in accordance with a preferred embodiment of the present invention, said object code corresponding to said program code is relocatable.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of the object code is carried out by evaluating intermediate level language code as output from a parser.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of executable instruction sequences is carried out by the use of a cost estimation table.
Also, in accordance with a preferred embodiment of the present invention, said cost estimation table includes the cost of each code contained in said object code with respect to the plurality of instruction sets.
Also, in accordance with a preferred embodiment of the present invention, an optimum one of said plurality of instruction sets is determined for each function included in said program code.
Also, in accordance with a preferred embodiment of the present invention, an optimum one of said plurality of instruction sets is determined for each block of code included in said program code as separated by a branch instruction.
Also, in accordance with a preferred embodiment of the present invention, the step of estimating the costs of the object code is carried out by taking into consideration overhead as introduced by inserting an instruction set switching instruction for switching instruction sets.
Also, in accordance with a preferred embodiment of the present invention, said predetermined criteria for determining an optimum one of said plurality of instruction sets is based upon comparison of the assembly code respectively obtained on the basis of different instruction sets in terms of the size of a memory space to be occupied by each of said executable instruction sequences.
Also, in accordance with a preferred embodiment of the present invention, said predetermined criteria for determining an optimum one of said plurality of instruction sets is based upon comparison of the assembly code respectively obtained on the basis of different instruction sets in terms of the number of clock cycles required for running each of the object codes.