1. Field of the Invention
The present invention relates generally to a program transformation method, a program transformation system and a storage medium storing a program transformation program. More particularly, the invention relates to a program transformation method and a program transformation system for transforming (compiling) a source program described by a programming language into an object program described by a language (machine language, assembly language and so forth) executable by a computer, a central processing unit (CPU) and the like.
2. Description of the Related Art
FIG. 15 is a block diagram showing an example of a construction of the conventional program transformation system disclosed in Japanese Unexamined Patent Publication No. Heisei 1-118931.
The program transformation system illustrated in FIG. 15 is constructed with a first program storage portion 151, a compiler 152, a second program storage portion 153, a third program storage portion 154, an input data storage portion 155, a program executing portion 156, a fourth program storage portion 157 and a parsing result storage portion 158.
At first, the compiler 152 reads out a source program described by a programming language, such as C language, from the first program storage portion 151, temporarily generates an object program described by a machine language, an assembly language or the like, and stores the temporarily generated object program in the second program storage portion 153.
Here, the temporarily generated object program is the program generated by transforming the source program into codes of machine language, assembly language or the like in the sequential order of description. While the temporarily generated object program is executable by the computer, the central processing unit (CPU) and so forth, the source program has simply been transformed into codes in the order in which it is described. The temporarily generated object program therefore inherently contains redundant portions, which make the size (code size) of the overall object program large. As a result, a large storage capacity is required in the primary storage device which stores the object program. Furthermore, the execution period of the object program becomes long, lowering efficiency.
Therefore, it becomes necessary to generate an efficient and optimal object program. The object program obtained by simply transforming the source program into codes in the sequential order of description, as set forth above, will be hereinafter referred to as "temporary object program" to distinguish it from the optimized final object program.
There are various methods for optimizing the object program. Here, arrangement optimization of the instruction codes of procedures will be discussed. A procedure means a group of processes, such as arithmetic operations, to be executed by the computer or the CPU, and is often called a function or a sub-routine. Throughout the disclosure and claims, such a group of processes will be generally referred to as "procedure".
In a program, it can become necessary, during execution of one procedure (hereinafter referred to as "caller" side procedure), to call another procedure (hereinafter referred to as "callee" side procedure) at a certain portion of the program. Therefore, when the source program is transformed into the object program and the resultant object program is stored in the primary storage device, if the instruction code of the caller side procedure and the instruction code of a callee side procedure closely related to it are physically arranged close to each other, the procedure call instruction can be changed from one for a long jump to one for a short jump.
By this, the code size of the overall object program can be reduced. In conjunction therewith, the execution speed of the object program in the computer or the CPU can be made higher. Arranging the instruction codes which have a high possibility of being executed sequentially in time at physically close positions in the object program is called arrangement optimization of the instruction codes of the procedures.
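By way of illustration only, the effect of placement on the bytes spent on call instructions can be sketched as follows. The instruction sizes, the reach of a short call, the procedure names and the call counts are all hypothetical and are not taken from any real instruction set or from the cited publications. The sketch also simplifies by measuring distance between the start addresses of the procedures.

```python
# Hypothetical encoding parameters (assumptions for illustration only).
SHORT_CALL_BYTES = 2   # assumed size of a short-jump call instruction
LONG_CALL_BYTES = 4    # assumed size of a long-jump call instruction
SHORT_REACH = 256      # assumed maximum distance in bytes for a short call


def call_bytes(layout, sizes, calls):
    """Total bytes spent on call instructions for a given procedure order.

    layout: procedure names in placement order
    sizes:  name -> code size in bytes
    calls:  list of (caller, callee, number_of_call_sites)
    """
    # Assign a start address to each procedure in placement order.
    start = {}
    addr = 0
    for name in layout:
        start[name] = addr
        addr += sizes[name]

    total = 0
    for caller, callee, sites in calls:
        # Simplification: distance is measured between start addresses.
        distance = abs(start[caller] - start[callee])
        per_call = SHORT_CALL_BYTES if distance <= SHORT_REACH else LONG_CALL_BYTES
        total += sites * per_call
    return total


# Hypothetical procedures: "main" calls "helper" from six call sites.
sizes = {"main": 100, "helper": 50, "big_table_init": 400}
calls = [("main", "helper", 6)]

far = call_bytes(["main", "big_table_init", "helper"], sizes, calls)
near = call_bytes(["main", "helper", "big_table_init"], sizes, calls)
print(far, near)
```

Under these assumed sizes, placing the callee directly after the caller lets all six call sites use the short form, while separating them by an unrelated procedure forces the long form.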
Next, the program executing portion 156 reads out a procedure call frequency parsing program from the fourth program storage portion 157 and executes the same. Namely, the program executing portion 156 reads out the temporary object program from the second program storage portion 153. In conjunction therewith, input data which has been input by an operator and stored in the input data storage portion 155 is read out by the program executing portion 156. Then, the program executing portion 156 simulates execution of the temporary object program and, in conjunction therewith, accumulates, for each procedure in the temporary object program, the number of times that procedure calls other procedures. The accumulated result is stored in the parsing result storage portion 158 as a procedure reference frequency parsing result.
Then, the compiler 152 reads out the procedure reference frequency parsing result from the parsing result storage portion 158 and calculates the closeness of the reference relationship between any two procedures. On the basis of the resultant closeness, arrangement optimization of the instruction codes is performed to generate the final object program, which is stored in the third program storage portion 154.
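The flow described above, counting calls during simulated execution and then arranging closely related procedures near each other, can be sketched as follows. The cited publication is not quoted here as to how closeness is calculated or how the arrangement is derived from it, so the closeness measure (calls in either direction) and the greedy ordering below are assumptions made for illustration, as are the procedure names and the call trace.

```python
from collections import Counter


def count_calls(trace):
    """Count how often each (caller, callee) pair occurs in a call trace."""
    return Counter(trace)


def closeness(counts):
    """Assumed closeness measure: calls in either direction between two procedures."""
    close = Counter()
    for (caller, callee), n in counts.items():
        close[frozenset((caller, callee))] += n
    return close


def arrange(procedures, close):
    """Assumed greedy placement: append the unplaced procedure closest to the last one."""
    remaining = list(procedures)
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        best = max(remaining, key=lambda p: close[frozenset((last, p))])
        remaining.remove(best)
        order.append(best)
    return order


# Hypothetical call trace recorded during simulated execution.
trace = [("main", "f")] * 1 + [("main", "g")] * 5 + [("g", "h")] * 3
counts = count_calls(trace)
order = arrange(["main", "f", "g", "h"], closeness(counts))
print(order)
```

With this trace, the frequently related pairs (main, g) and (g, h) end up adjacent, and the rarely called procedure f is placed last.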
On the other hand, FIG. 16 is a block diagram showing an example of a construction of the conventional program transformation system disclosed in Japanese Unexamined Patent Publication No. Heisei 9-34725.
The program transformation system illustrated in FIG. 16 is constructed with a source program storage portion 161, a compiler 162 and an object program storage portion 163, in general.
The compiler 162 is generally constructed with a parsing portion 164, a procedure call occurrence counting portion 165, a code generating portion 166, a procedure call count data storage portion 167, a special space arranged procedure determining portion 168 and an object program output portion 169. Here, a special space means a special region of a finite code size set in a part of a program space.
The parsing portion 164 reads out the source program to be parsed from the source program storage portion 161 and parses the syntax forming the source program. The procedure call occurrence counting portion 165 counts, for each procedure recognized by the parsing portion 164 upon parsing the syntax, the number of times the procedure is called.
The code generating portion 166 performs code generation twice. Namely, in the first code generation, on the basis of the result of parsing by the parsing portion 164, the code generating portion 166 generates a normal code if the syntax is not a procedure call instruction, and generates an instruction code using a normal call instruction if the syntax is a procedure call instruction. In the second code generation, the code generating portion 166 scans the result of the first code generation from its leading end. Then, if a code is a procedure call instruction and the result of an inquiry to the special space arranged procedure determining portion 168 shows that the called procedure is a special space arranged procedure, namely a procedure determined to be arranged within the special space, the normal call instruction code having a large byte count is replaced with a dedicated call instruction code having a smaller byte count.
The procedure call count data storage portion 167 stores, for each procedure, the call count counted by the procedure call occurrence counting portion 165 and the code size of the code generated in the first code generation. On the basis of the call count and the code size of each procedure stored in the procedure call count data storage portion 167, the special space arranged procedure determining portion 168 selects and determines the procedures to be arranged within the special space, giving preference to procedures having greater call counts, so that the sum of the code sizes of the procedures arranged within the special space falls within the code size of the special space.
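The selection described above can be sketched as a simple greedy choice. The text states only that procedures with greater call counts are preferred and that the total code size must fit within the special space. Whether a procedure that does not fit is skipped in favor of smaller ones is not stated, so skipping and continuing is an assumption here, as are the function name and the sample data.

```python
def select_special_space(procs, capacity):
    """Choose procedures for the special space, preferring high call counts.

    procs:    list of (name, call_count, code_size_in_bytes)
    capacity: code size of the special space in bytes
    Returns the names of the chosen procedures in order of selection.
    """
    chosen = []
    used = 0
    # Visit procedures from the most frequently called to the least.
    for name, call_count, size in sorted(procs, key=lambda p: p[1], reverse=True):
        # Assumption: a procedure that does not fit is skipped, and
        # smaller procedures with lower call counts are still considered.
        if used + size <= capacity:
            chosen.append(name)
            used += size
    return chosen


# Hypothetical call counts and code sizes.
procs = [("a", 10, 60), ("b", 8, 50), ("c", 5, 30), ("d", 1, 10)]
chosen = select_special_space(procs, capacity=100)
print(chosen)
```

In this example procedure b is called often but does not fit after a has been placed, which illustrates why a finite special space limits the number of procedures that can benefit.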
The object program output portion 169 makes an inquiry to the special space arranged procedure determining portion 168. When the result shows that the code generated by the code generating portion 166 is the code of a definition portion of a special space arranged procedure, the object program output portion 169 outputs the code to a segment to which an attribute of arrangement in the special space is added. When the code is not the code of the definition portion of a special space arranged procedure, the code is output to a normal segment. Here, a segment means a group of codes serving as the minimum unit of arrangement when the code is arranged within the program space.
As set forth above, the object program output portion 169 separates the special space arranged procedures from the normal procedures. Next, the object program output portion 169 outputs data of a parameter region and so forth, outputs the code portion and the data portion in combination as the object program, and stores the object program in the object program storage portion 163.
With the construction set forth above, the code size of the generated object program can be reduced. In association with this, the program space can be saved. Also, the execution speed of the object program in the computer or the CPU can be made higher.
However, in the conventional program transformation system disclosed in Japanese Unexamined Patent Publication No. Heisei 1-118931, the only procedures subject to arrangement optimization of the instruction codes are those which the user defines in the source program. Therefore, improvement of efficiency of the object program is limited.
Also, in the conventional program transformation system disclosed in Japanese Unexamined Patent Publication No. Heisei 9-34725, since the special space has a finite code size, the procedures which can be arranged within the special space are limited. Therefore, improvement of efficiency of the object program is limited.
On the other hand, when the object program generated by the program transformation system is to be executed by a one-chip microcomputer consisting of a CPU, a decoder and so forth, the object program is stored in an external primary storage device, and each code of the object program is read out sequentially from the primary storage device. Then, after decoding by the decoder, the CPU parses and executes the code. In this case, in order to increase the execution speed of the CPU, a cache memory is provided in the one-chip microcomputer. The cache memory has a small storage capacity and a high access speed, and temporarily stores the codes read out from the primary storage device, which normally has a large storage capacity and a low access speed.
In the one-chip microcomputer provided with the cache memory, each code of the object program read out from the primary storage device cannot be decoded by the decoder, or parsed and executed by the CPU, until it has once been stored in the cache memory. In a one-chip microcomputer of this kind, there are various methods for storing each code read out from the primary storage device in the cache memory. Among these, a direct map method is one such method.
As shown in FIG. 17, in the direct map method, a cache memory 171 is divided into a plurality of storage regions (hereinafter referred to as cache lines). In conjunction therewith, the primary storage device 172 is also divided into storage regions. Each storage region of the primary storage device 172 is associated with one of the cache lines of the cache memory 171.
In FIG. 17, the cache memory 171 consists of five cache lines 171a to 171e. Corresponding to these, the primary storage device 172 is divided into storage regions each having the same storage capacity as each cache line. The storage regions are associated with the five cache lines 171a to 171e in units of five. Namely, the storage regions 172-1a to 172-1e of the primary storage device 172 correspond, as a group, to the cache lines 171a to 171e. Similarly, the storage regions 172-2a to 172-2e correspond to the cache lines 171a to 171e. The final storage regions 172-na to 172-ne (n is a natural number) also correspond to the cache lines 171a to 171e.
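The correspondence of FIG. 17 amounts to taking the position of a storage region modulo the number of cache lines. A minimal sketch follows, assuming that the storage regions are numbered consecutively from zero; this 0-based index and the function names are introduced only for illustration and do not appear in the figure.

```python
NUM_CACHE_LINES = 5      # cache lines 171a to 171e in FIG. 17
LINE_NAMES = "abcde"


def cache_line_for(region_index):
    """Map a 0-based storage region index (assumed numbering) to its cache line."""
    return "171" + LINE_NAMES[region_index % NUM_CACHE_LINES]


def region_label(region_index):
    """Label of a storage region in the style of FIG. 17, e.g. index 5 -> '172-2a'."""
    group = region_index // NUM_CACHE_LINES + 1
    return "172-%d%s" % (group, LINE_NAMES[region_index % NUM_CACHE_LINES])


# Print the correspondence for the first two groups of storage regions.
for i in range(10):
    print(region_label(i), "->", cache_line_for(i))
```

Storage regions whose indices differ by a multiple of five, such as 172-1a and 172-2a, share one cache line, so only one of them can be held in the cache memory at a time.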
When the object program to be executed by a one-chip microcomputer employing the direct map method is generated using the program transformation system, the following drawback is encountered.
For example, when the source program described in C language shown in FIG. 18 is transformed into the object program by the program transformation system, the respective instruction codes of procedures func_A and func_B are stored in the primary storage device 172 as shown in FIG. 19.
In FIG. 19, the instruction code of the procedure func_A is stored in the storage regions 172-1a to 172-1c of the primary storage device 172. Also, the instruction code of the procedure func_B is stored in the storage regions 172-2a to 172-2b of the primary storage device 172. Accordingly, the instruction code of the procedure func_A corresponds to the cache lines 171a to 171c of the cache memory 171. On the other hand, the instruction code of the procedure func_B corresponds to the cache lines 171a and 171b of the cache memory 171.
In such case, when the CPU executes the object program generated by transformation of the source program shown in FIG. 18, the instruction code of the procedure func_A is read out from the storage regions 172-1a to 172-1c of the primary storage device 172 and is once stored in the cache lines 171a to 171c in the cache memory 171, and thereafter decoded by the decoder and parsed and executed by the CPU.
Next, the instruction code of the procedure func_B is read out from the storage regions 172-2a to 172-2b of the primary storage device 172, and temporarily stored in the cache lines 171a to 171b of the cache memory 171. Here, while a part of the instruction code of the procedure func_A has already been stored in the cache lines 171a and 171b of the cache memory 171, the instruction code of the procedure func_B is written over it (overwritten). Therefore, that part of the instruction code of the procedure func_A can no longer be read from the cache memory 171. Thereafter, the instruction code of the procedure func_B stored in the cache lines 171a to 171b of the cache memory 171 is decoded by the decoder and parsed and executed by the CPU.
Next, by the source program shown in FIG. 18, the instruction code of the procedure func_A has to be executed again. However, since the instruction code of the procedure func_B is already stored in the cache lines 171a and 171b of the cache memory 171, a part of the instruction code of the procedure func_A cannot be read out. Therefore, the instruction code of the procedure func_A is again read out from the storage regions 172-1a to 172-1b of the primary storage device 172. Then, the instruction code of the procedure func_A is temporarily stored in the cache lines 171a to 171b of the cache memory 171, decoded by the decoder and parsed and executed by the CPU.
As set forth above, when the instruction codes of two procedures which have a high possibility of being executed sequentially in time are stored in storage regions of the primary storage device 172 corresponding to the same cache lines of the cache memory 171 (this will be referred to as being loaded on the same cache line), all or a part of the instruction code previously read out from the primary storage device 172 and stored in the cache memory 171 is overwritten by the instruction code of the procedure subsequently read out from the primary storage device 172 and written on the same cache line of the cache memory 171. Such a condition is referred to as conflict (cache conflict). If such conflict is caused frequently, the effect of the cache memory in increasing the execution speed of the CPU can be negated. Worse still, the execution speed of the CPU may be lowered.
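The conflict between func_A and func_B can be reproduced with a small simulation of a direct-mapped cache having five lines. The storage regions follow FIG. 19, using an assumed 0-based numbering in which regions 0 to 2 stand for 172-1a to 172-1c and regions 5 and 6 stand for 172-2a and 172-2b. The execution sequence (func_A and func_B alternating three times) and the second, non-conflicting layout are hypothetical and serve only as a comparison.

```python
NUM_CACHE_LINES = 5  # cache lines 171a to 171e


def count_misses(layout, sequence):
    """Count cache line fills for a direct-mapped cache.

    layout:   procedure name -> list of 0-based storage region indices
    sequence: order in which the procedures are executed
    """
    cache = [None] * NUM_CACHE_LINES   # region currently held by each line
    misses = 0
    for proc in sequence:
        for region in layout[proc]:
            line = region % NUM_CACHE_LINES
            if cache[line] != region:
                # The line holds nothing or another region: fetch and overwrite.
                cache[line] = region
                misses += 1
    return misses


# Layout of FIG. 19: both procedures occupy cache lines 171a and 171b.
conflicting = {"func_A": [0, 1, 2], "func_B": [5, 6]}
# Hypothetical alternative: func_B in 172-1d and 172-1e, so no line is shared.
separated = {"func_A": [0, 1, 2], "func_B": [3, 4]}

sequence = ["func_A", "func_B"] * 3   # assumed execution order
misses_conflicting = count_misses(conflicting, sequence)
misses_separated = count_misses(separated, sequence)
print(misses_conflicting, misses_separated)
```

In the conflicting layout every switch between the two procedures refills the two shared cache lines, whereas in the separated layout only the initial fills occur.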
Besides the direct map method, methods for storing each code read out from the primary storage device in the cache memory include a fully associative method, which permits data of the primary storage device to be stored in any cache line of the cache memory, and a set associative method, which is intermediate between the direct map method and the fully associative method and in which a plurality of cache lines are available for storing given data of the primary storage device. With these methods as well, it is possible, as set forth above, for procedures to conflict on the cache memory.
In the conventional program transformation systems disclosed in Japanese Unexamined Patent Publication No. Heisei 1-118931 and Japanese Unexamined Patent Publication No. Heisei 9-34725, no consideration has been given to the conflict set forth above. Therefore, if, as a result of arrangement optimization of the instruction codes of the procedures or arrangement of the procedures in the special space, the instruction codes of two procedures having a high possibility of being executed sequentially in time are loaded on the same cache line of the cache memory 171, conflict is inevitable. Accordingly, even if the code size of the overall object program can be reduced, the execution speed of the CPU cannot be increased.
On the other hand, for a procedure frequently used in execution of the object program, the execution speed of the CPU cannot be increased if the instruction code has to be read out from the primary storage device 172 and stored in the corresponding cache lines of the cache memory 171 every time the procedure is used, either because the instruction code is not stored in the cache memory 171 or because the instruction code cannot be read out due to conflict (these cases are generally called cache misses). Therefore, it becomes necessary to keep the frequently used procedure stored in the cache memory 171 for as long as possible without causing conflict.
However, in the conventional program transformation systems disclosed in Japanese Unexamined Patent Publication No. Heisei 1-118931 and Japanese Unexamined Patent Publication No. Heisei 9-34725, no consideration is given to cache misses in execution of the object program. Accordingly, in this respect as well, the execution speed of the CPU cannot be increased.