1. Field of the Invention
The present invention relates to a program transformation processing system and a program transformation processing method, and more specifically to a program transformation processing system and a program transformation processing method, for optimizing a language processing program such as a compiler.
2. Description of Related Art
A compiler is a language processing program for translating a source program written in a high level language into a machine language program which can be executed by a computer, namely, an object module (called "object" hereinafter). In general, software processed by this kind of language processing program is required to execute at a high processing speed. To realize high speed execution, the object (which is the result of the language processing) must be small in size and fast in execution. To meet these requirements, language processing programs have adopted various optimizing methods, for example, deleting or modifying redundant operations in the object, and replacing an operation needing a relatively long execution time with an equivalent operation requiring only a short execution time.
In particular, according to the extent to be optimized, optimization is divided into local optimization and global optimization. Local optimization processes a program extent (the extent of a basic block) in which expressions and assignment statements are executed consecutively in unchanged order, with no branch from the middle of the extent to the outside and no branch from the outside into the middle of the extent. Global optimization, on the other hand, processes an extent exceeding the basic block, such as a syntax element in which branches occur at a plurality of positions, for example, a compound statement, a for statement (repeating statement), a procedure, or a function. Since global optimization must analyze and convert the program over a wide extent, the compiling time becomes correspondingly long, but more sophisticated optimization can be realized. Therefore, global optimization has a great advantage.
Referring to FIG. 1, there is shown a block diagram illustrating a conventional program transformation processing system configured to perform the global optimization. This conventional program transformation processing system includes a source file 1 for storing a source program, a language processing part 2 receiving the source program for executing the optimizing processing on the received source program so as to generate an object code, and an object file 9 receiving and storing the generated object code.
The language processing part 2 includes a syntax analysis part 3 receiving the source program for analyzing the syntax so as to generate intermediate codes corresponding to the source program, an intermediate file 4 receiving and storing the generated intermediate codes, a data-flow analyzing part 5 receiving the intermediate codes for executing a data-flow analysis, an optimization processing part 6 for performing the optimization processing on the basis of the result of the data-flow analysis, another intermediate file 7 for storing intermediate data which is the result of the optimization processing, and a code generator 8 receiving the intermediate data for generating object codes.
Next, the conventional program transformation processing method will be described with reference to FIG. 1. First, the syntax analysis part 3 receives the source program from the source file 1, and analyzes the content of the received source program and generates the intermediate codes by transforming the source program into a form which can be language-processed. The intermediate codes are stored in the intermediate file 4. Then, the data-flow analyzing part 5 receives the intermediate codes from the intermediate file 4, and executes the data-flow analysis which will be explained later. Thereafter, on the basis of the result of the data-flow analysis, the optimization processing part 6 performs various optimization processing, for example, common subexpression elimination, and register allocation. The intermediate data, which is the result of the optimization processing, is stored in the intermediate file 7. The code generator 8 receives the intermediate data from the intermediate file 7 and generates the object codes, which are stored in the object file 9.
Now, the global data-flow analysis and the manner of optimization will be explained with reference to a first conventional program transformation method for performing common subexpression elimination, described in A. V. Aho et al., "Compilers: Principles, Techniques, and Tools", Addison-Wesley Publishing Company, pages 627-631 and 633-634.
If two expressions (in general, two subexpressions) are equivalent to each other, a single evaluation suffices for both. When a given subexpression is first evaluated, the result of the evaluation is stored in a temporary variable, and thereafter the variable is substituted for the later occurrences of the common subexpression. This is called "common subexpression elimination".
In the global data-flow analysis, the intermediate codes are read out from the intermediate file and divided into basic blocks, each of which is a unit in which all included statements are executed consecutively. Then, information concerning the flow of control is added to the set of basic blocks thus prepared, so that a flow graph is generated. The flow graph expresses the flow of processing in the form of a directed graph, in which each node consists of one string of statements or intermediate codes executed sequentially, and each edge represents a flow of control connecting two nodes.
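The block division and flow graph construction described above can be sketched as follows. This is only a minimal illustration, not the implementation of the conventional system: the tuple-based instruction format ("assign", "if_goto", "goto"), the use of instruction indices as jump targets, and the function name build_flow_graph are all assumptions introduced here. The code list is assumed to be non-empty and every jump target is assumed to be a valid instruction index.

```python
def build_flow_graph(code):
    """Split a list of intermediate codes into basic blocks and edges.

    Hypothetical instruction format (an assumption for this sketch):
      ("assign", dst, expr), ("if_goto", cond, target), ("goto", target)
    where target is the index of another instruction in `code`.
    """
    # A block starts at the first instruction, at every jump target,
    # and at every instruction that follows a jump.
    leaders = {0}
    for i, ins in enumerate(code):
        if ins[0] in ("goto", "if_goto"):
            leaders.add(ins[-1])
            if i + 1 < len(code):
                leaders.add(i + 1)

    starts = sorted(leaders)
    ends = starts[1:] + [len(code)]
    blocks = [list(range(s, e)) for s, e in zip(starts, ends)]
    block_of = {s: b for b, s in enumerate(starts)}

    # Edges follow the flow of control between blocks.
    edges = set()
    for b, idxs in enumerate(blocks):
        last = code[idxs[-1]]
        if last[0] in ("goto", "if_goto"):
            edges.add((b, block_of[last[-1]]))   # jump edge
        if last[0] != "goto" and b + 1 < len(blocks):
            edges.add((b, b + 1))                # fall-through edge
    return blocks, sorted(edges)


code = [
    ("assign", "a", "x+y"),    # 0
    ("if_goto", "a>0", 4),     # 1
    ("assign", "b", "x+y"),    # 2
    ("goto", 5),               # 3
    ("assign", "b", "x-y"),    # 4
    ("assign", "c", "x+y"),    # 5
]
blocks, edges = build_flow_graph(code)
```

For this small "if" example the sketch yields four blocks, with the first block branching to the second and third, and both of those flowing into the fourth.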
FIG. 2 illustrates one example of a source program for which the optimization processing is performed. The source program shown in this figure contains an "if" statement in which an alternative selection (one out of two) is performed in accordance with a condition. Referring to FIG. 3A, which illustrates a flow graph corresponding to the source program shown in FIG. 2, the source program shown in FIG. 2 is divided into four basic blocks B1, B2, B3 and B4. Here, the first block B1 is called an "initial node". As shown in FIG. 3A, each of the blocks B1 and B3, which include the "if" statement, branches in two.
Then, information on available subexpressions, which is the information required to eliminate common subexpressions, is sought. In the following, an available subexpression will be called an available expression. Referring to FIG. 3B, which is a flow graph illustrating the available expression, an expression x+y is available at a point p at an arbitrary position if every path from the initial node (block B1B) to the point p (set in the block B5B in this example) evaluates the expression x+y, and after the last such evaluation prior to reaching the point p there is no subsequent assignment to x or y. In addition, a block Bi is said to kill the expression x+y if x or y is assigned in the block Bi and the expression x+y is not subsequently recomputed in that block. The block B5B in FIG. 3B is such a block Bi. Conversely, the expression x+y is said to be generated in a block Bj if the block Bj evaluates the expression x+y and neither x nor y is subsequently redefined in that block. The blocks B2B and B3B are such blocks Bj.
A first processing for seeking the information on available expressions is to read the previously prepared flow graph, and to seek, for each basic block, a set of subexpressions generated (called a "generated expression") and a set of subexpressions killed (called a "killed expression"). Referring to FIG. 4, which is a flow chart illustrating an algorithm for obtaining the generated expression set and the killed expression set, firstly, the generated expression set and the killed expression set are each initialized to an empty set (Step P1). Then, whether or not the end of the basic block has been reached is examined (Step P2). If the end has been reached, the processing is ended. If not, the subexpression to be considered is read (Step P3). For this subexpression to be considered, "○○ = △△ op □□", the expression "△△ op □□" is added into the generated expression set (Step P4). In addition, whether or not an expression involving "○○" is included in the generated expression set is examined (Step P5). If one is included, the found expression is added into the killed expression set and is eliminated from the generated expression set (Step P6). Thereafter, the next expression is considered (Step P7), and the processing returns to the Step P2, in which whether or not the end of the basic block has been reached is examined again. By repeating the above mentioned steps P2 to P7, the generated expression set and the killed expression set are obtained, respectively.
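The loop of Steps P1 to P7 can be sketched roughly as below. This is a hedged illustration rather than the procedure of FIG. 4 itself: the statement format (dst, left, op, right) and the function name gen_kill are assumptions, and the killed set is taken over a whole expression set passed in as universe, following the convention of the cited textbook, instead of only over expressions already in the generated set.

```python
def gen_kill(block, universe):
    """Compute the generated and killed expression sets of one basic block.

    block:    list of statements (dst, left, op, right), a hypothetical
              format standing for "dst = left op right"
    universe: set of all expressions (left, op, right) in the program
    """
    gen, kill = set(), set()                 # Step P1: both sets start empty
    for dst, left, op, right in block:       # Steps P2, P3, P7: visit each statement
        expr = (left, op, right)
        gen.add(expr)                        # Step P4: right side is generated
        kill.discard(expr)                   # recomputed, so no longer killed
        # Steps P5, P6: every expression that uses the assigned variable
        # is killed and leaves the generated set.
        dead = {e for e in universe if dst in (e[0], e[2])}
        gen -= dead
        kill |= dead
    return gen, kill


universe = {("x", "+", "y"), ("a", "*", "b")}
block = [("t", "x", "+", "y"),   # t = x + y
         ("x", "a", "*", "b")]   # x = a * b  (kills x + y)
gen, kill = gen_kill(block, universe)
```

In this example x+y is generated by the first statement but killed by the assignment to x in the second, so only a*b remains in the generated set.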
Next, an available expression is sought. Here, it is defined that a whole set of subexpressions appearing at a right side of statements in a program is "U", a set of available expressions at the initial point in a block Bi is "in[Bi]", a set of available expressions at the final point in the block Bi is "out[Bi]", and the generated expression set and the killed expression set in the block Bi are "e_gen[Bi]" and "e_kill[Bi]", respectively. Under this definition, the following equations hold:
out[Bi] = e_gen[Bi] ∪ (in[Bi] − e_kill[Bi])
in[Bi] = ∩ out[P] for Bi not initial, where the intersection is taken over every predecessor P of Bi
in[B1] = ∅ where B1 is the initial block
As is well known to persons skilled in the art, an algorithm for solving these equations is already established. Therefore, by applying this algorithm, it is possible to obtain the set of available expressions "in[Bi]" for each block Bi.
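One commonly used way of solving such equations is to iterate them until no set changes. The following is a minimal sketch of that iteration under stated assumptions, not necessarily the algorithm used in the conventional system: the function name available_expressions and the dictionary-based inputs are hypothetical, and every block other than the initial block is assumed to have at least one predecessor.

```python
def available_expressions(blocks, preds, e_gen, e_kill, universe, initial):
    """Iteratively solve the in/out equations for available expressions.

    blocks:   list of block names
    preds:    dict mapping a block name to the list of its predecessors
    e_gen, e_kill: dicts mapping a block name to a set of expressions
    universe: the whole set U of expressions
    initial:  name of the initial block
    """
    # The initial block starts with nothing available; every other block
    # starts optimistically with the whole set U.
    in_ = {b: set() if b == initial else set(universe) for b in blocks}
    out = {b: e_gen[b] | (in_[b] - e_kill[b]) for b in blocks}

    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b != initial:
                # in[B] is the intersection of out[P] over all predecessors P.
                in_[b] = set.intersection(*(out[p] for p in preds[b]))
            new_out = e_gen[b] | (in_[b] - e_kill[b])
            if new_out != out[b]:
                out[b], changed = new_out, True
    return in_, out


blocks = ["B1", "B2", "B3", "B4"]
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
universe = {"x+y", "a*b"}
e_gen = {"B1": set(), "B2": {"x+y"}, "B3": {"x+y"}, "B4": set()}
e_kill = {b: set() for b in blocks}
in_, out = available_expressions(blocks, preds, e_gen, e_kill, universe, "B1")
```

Because both B2 and B3 generate x+y in this example, the expression is available at the entry of B4; if only one of them generated it, the intersection would leave in["B4"] empty.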
Elimination of global common subexpressions, which is the object of the processing, is performed as follows in the optimization processing part 6 using the information of available expressions obtained as mentioned above.
If the expression y+z is available at the head of a given block, and if the given block includes a statement "s" having the content x=y+z, the processing of the following steps K1 to K4 is performed. It is assumed, however, that neither y nor z is defined between the head of the block and the statement "s".
In the step K1, in order to find the evaluations of the expression y+z that reach the block including the statement "s", the edges of the flow graph are scanned from the block including the statement "s" in the reverse direction. However, once a block evaluating the expression y+z is found, the block or blocks preceding the found block are not scanned. In each block thus detected, the last evaluation of the form w=y+z is the evaluation of the expression y+z that reaches the statement "s".
Next, in the step K2, a new variable "u" is generated.
Then, in the step K3, the statement w=y+z detected in the step K1 is replaced with the following statements:
u=y+z
w=u
Subsequently, in the step K4, the statement "s" is replaced with the statement x=u.
By the above mentioned processing, it is possible to eliminate the common subexpression using the global data-flow analysis. FIG. 3C illustrates the result of the global common subexpression elimination made for the flow graph shown in FIG. 3A.
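The steps K1 to K4 can be illustrated with the rough sketch below. It rests on several assumptions that are not taken from the cited text: statements are (destination, right side) pairs with the right side held as a string, the flow graph is acyclic, the expression is already known to be available at the head of the block containing "s", and the names eliminate_common_subexpression and new_var are hypothetical.

```python
def eliminate_common_subexpression(blocks, preds, block_s, index_s, new_var):
    """Apply steps K1 to K4 to the statement blocks[block_s][index_s].

    blocks: dict mapping a block name to a list of (dst, rhs) statements
    preds:  dict mapping a block name to the list of its predecessors
    Assumes an acyclic flow graph and that the right side of the statement
    is available at the head of block_s.
    """
    dst_s, expr = blocks[block_s][index_s]

    # Step K1: scan edges in the reverse direction; do not scan beyond a
    # block that evaluates the expression.
    found, seen, work = [], set(), list(preds[block_s])
    while work:
        b = work.pop()
        if b in seen:
            continue
        seen.add(b)
        hits = [i for i, (_, rhs) in enumerate(blocks[b]) if rhs == expr]
        if hits:
            found.append((b, hits[-1]))      # last evaluation in that block
        else:
            work.extend(preds[b])

    # Step K2: new_var plays the role of the new variable "u".
    # Step K3: replace each detected "w = expr" with "u = expr; w = u".
    for b, i in found:
        w = blocks[b][i][0]
        blocks[b][i:i + 1] = [(new_var, expr), (w, new_var)]

    # Step K4: replace the statement "s" with "x = u".
    blocks[block_s][index_s] = (dst_s, new_var)
    return blocks


blocks = {
    "B1": [("c", "1")],
    "B2": [("w", "y+z")],
    "B3": [("v", "y+z")],
    "B4": [("x", "y+z")],
}
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
eliminate_common_subexpression(blocks, preds, "B4", 0, "u")
```

After the call, B2 and B3 each store y+z into u before copying it to their original destination, and B4 reads u instead of recomputing y+z. B1 is never scanned, because both paths leading back from B4 stop at a block that evaluates the expression.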
Here, consider the number of processing steps for the data-flow analysis in the above mentioned first conventional program transformation processing system and method. The whole intermediate file 4 must be read at least twice, once for the block division and once for the optimization after the data-flow analysis. In addition, the flow graph is prepared by tracing the flow between blocks, and the generated expression set and the killed expression set are sought for all the blocks in order to obtain the available expressions. Thereafter, the common subexpression elimination is performed on the basis of the obtained available expressions. This processing needs a great number of processing steps.
In order to perform the above mentioned processing, a memory region is required for temporarily storing not only the intermediate codes but also the data for the above mentioned generated expression set and killed expression set, which is generally of a bit vector type. In this bit vector data, a number "i" is allocated to each expression to be analyzed in the flow graph, and if the expression numbered "i" belongs to the set concerned, the bit at position "i" of the bit vector is set to "1". Accordingly, the data length of the bit vector for one block, for example for the generated expression set, is at least as many bits as the number of generated expressions in the block concerned. For the killed expression set and for the sets of available expressions at the entry and the exit of each block, data of the bit vector type is formed similarly.
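The bit vector representation can be illustrated as follows. This is only a sketch under assumptions: an ordinary Python integer stands in for the bit vector, and the helper names to_bit_vector, from_bit_vector and transfer, as well as the numbering of the three sample expressions, are introduced here for illustration.

```python
def to_bit_vector(expressions, number_of):
    """Set bit i for every expression whose allocated number is i."""
    bits = 0
    for e in expressions:
        bits |= 1 << number_of[e]
    return bits


def from_bit_vector(bits, number_of):
    """Recover the set of expressions whose bit is set to 1."""
    return {e for e, i in number_of.items() if (bits >> i) & 1}


def transfer(in_bits, gen_bits, kill_bits):
    """out = gen OR (in AND NOT kill), the set equation in bitwise form."""
    return gen_bits | (in_bits & ~kill_bits)


# Hypothetical numbering of the expressions to be analyzed.
number_of = {"x+y": 0, "a*b": 1, "c-d": 2}
in_bits = to_bit_vector({"x+y", "c-d"}, number_of)
gen_bits = to_bit_vector({"a*b"}, number_of)
kill_bits = to_bit_vector({"x+y"}, number_of)
out_bits = transfer(in_bits, gen_bits, kill_bits)
available_at_exit = from_bit_vector(out_bits, number_of)
```

With this encoding, set union, intersection and difference become single bitwise operations, but one vector per set and per block must be kept in memory, which is the storage cost discussed above.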
Even if the optimizable portion ultimately turns out to be an extremely small part of the source program, this cannot be predicted by the optimization processing part 6. Therefore, since the processing must be performed for all the blocks without exception, all the data to be processed is held in the memory region. As a result, the memory region is required to have a sufficient margin for storing this data.
In order to efficiently execute the common subexpression elimination, Japanese Patent Application Laid-open Publication JP-A-64-03737 has proposed a second conventional program transformation method, which is, however, directed to a C-language compiler. In this second conventional program transformation method, address calculations and subscript expressions are searched using an analysis tree, and if two common expressions giving the same operation result are found, one of the found common expressions is eliminated.
However, this method is limited to the address calculation and the subscript expression within the basic block, and no consideration is given to cases exceeding the basic block or to other general subexpressions.
As seen from the above, the above mentioned first conventional program transformation processing system and method require a great number of processing steps for the data-flow analysis, and therefore need a long processing time.
In addition, a large capacity memory region is required for temporarily storing not only the intermediate codes but also various data in the course of the processing, such as the generated expression set and killed expression set corresponding to the basic block and the set of available expressions.
If the memory region does not have a sufficient margin, the optimization becomes impossible in certain cases. In this case, the size of the generated object may become large, and the processing speed may become low.
On the other hand, the above mentioned second conventional program transformation processing system and method are disadvantageous in that the object of processing is limited to the address calculation and the subscript expression, and no consideration is given to cases exceeding the basic block or to other general subexpressions.