1. Technical Field
The present invention relates to a compiler, and more particularly to a method for eliminating or optimizing array range checks in a compiler. An array range check is a check of whether an array access in a program exceeds the range of the array.
2. Prior Art
Several methods for eliminating array range checks are known in the background art.
A first method checks, before a loop is entered, whether an array access in the loop can exceed the array range. (See "Elimination of Redundant Array Subscript Range Checks", Priyadarshan Kolte and Michael Wolfe, in Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pp. 270 to 278, June 1995, etc.)
Table 1, in which 0 is the lower bound of an array and N is its size, is modified to Table 2 as follows.
In the pseudocode of Table 1, 0 is assigned to each element of the array a. In the pseudocode of Table 2, an exception is raised if the condition of the if statement is fulfilled, since the array access may then exceed the array range; if the condition is not fulfilled, the loop is processed as in Table 1.
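The transformation described above may be sketched as follows. This is a minimal illustration only and is not a reproduction of Tables 1 and 2; the function names checked_store, fill_checked and fill_hoisted, and the loop bounds start and end, are assumptions introduced for the sketch.

```python
def checked_store(a, i, value):
    """Store a[i] = value with an explicit array range check."""
    if i < 0 or i >= len(a):
        raise IndexError(f"index {i} exceeds the array range")
    a[i] = value


def fill_checked(a, start, end):
    """Original loop: every access is range-checked."""
    for i in range(start, end + 1):
        checked_store(a, i, 0)


def fill_hoisted(a, start, end):
    """Transformed loop: one test before the loop replaces all checks."""
    n = len(a)
    # If the loop runs at all and either bound is outside 0..n-1,
    # some access would exceed the range, so raise before the loop.
    if start <= end and (start < 0 or end > n - 1):
        raise IndexError("loop would exceed the array range")
    for i in range(start, end + 1):
        a[i] = 0  # unchecked store: the test above guarantees 0 <= i <= n-1
```

In this sketch fill_hoisted raises before any element is written, whereas fill_checked writes the in-range elements first and raises only at the offending access.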
The advantage of this method is that every array range check in a loop can be eliminated when the upper and lower bounds of the array access in the loop are known with certainty. However, it has the following disadvantages as well. Namely, this method can only be applied to a language whose specification defines that it is an error to exceed a range. Moreover, it can only be applied when an array index in a loop changes monotonically. In addition, it cannot be applied when the ending condition of a loop cannot be moved out of the loop, for instance, when end in the above example is a global variable or the like and is changed within the loop itself or by another thread.
A second method divides a loop into three parts. (See "Optimizing Array Reference Checking in Java Programs", Samuel P. Midkiff, Jose E. Moreira, Mark Snir, IBM Research Report RC21184(94652), Computer Science/Mathematics, May 18, 1998, etc.)
This method divides a loop into three parts, namely a part not to be checked, a part for checking its lower bound, and a part for checking its upper bound. For instance, if the lower bound of an array is 0 and its size is N, Table 1 is modified to Table 3 as follows.
When the loop is divided into three in this way, range checks can be omitted in the second for loop. The basic idea of this method is similar to that of method (1). The advantage of this method is that every array range check in a loop can be eliminated when the upper and lower bounds of the array access in the loop are known with certainty. However, it can only be applied when an array index in a loop changes monotonically. In addition, it cannot be applied when the ending condition of a loop cannot be moved out of the loop, for instance, when end in the above example is a global variable or the like and is changed within the loop itself or by another thread. Furthermore, it requires special handling when applied to a large loop, since the code size becomes three times larger.
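The three-way division may be sketched as follows. This is an illustration under assumptions and is not a reproduction of Table 3; the names checked_store and fill_split and the bounds start and end are hypothetical.

```python
def checked_store(a, i, value):
    """Store a[i] = value with an explicit array range check."""
    if i < 0 or i >= len(a):
        raise IndexError(f"index {i} exceeds the array range")
    a[i] = value


def fill_split(a, start, end):
    """Assign 0 to a[start..end] using three loops."""
    n = len(a)
    # Part 1: indices below the lower bound; the check is kept.
    for i in range(start, min(max(start, 0) - 1, end) + 1):
        checked_store(a, i, 0)
    # Part 2: indices known to lie in 0..n-1; no check is needed.
    for i in range(max(start, 0), min(end, n - 1) + 1):
        a[i] = 0
    # Part 3: indices above the upper bound; the check is kept.
    for i in range(max(start, min(end, n - 1) + 1), end + 1):
        checked_store(a, i, 0)
```

Because the checked parts still run in index order, the elements written before an exception are the same as in the undivided loop in this sketch.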
A third method treats array accesses having the same array base and index values as already checked. (See the same reference as for method (1).)
If there is an array access a[i] which has already been checked, this method treats a[i] as already checked within the range of the program controlled from that access in which a and i keep the same values. Table 4 shows an example.
The advantage of this method is that it can be applied to places other than loops. However, it has the disadvantage that the range which can be determined as already checked is small.
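As a hedged illustration of this idea (not a reproduction of Table 4), the following sketch scans a straight-line instruction list and reports the accesses whose check repeats an earlier check on the same base and index. The instruction encoding and the name mark_redundant are assumptions made for the sketch.

```python
def mark_redundant(instrs):
    """Return positions of accesses whose range check is a repeat.

    instrs is a list of tuples (hypothetical encoding):
      ("access", base, index)  for an access base[index]
      ("assign", var)          for any assignment to var
    """
    checked = set()
    redundant = []
    for pos, ins in enumerate(instrs):
        if ins[0] == "access":
            key = (ins[1], ins[2])
            if key in checked:
                redundant.append(pos)  # same base and index already checked
            else:
                checked.add(key)
        else:
            # An assignment invalidates every fact that mentions the variable.
            var = ins[1]
            checked = {k for k in checked if var not in k}
    return redundant
```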
A fourth method eliminates array range checks by using the range of values of a variable. (See "Iterative Type Analysis and Extended Message Splitting", Craig Chambers, David Ungar, Optimizing Dynamically-Typed Object-Oriented Programs, etc.)
This method narrows down the range of a variable from information such as if statements, and eliminates array range checks by using that information. For instance, if the lower bound of an array is 0, an array access marked "no check required" in its comment field is one which this method determines to need no check.
The advantage of this method is that it can be applied to places other than loops. Even if the expression of an array index is complicated, as in method (1), it may be handled as already checked. However, in reality there are many cases in which the range of a variable cannot be narrowed down.
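The use of a value range may be sketched as follows, assuming that the range of an index variable has already been narrowed to a closed interval (for instance by an enclosing if statement) and that the index has the form variable plus constant. The function names are hypothetical.

```python
def lower_check_needed(var_range, offset, lower_bound=0):
    """True if a[var + offset] may fall below the array's lower bound."""
    lo, _hi = var_range
    return lo + offset < lower_bound


def upper_check_needed(var_range, offset, upper_bound):
    """True if a[var + offset] may exceed a known upper bound."""
    _lo, hi = var_range
    return hi + offset > upper_bound
```

For example, if i is known to lie in 3..5, the access a[i-1] needs no lower bound check when the lower bound is 0, while a[i-4] still does.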
A fifth method eliminates array range checks by using data-flow analysis. (See "Optimizing array bound checks using flow analysis", R. Gupta, ACM Letters on Programming Languages and Systems, 2(1-4), pp. 135 to 150, March-December 1993, etc.)
This method eliminates array range checks by the following two-phase process: (1) insert checks near the beginning, in program execution order, so as to decrease the number of array range checks; and (2) eliminate redundant array range checks.
The advantage of this method is that it can be applied to places other than loops. However, it has disadvantages, namely that the range in which it can eliminate array range checks is narrow, and that it can only be applied to a language whose specification defines that it is an error to exceed a range.
An object of the present invention is to eliminate redundant array range checks by collecting array range check information through data-flow analysis, etc. and moving the checks up. The redundant array range checks referred to here are those for an array access which is guaranteed not to exceed the array range because of a preceding array access.
In the Java language (Java is a trademark of Sun Microsystems), the specification requires that an exception occur when a range check at an array access fails. Since a program may be written so as to rely on the occurrence of this exception, the program will not run correctly unless array range checks are performed. Another object of the present invention is to allow more array range checks to be eliminated while coping with a language in which the occurrence of an exception may be used to write a program.
A further object of the present invention is to optimize an array range check by collecting array range check information through data-flow analysis, etc.
A still further object of the present invention is to perform a versioning for a loop by collecting array range check information on a predetermined condition.
To achieve the above-mentioned objects, this invention may be categorized into the following three parts. Namely, (A) a part to eliminate redundant array range checks by performing a versioning for a loop, (B) a part to optimize array range checks by performing data-flow analysis in reverse order of the program execution, and (C) a part to obtain information about array ranges already checked by performing data-flow analysis in program execution order and eliminate redundant array range checks from this information.
In (A), the following process is performed (FIG. 2 in the Embodiments). Namely, the following steps are executed in performing a versioning for a loop by using array range check information for an array access in a program: in each basic block, collecting and storing in a storage a first information about array range checks to be processed (C_GEN[B] in the Embodiments), in reverse order of the program execution according to a first condition (Table 8 in the Embodiments), wherein the first information is a set of array range checks; propagating the first information according to a second condition in order of a post-order traversal of a depth-first search (DFS) (backward(C_OUT[B], B) (Table 9) and a process using it (FIG. 3) in the Embodiments), and generating and storing in a storage a second information about array range checks to be processed (C_IN[B] in the Embodiments) at the beginning of each basic block; and by using the second information, generating and storing in a storage a check code for the versioning before the loop and execution codes for each execution state. The check code thus selects between two execution states, namely a loop without any array range check and a loop with array range checks, so processing becomes faster whenever execution shifts to the loop without any array range check.
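The effect of versioning may be sketched as follows for a simple loop that assigns 0 to a[start..end]. This is an illustration under assumptions, not the code generated in the Embodiments; the names checked_store and fill_versioned are hypothetical, and the guard is written by hand rather than derived from C_IN[B].

```python
def checked_store(a, i, value):
    """Store a[i] = value with an explicit array range check."""
    if i < 0 or i >= len(a):
        raise IndexError(f"index {i} exceeds the array range")
    a[i] = value


def fill_versioned(a, start, end):
    """Select between an unchecked and a checked version of the loop."""
    n = len(a)
    # Check code placed before the loop: either the loop does not run,
    # or every index it visits lies in 0..n-1.
    if start > end or (0 <= start and end <= n - 1):
        for i in range(start, end + 1):
            a[i] = 0  # version without array range checks
    else:
        for i in range(start, end + 1):
            checked_store(a, i, 0)  # version with array range checks
```

Since the checked version is retained, an exception is still raised at the same access as in the original loop, which is the property required for a language in which a program may rely on the exception.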
The above-mentioned first condition may include conditions in a basic block, namely (1) if an index variable of an array access is not modified, collecting array range check information for the array access as it is; and (2) if an index variable in an array range check is modified by adding a positive or negative constant, collecting array range check information after reflecting the modification caused by adding the constant to the index variable. The latter condition expands the range of array range checks which can be handled.
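The two conditions above may be sketched as follows. The sketch is not Table 8 itself; it assumes a simplified instruction encoding in which an array range check is represented by the triple (base, variable, constant offset), and the name collect_c_gen is hypothetical.

```python
def collect_c_gen(block):
    """Collect the checks required at the entry of one basic block.

    block lists instructions in execution order (hypothetical encoding):
      ("access", base, var, offset)  for an access base[var + offset]
      ("add", var, c)                for var = var + c, c a constant
      ("assign", var)                for any other assignment to var
    """
    c_gen = set()
    for ins in reversed(block):  # reverse order of program execution
        if ins[0] == "access":
            c_gen.add(ins[1:])  # condition (1): collect the check as it is
        elif ins[0] == "add":
            _, var, c = ins
            # Condition (2): seen from before "var = var + c", a check on
            # var + offset is a check on var + (offset + c).
            c_gen = {(b, v, off + c) if v == var else (b, v, off)
                     for (b, v, off) in c_gen}
        else:
            # Assumption: any other modification drops the affected checks.
            c_gen = {chk for chk in c_gen if ins[1] not in chk[:2]}
    return c_gen
```

For a block that accesses a[i], increments i by 1 and accesses a[i] again, the sketch yields checks for a[i] and a[i+1] at the beginning of the block.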
The above-mentioned second condition may include a condition of: calculating a sum set of the first information about array range checks to be processed in a certain basic block and a fourth information about array range checks to be processed, wherein fourth information is a third information (C_OUT[B] in the Embodiments) about array range checks to be processed at the end of the certain basic block after being modified according to a third condition (backward(C_OUT[B], B) in the Embodiments).
The above-mentioned third condition may include a condition of: if, in the certain basic block, an index variable in an array range check included in the third information about array range checks to be processed is modified by adding a positive or negative constant, reflecting the modification caused by adding the constant to the index variable on the array range check included in the third information.
It is also possible that the third information about array range checks to be processed is generated by using the second information about array range checks to be processed of every basic block immediately after the certain basic block and included in the same loop as that of the certain basic block.
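The second and third conditions of (A) may be sketched as follows for a single basic block. The encoding of checks as (base, variable, offset) triples and the function c_in_of are assumptions of this sketch. The merging of successor information by union in c_out_of is likewise an assumption, since this section does not state how the successors' information is combined.

```python
def backward(c_out, block):
    """Adjust checks at the end of a block to hold at its beginning."""
    checks = set(c_out)
    for ins in reversed(block):
        if ins[0] == "add":  # ("add", var, c): var = var + c
            _, var, c = ins
            checks = {(b, v, off + c) if v == var else (b, v, off)
                      for (b, v, off) in checks}
        elif ins[0] == "assign":  # ("assign", var): other modification
            checks = {chk for chk in checks if ins[1] not in chk[:2]}
    return checks


def c_out_of(successor_c_ins):
    """Assumed merge: union over successors in the same loop."""
    merged = set()
    for s in successor_c_ins:
        merged |= s
    return merged


def c_in_of(c_gen, c_out, block):
    """C_IN[B] as the sum set of C_GEN[B] and backward(C_OUT[B], B)."""
    return set(c_gen) | backward(c_out, block)
```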
The above-mentioned collecting and storing step, if described in more detail, comprises the steps of: checking, in reverse order of the program execution, an instruction in the basic block; if the check determines the instruction includes an array access, storing in a storage information concerning an array range check necessary for the array access; if the check determines the instruction includes a modification of an array index variable associated with the stored array range check, determining whether the modification is an addition of a positive or negative constant; if the modification is an addition of the constant, calculating a modification of the array range check which is caused by the addition of the constant to the array index variable in the array range check; and storing in a storage an array range check after reflecting the modification of the array range check. This is a process which was not handled in the method (5) in the background art.
In the case of (B) described below, the collecting and storing step may comprise the following step. Namely, if the check determines the instruction causes a side effect due to any exception which is caused by an array range check and occurs earlier than the instruction, discarding the array range check stored before the check. This is because, in the case of (B), array range checks cannot be handled across an instruction which causes a side effect.
Next, the case of (B) (FIG. 4 in the Embodiments) is described. Namely, the following steps are executed in the case of optimizing an array range check for an array access in a program: in each basic block, collecting and storing in a storage a first information about array range checks to be processed (C_GEN[B] in the Embodiments) in reverse order of the program execution according to a first condition (Table 12 in the Embodiments), wherein the first information is a set of array range checks; propagating the first information through a data-flow analysis of the program according to a second condition and by using information on whether each basic block includes a side effect instruction, that is, an instruction such that moving an exception-issuing array range check ahead of it would cause a side effect (backward(C_OUT[B], B), Table 13, and FIG. 5 in the Embodiments), and generating and storing in a storage a second information about array range checks to be processed (C_OUT[B] in the Embodiments) at the end of each basic block; and in each basic block, generating and storing in a storage codes for array range checks by following each instruction in reverse order of the program execution with modification of the second information according to a third condition (Table 14 in the Embodiments) and by using the second information. While this process itself does not eliminate any array range check, it can change array range checks into more desirable ones when used together with (A) or (C), or with both (A) and (C). It is also possible to combine it with a conventional technique.
The above-mentioned first condition may include conditions of, in a basic block: (1) if an index variable of an array access is not modified, collecting array range check information for the array access as it is; (2) if an index variable in an array range check is modified by adding a positive or negative constant, collecting array range check information after reflecting the modification caused by adding the constant to the index variable; and (3) if the basic block includes the side effect instruction, discarding array range check information collected in the basic block. The conditions of (2) and (3) were not previously taken up.
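The three conditions above may be sketched as follows. The sketch is not Table 12 itself; the instruction encoding, the "side_effect" marker and the name collect_c_gen_b are assumptions. In the reverse scan, condition (3) is applied at the point where the side effect instruction is met, so that the checks discarded are those which follow it in execution order.

```python
def collect_c_gen_b(block):
    """Collect checks that may be moved to the entry of a basic block.

    block lists instructions in execution order (hypothetical encoding):
      ("access", base, var, offset)  for an access base[var + offset]
      ("add", var, c)                for var = var + c, c a constant
      ("side_effect",)               for a side effect instruction
    """
    c_gen = set()
    for ins in reversed(block):  # reverse order of program execution
        if ins[0] == "access":
            c_gen.add(ins[1:])  # condition (1)
        elif ins[0] == "add":
            _, var, c = ins
            # Condition (2): reflect the added constant in the check.
            c_gen = {(b, v, off + c) if v == var else (b, v, off)
                     for (b, v, off) in c_gen}
        elif ins[0] == "side_effect":
            # Condition (3): checks located after the side effect must not
            # be moved ahead of it, so they are discarded.
            c_gen = set()
    return c_gen
```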
The above-mentioned second condition may include conditions of: (1) if a certain basic block is at the end of the program, or if the certain basic block is not at the end of the program and still includes the side effect instruction, propagating as information about array range checks to be processed at the beginning of the certain basic block the first information itself of the certain basic block; and (2) if the certain basic block is not at the end of the program and does not include the side effect instruction, propagating as the above information a sum set of a third information about array range checks to be processed and the first information of the certain basic block, wherein the third information is the second information of the certain basic block after being modified according to a fourth condition.
The above-mentioned third condition may include the conditions of: (1) if an index variable in an array range check is modified by adding a positive or negative constant, correcting to the array range check information after reflecting the modification caused by adding the constant to the index variable; and (2) if the basic block includes the side effect instruction, discarding array range check information collected in the basic block.
The above-mentioned generating and storing step may include the step of: if a range of an array range check for an array access is smaller than that of an array range check included in the second information, generating for the array access a code for the array range check included in the second information.
The above-mentioned fourth condition may include conditions of: if, in a certain basic block, an index variable in an array range check included in the second information is modified by adding a positive or negative constant, reflecting the modification caused by adding the constant to the index variable on the array range check included in the second information.
To describe in further detail the above-mentioned generating and storing step, in generating a code for an array range check to be inserted when optimizing an array range check in a program by using information about array range checks to be processed (C_OUT[B] in the Embodiments), wherein the information is a set of array range checks required for an array access and propagated to the end of each basic block, following steps are executed: checking, in reverse order of the program execution, an instruction in the basic block; if the check determines the instruction includes an array access, determining whether the range required for the array access is smaller than that of the array range check in the information; if it is determined to be smaller, generating a code corresponding to the array range check in the information; if the check determines the instruction includes a modification of an array index variable included in the information, determining whether the modification is an addition of a positive or negative constant to the array index variable; if the modification is an addition of the constant, storing the constant in a storage; if the modification is an addition of the constant, calculating a modification of the array range check which is caused by the addition of the constant to the index variable in the array range check; and storing in a storage the array range check after reflecting the calculated modification of the array range check. Information about array range checks to be processed includes the range to generate a code corresponding to the array range check, so a code for an optimum array range check is generated by transforming and using it.
It is possible to include the steps of: determining whether the instruction causes a side effect due to any exception caused by the array range check, wherein the exception occurs earlier than the instruction; and if the determination is true, discarding the information about array range checks to be processed. This corresponds to a case which cannot be handled by this invention.
In (C), the following process is performed (FIG. 7 in the Embodiments). Namely, to eliminate a redundant array range check of array range checks in a program, following steps are executed: in each basic block, collecting a first information about array range checks already processed (C_GEN[B] in the Embodiments), in program execution order according to a first condition (Table 16 in the Embodiments), wherein the first information is a set of array range checks; propagating the first information along a data-flow of the program according to a second condition (Table 17 and FIG. 8 in the Embodiments), and generating a second information about array range checks already processed (C_IN[B] in the Embodiments) at the beginning of each basic block; and in each basic block, eliminating an array range check by following each instruction in program execution order with modification of the second information according to a third condition (Table 18 in the Embodiments) and by using the second information. It eliminates redundant array range checks by using data-flow analysis.
The above-mentioned first condition may include conditions of, in a basic block: (1) if an index variable of an array access is not modified, collecting array range check information for the array access as it is; and (2) if an index variable in an array range check is modified by adding a positive or negative constant, collecting array range check information after reflecting the modification caused by subtracting the constant from the index variable. As it includes the case of (2), it has a wider range of array range checks to be eliminated. The above-mentioned first condition may further include a condition of: collecting a range defined by upper and lower bounds which can be handled as already checked as to a constant index from a minimum constant offset and a maximum constant offset of an array index in the array range check and a lower bound of the array. It further expands the range of elimination.
The above-mentioned first condition may include a condition of: collecting the range defined by upper and lower bounds which can be handled as already checked as to a constant index from a lower limit value or an upper limit value of an index variable in the array range check and a lower bound of the array.
The above-mentioned second condition may include conditions of: (1) if a certain basic block is at the beginning of a program, propagating as information about array range checks already processed at the end of the certain basic block a first information itself about array range checks already processed of the certain basic block; and (2) if the certain basic block is not at the beginning of the program, propagating as the above information a sum set of a third information about array range checks already processed and the first information of the certain basic block, wherein the third information is the second information of the certain basic block after being modified according to a fourth condition.
The above-mentioned third condition may include a condition of: if an index variable in an array range check is modified by adding a positive or negative constant, correcting to array range check information after reflecting the modification caused by subtracting the constant from the index variable.
The above-mentioned fourth condition may include a condition of: if, in the certain basic block, an index variable in an array range check included in the second information is modified by adding a positive or negative constant, reflecting the modification caused by subtracting the constant from the index variable on the array range check included in the second information.
The above-mentioned step of eliminating array range checks (Table 18 in the Embodiments) is described in further detail as follows. Namely, to eliminate a redundant array range check of array range checks in a program, in selecting an array range check to be eliminated by using information about array range checks already processed (C_IN[B] in the Embodiments), wherein the information is a set of array range checks for an array access propagated to the beginning of each basic block, following steps are executed: checking, in program execution order, an instruction in the basic block; if the check determines the instruction includes an array access, determining whether the range of an array range check required for the array access is covered by that of the array range check included in the information; if it is determined to be covered, selecting an array range check required for the array access; if the above check determines the instruction includes a modification of an index variable of an array range check included in the information, determining whether the modification is an addition of a positive or negative constant to the index variable; if the modification is an addition of the constant, storing the constant in a storage; if the modification is an addition of the constant, calculating a modification of the array range check which is caused by subtracting the constant from the index variable in the array range check; and storing in a storage the array range check after reflecting the calculated modification of the array range check.
This allows a wider range of array range checks to be eliminated.
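The elimination step in program execution order may be sketched as follows. The sketch is not Table 18 itself; it assumes that checks are encoded as (base, variable, offset) triples and that "covered" means an identical triple is already present, which is narrower than the coverage test described in this section. The name eliminate_forward is hypothetical.

```python
def eliminate_forward(c_in, block):
    """Return positions of accesses whose check can be eliminated.

    block lists instructions in execution order (hypothetical encoding):
      ("access", base, var, offset)  for an access base[var + offset]
      ("add", var, c)                for var = var + c, c a constant
      ("assign", var)                for any other assignment to var
    """
    checked = set(c_in)  # checks already processed at block entry
    eliminated = []
    for pos, ins in enumerate(block):  # program execution order
        if ins[0] == "access":
            if ins[1:] in checked:
                eliminated.append(pos)  # already checked: select for removal
            else:
                checked.add(ins[1:])  # this check now counts as processed
        elif ins[0] == "add":
            _, var, c = ins
            # After "var = var + c" the old value is var - c, so a checked
            # var + offset becomes var + (offset - c).
            checked = {(b, v, off - c) if v == var else (b, v, off)
                       for (b, v, off) in checked}
        else:
            checked = {chk for chk in checked if ins[1] not in chk[:2]}
    return eliminated
```

For a block that accesses a[i+1], increments i by 1 and then accesses a[i], the second access is selected for elimination in this sketch.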
The above-mentioned step of determining whether the range of an array range check is covered may include the steps of: checking if the index variables Ik (k=1, . . . n) are included as to array range checks with the same array base in the information about array range checks already processed; if it is determined that the index variable Ik for every k is included, determining whether the relation between constants L and n meets the predetermined condition; and if the relation between the constants L and n meets a predetermined condition, selecting an array range check of an array access whose array index is (I1+I2+. . . +In)/L. This allows a wider range of array range checks to be eliminated.
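The test for an index of the form (I1+I2+. . . +In)/L may be sketched as follows. This section leaves the predetermined condition on L and n unspecified; the sketch assumes, purely for illustration, that the condition is L equal to n, or L not less than n when the lower bound of the array is 0. The function name is hypothetical.

```python
def average_index_covered(checked_vars, index_vars, divisor, lower_bound=0):
    """Decide whether a[(I1 + ... + In) / L] may be treated as checked.

    checked_vars: index variables already checked for the same array base
    index_vars:   the variables I1..In appearing in the sum
    divisor:      the constant L
    """
    n = len(index_vars)
    if n == 0 or not all(v in checked_vars for v in index_vars):
        return False
    if divisor == n:
        return True  # a mean of in-range values is itself in range
    # Assumed extra case: with lower bound 0 a larger divisor only moves
    # the quotient toward 0, which is still in range.
    return lower_bound == 0 and divisor >= n
```

For instance, once a[i] and a[j] have been checked, the midpoint access a[(i+j)/2] is treated as already checked under these assumptions.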
The step of determining whether the range of an array range check is covered may include the steps of: checking if a constant which has a value obtained by subtracting 1 from the absolute value of constant N is included as to array range checks on constant indexes in the information about array range checks already processed; if the constant is included and the information substantially includes the constant 0, determining whether A of the array index (A mod N) is positive; and if A is positive, selecting an array range check of an array access which includes the array index (A mod N). This allows a wider range of array range checks to be eliminated.
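The test for an index of the form (A mod N) may be sketched as follows. The sketch assumes a remainder operation whose result has the sign of A, as in Java, so that the result lies between 0 and the absolute value of N minus 1 when A is positive. It also interprets "substantially includes the constant 0" as either the lower bound being 0 or the constant index 0 having been checked; that interpretation and the function name are assumptions.

```python
def mod_index_covered(checked_constants, n, a_is_positive, lower_bound=0):
    """Decide whether a[A mod N] may be treated as already checked.

    checked_constants: constant indexes already checked for this array
    n:                 the constant N
    a_is_positive:     whether A is known to be positive
    """
    if n == 0:
        return False
    top_checked = (abs(n) - 1) in checked_constants
    zero_checked = lower_bound == 0 or 0 in checked_constants
    return top_checked and zero_checked and a_is_positive
```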
The collecting and storing steps of (A) and (B) and a part of processing in a second information about array range checks to be processed can be considered as processing for moving in opposite direction of the program execution an array range check to check that an array access in a program is not exceeding the array range, and to modify the array range check in this case, following steps are executed: determining whether the array range check has to move beyond the process of adding a positive or negative constant to an index variable of the array and storing the constant in a storage; if the determination is true, calculating the modification of the array range check caused by adding the constant to the index variable in the array range check; and storing in a storage the array range check after reflecting the calculated modification of the array range check.
Furthermore, the collecting steps of (C) and a part of processing in a second information can be considered as processing for moving in program execution direction an array range check to check that an array access in a program is not exceeding the array range, and to modify the array range check in this case, following steps are executed: determining whether the array range check has to move beyond the process of adding a positive or negative constant to an index variable of the array and storing the constant in a storage; if the determination is true, calculating the modification of the array range check caused by subtracting the constant from the index variable in the array range check; and storing in a storage the array range check after reflecting the calculated modification of the array range check.
To describe in further detail the characteristic processing in the collecting steps of (C), it is a process of collecting in a basic block of a program array range checks to check that an array access in the program is not exceeding the array range which can be handled as already checked. The process may comprise the steps of: detecting an array range check; storing in a storage the detected array range check; calculating and storing upper and lower bounds which can be handled as already checked as to a constant index from a minimum constant offset and a maximum constant offset of an array index in the detected array range check and a lower bound of the array; and storing in a storage the array range check on the range defined by the calculated upper and lower bounds.
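One way to read the calculation of the constant-index range is sketched below. It assumes that the lower bound check for the minimum offset and the upper bound check for the maximum offset have both passed, from which it follows that the array holds at least (maximum offset - minimum offset + 1) elements. The formula and the name constant_range are a derivation made for this sketch and are not quoted from the Embodiments.

```python
def constant_range(offsets, lower_bound=0):
    """Range of constant indexes that may be treated as already checked.

    offsets: constant offsets c of checked accesses a[i + c] on one index
             variable i, all of which have passed their range checks.
    Returns (low, high) such that a[low..high] needs no further check.
    """
    lo_off, hi_off = min(offsets), max(offsets)
    # lower_bound <= i + lo_off and i + hi_off <= upper bound together give
    # upper bound >= lower_bound + (hi_off - lo_off).
    return (lower_bound, lower_bound + (hi_off - lo_off))
```

For example, after a[i-1] and a[i+1] have been checked for an array whose lower bound is 0, the constant indexes 0 to 2 can be handled as already checked.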
To describe in further detail the characteristic processing in the array range check elimination steps of (C), it is a process of determining whether an array range check to check that the array access in a program is not exceeding the array range can be handled as already checked. The process may comprise the steps of: storing in a storage an array range check determined as already checked; checking if the index variables Ik (k=1, . . . n) are stored in a storage as to array range checks determined as already checked and having the same array base; determining whether the relation between constants L and n meets a predetermined condition; and if it is determined that the index variables Ik are stored for every k and the relation between the constants L and n meets the predetermined condition, storing in a storage an array access whose array index is (I1+I2+. . . +In)/L as already checked.
It is also possible, in the same process, to further execute the steps of: storing in a storage an array range check determined as already checked; checking if a constant which has a value obtained by subtracting 1 from the absolute value of the constant N is stored in a storage as to array range checks determined as already checked and on constant indexes; determining whether A of the array index (A mod N) is positive; and if the constant is stored in a storage and the constant 0 is substantially already checked (including either case of the lower bound of the array index being 0 or not) and the A is positive, storing in a storage the array index (A mod N) as already checked.
As above, the present invention has been described as a flow of processing, but it is also possible to implement it by a computer or a computer program, or by a dedicated circuit or device which executes the above process. In the case of implementing it by a computer program, the computer program may be stored on a storage medium such as a CD-ROM, a floppy disk or a hard disk.