1. Field of the Invention
The present invention generally relates to computer programming and, more particularly, to a method for a compiler, programmer or run-time system to transform a program so as to reduce the overhead of determining if an invalid reference to an array element is attempted, while strictly preserving the original semantics of the program.
2. Background Description
The present invention is a method for a compiler, program translation system, run-time system or a programmer to transform a program or stream of instructions in some machine language so as to minimize the number of array reference tests that must be performed while maintaining the exact semantics of the program that existed before the transformation.
A single-dimensional array A is defined by a lower bound lo(A) and an upper bound up(A). A[lo(A):up(A)] represents an array with (up(A)−lo(A)+1) elements. An array element reference is denoted by A[σ], where σ is an index into the array, also called a subscript expression. This subscript expression may be a constant value at run-time, or it may be computed by evaluating an expression which has both constant and variable terms. For the reference A[σ] to be valid, A must represent an existing array, and σ must be within the valid range of indices for A: lo(A), lo(A)+1, . . . , up(A). If A does not represent an existing array, we say that A is null. If σ is not within the valid range of indices for A, we say that σ is out of bounds. The purpose of array reference tests is to guarantee that all array references are valid. If an array reference is not valid we say that it produces an array reference violation. We can define lo(A)=0 and up(A)=−1 for null arrays. In this case, out of bounds tests cover all violations. Array references typically (but not exclusively) occur in the body of loops. The loop index variable is often used in subscript expressions within the body of the loop.
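As a minimal sketch of these definitions, the following Python fragment models a single array reference test. The names `ArrayReferenceViolation`, `bounds` and `checked_ref` are hypothetical and do not come from the method itself; a null array is modeled as `None`, and lo(A) is assumed to be 0 for every existing array.

```python
class ArrayReferenceViolation(Exception):
    """Hypothetical exception raised for an invalid array reference."""


def bounds(a):
    """Return (lo(A), up(A)); a null array (None) gets lo=0 and up=-1."""
    if a is None:
        return 0, -1
    return 0, len(a) - 1


def checked_ref(a, subscript):
    """Return A[subscript] after an explicit out of bounds test."""
    lo, up = bounds(a)
    # Because a null array has the empty range [0, -1], this single
    # out of bounds test also reports references through a null array.
    if subscript < lo or subscript > up:
        raise ArrayReferenceViolation(
            f"subscript {subscript} is not in [{lo}, {up}]")
    return a[subscript]
```

The explicit lower bound test matters in this sketch because Python would otherwise accept a negative subscript silently.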
The goal of our method and its variants is to produce a program, or segment of one, in which all array reference violations are detected by explicit array reference tests. This is achieved in an efficient manner, performing a reduced number of tests. The ability to perform array reference tests in a program is important for at least three reasons:
1. Accesses to array elements outside the range of valid indices for the array have been used in numerous attacks on computer systems. See S. Garfinkel and G. Spafford, Practical Unix and Internet Security, O'Reilly and Associates (1996), and D. Dean, E. Felten and D. Wallach, "Java security: From HotJava to Netscape and beyond", Proceedings of the 1996 IEEE Symposium on Security and Privacy, May 1996.
2. Accesses to array elements outside the range of valid indices for the array are a rich source of programming errors. Often the error does not exhibit itself until long after the invalid reference, making correction of the error a time-consuming and expensive process. The error may never flagrantly exhibit itself, leading to subtle and dangerous errors in computed results.
3. Detection of array reference violations is mandated by the semantics of some programming languages, such as Java™. (Java is a trademark of Sun Microsystems, Inc.)
Naively checking every array reference that occurs has an adverse effect on the execution time of the program. Thus, programs are often run with tests disabled. Therefore, to fully realize the benefits of array reference tests it is necessary that they be done efficiently.
Prior art for detecting an out of bounds array reference or a memory reference through an invalid memory address can be found in U.S. Pat. No. 5,535,329, U.S. Pat. No. 5,335,344, U.S. Pat. No. 5,613,063, and U.S. Pat. No. 5,644,709. These patents give methods for performing bounds tests in programs, but do not give methods for a programmer, a compiler or translation system for a programming language, or a run-time system to reduce the number of tests.
Prior art exists for the narrow goal of simply reducing the run-time overhead of array bounds testing, while changing the semantics of the program in the case where an error occurs. See P. Cousot and R. Cousot, "Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints", Conference Record of the 4th ACM Symposium on Principles of Programming Languages, pp. 238-252, January 1977; W. H. Harrison, "Compiler analysis for the value ranges for variables", IEEE Transactions on Software Engineering, SE3(3):243-250, May 1977; P. Cousot and N. Halbwachs, "Automatic discovery of linear restraints among variables in a program", Conference Record of the 5th ACM Symposium on Principles of Programming Languages, pp. 84-96, January 1978; P. Cousot and N. Halbwachs, "Automatic proofs of the absence of common runtime errors", Conference Record of the 5th ACM Symposium on Principles of Programming Languages, pp. 105-118, January 1978; B. Schwarz, W. Kirchgassner and R. Landwehr, "An optimizer for Ada - design, experience and results", Proceedings of the ACM SIGPLAN '88 Conference on Programming Language Design and Implementation, pp. 175-185, June 1988; V. Markstein, J. Cocke and P. Markstein, "Elimination of redundant array subscript range checks", Proceedings of the ACM SIGPLAN '82 Conference on Programming Language Design and Implementation, pp. 114-119, June 1982; R. Gupta, "A fresh look at optimizing array bounds checking", Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pp. 272-282, June 1990; P. Kolte and M. Wolfe, "Elimination of redundant array subscript range checks", Proceedings of the ACM SIGPLAN '95 Conference on Programming Language Design and Implementation, pp. 270-278, June 1995; J. M. Asuru, "Optimization of array subscript range checks", ACM Letters on Programming Languages and Systems, 1(2):109-118, June 1992; R. Gupta, "Optimizing array bounds checks using flow analysis", ACM Letters on Programming Languages and Systems, 1-4(2):135-150, March-December 1993; and U.S. Pat. No. 4,642,765.
These approaches fall into two major groups. In the first group, P. Cousot and R. Cousot, W. H. Harrison, P. Cousot and N. Halbwachs (both citations), and B. Schwarz et al., analysis is performed on the program to determine that a reference A2[σ2] in a program will lead to an out of bounds array reference only if a previous reference A1[σ1] also leads to an out of bounds array reference. Thus, if the program is assumed to terminate at the first out of bounds array reference, only reference A1[σ1] needs to be checked, since reference A2[σ2] will never actually perform an out of bounds array reference. These techniques are complementary to our method. That is, they can be used in conjunction with our method to provide some benefit, but are not necessary for our method to work, or for our method to provide its benefit.
In the second group, V. Markstein et al., R. Gupta (both citations), P. Kolte et al., J. M. Asuru, and U.S. Pat. No. 4,642,765, bounds tests for array references within loops of a program are optimized. Consider an array reference of the form A[σ] where the subscript expression σ is a linear function in a loop induction variable. Furthermore, the value of σ can be computed prior to the execution of each iteration. A statement is inserted to test this value against the bounds of the array prior to the execution of each iteration. The statement raises an exception if the value of σ in a reference A[σ] is less than lo(A), or is greater than up(A). The transformations in this group can also use techniques from the first group to reduce the complexity of the resulting tests.
The first group has two weaknesses. First, at least one test must remain in the body of any loop whose index variable is used to subscript an array reference. In general, since inner-most loops index arrays, and since the number of iterations greatly exceeds the number of operations within a single iteration, the overhead of bounds testing is linearly proportional to the running time of the program, albeit with a smaller constant term than in the unoptimized program (which tests all references). Second, the methods as described do not work in loops with constructs such as Java's "try/catch" block. If the methods are extended to work with these constructs, the scope over which redundant tests can be found will be reduced, and in general the effectiveness of the transformation will be reduced.
The second group overcomes these weaknesses, but at the expense of no longer throwing the exception at the precise point in program execution at which the invalid reference occurs. For example, the exception for an out of bounds reference that occurs in an iteration of a loop is thrown, after the transformation, before the iteration begins executing. This can make debugging the cause of an invalid access more difficult. (Diagnostic code within the program set up to give debugging information might not execute after the transformation.) Also, for certain programming languages like Java, the resulting program almost certainly does not preserve the original semantics.
Finally, none of the methods in the two groups are thread safe. The methods of the two groups have no concept of actions occurring in other threads that may be changing the size of data objects in the thread whose code is being transformed. Thus, in programs which have multiple threads, the transformations may not catch some violations, and at the same time erroneously detect some non-existent violations.
The methods we describe overcome all of these objections. When our methods are applied to programs amenable to the techniques of the first group, the number of tests that do not result in detecting a violation is less than linearly proportional to the running time of the program. Our methods also handle "try/catch" constructs. Our methods detect when an out of bounds reference is to occur immediately before the reference occurs in the original program semantics. Thus, the state of the program as reflected in external persistent data structures or observable via a debugger is identical to the state in the original program. Our transformations are also thread safe and efficient to execute. Moreover, they expose safe regions of code, that is, code that is guaranteed not to perform an invalid reference, to more aggressive compiler optimizations.
Next, we introduce some notation and concepts used in describing the preferred embodiment. We discuss the issues of multi-dimensional arrays and arrays of arrays. We then present our notation for loops, loop bodies, sections of straight-line code, and array references. Finally, we discuss some issues related to loops.
Arrays can be multi-dimensional. A d-dimensional array has d axes, and each axis can be treated as a single-dimensional array. Without loss of generality, the indexing operations along each axis can be treated independently. A particular case of multi-dimensional arrays are rectangular arrays. A rectangular array has uncoupled bounds along each axis. Some programming languages allow ragged arrays, in which the bounds for one axis depend on the index variable for another axis. Ragged arrays are usually implemented as arrays of arrays, instead of true multi-dimensional arrays. For arrays of arrays, each indexing operation can also be treated independently.
A body of code is represented by the letter B. We indicate that a body of code B contains expressions on variables i and j with the notation B(i,j). A body of code can be either a section of straight line code, indicated by S, a loop, indicated by L, or a sequence of these components.
Let L(i,l,u,B(i)) be a loop on index variable i. The range of values for i is [l,l+1, . . . ,u]. If l > u, then the loop is empty (zero iterations). The body of the loop is B(i). This corresponds to the following code:
do i=l,u
B(i)
end do,
which we call a do-loop. Let the body B(i) of the loop contain array references of the form A[σ], where A is a single-dimensional array or an axis of a multi-dimensional array. In general, σ is a function of the loop index: σ=σ(i). If the body B(i) contains ρ references of the form A[σ], we label them A1[σ1], A2[σ2], . . . , Aρ[σρ].
In the discussion of the preferred embodiment, all loops have a unit stride. Because loops can be normalized, this is not a restriction. Normalization of a loop produces a loop whose iteration space has a stride of "1". Loops with positive nonunity strides can be normalized by transforming
do i=l_i,u_i,s_i
B(i)
end do
into
do i=0,⌊(u_i−l_i)/s_i⌋
B(l_i+i·s_i)
end do
A loop with a negative stride can be first transformed into a loop with a positive stride by transforming
do i=u_i,l_i,−s_i
B(i)
end do
into
do i=l_i,u_i,s_i
B(u_i+l_i−i)
end do
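The two loop transformations can be checked with a short Python sketch that lists the index values seen by the loop body before and after each transformation. The function names are hypothetical, the stride s is assumed to be a positive integer, and Python's `//` operator is used for the floor in the normalized upper bound.

```python
def strided_indices(l, u, s):
    """Values passed to B by 'do i=l,u,s' with a positive stride s."""
    return list(range(l, u + 1, s))


def normalized_indices(l, u, s):
    """Values passed to B after normalization to a unit-stride loop."""
    last = (u - l) // s                    # floor((u - l) / s)
    return [l + i * s for i in range(0, last + 1)]


def negative_stride_indices(l, u, s):
    """Values passed to B by 'do i=u,l,-s'."""
    return list(range(u, l - 1, -s))


def reversed_indices(l, u, s):
    """Values passed to B after conversion to the positive stride s."""
    return [u + l - i for i in range(l, u + 1, s)]
```

For example, with l=3, u=20 and s=4, both the original and the normalized loop pass 3, 7, 11, 15 and 19 to the body, in that order.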
Loops are often nested within other loops. The nesting can be either perfect, as in
do i=l_i,u_i
  do j=l_j,u_j
    B(i,j)
  end do
end do,
where all the computation is performed in the body of the inner loop, or not perfect, where multiple bodies can be identified. For example,
do i=l_i,u_i
  do j=l_j,u_j
    B1(i,j)
  end do
  do k=l_k,u_k
    B2(i,k)
  end do
end do,
has bodies B1(i,j) and B2(i,k) where computation is performed.
Finally, we note that standard control-flow and data-flow techniques (see S. Muchnick, Advanced Compiler Design and Implementation, Morgan Kaufmann Publishers, 1997) can be used to recognize many "for", "while", and "do-while" loops, which occur in Java and C, as do-loops. Many "go to" loops, occurring in C and Fortran, can be recognized as do-loops as well.
The present invention provides a method for reducing the number of array reference tests performed during the execution of a program while detecting all invalid references and maintaining the same semantics as the original program. The invention describes several methods for providing this functionality. The general methodology of all variants is to determine regions of a program execution that do not need any explicit tests, and other regions that do need tests. These regions may be lexically identical, and possibly generated only at run-time. (This is the case of loop iterations, some of which may need tests, and some of which may not.) The different methods and variants then describe how to generate code consisting both of sections with explicit tests and sections with no explicit tests. The regions of the program that are guaranteed not to cause violations execute the sections of code without tests. The regions of the program that can cause violations execute the sections of code with at least enough tests to detect the violation. The methods and variants differ in the number of sections that need to be generated, the types of tests that are created in the sections with tests, and the structure of the program amenable to the particular method.
In the most general form, an inspector examines the run-time instruction stream. If an instruction causes an array reference violation, an appropriate test instruction and the original instruction are sent to an executor for execution. The inspector may translate the instructions to a form more suitable for the executor.
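The inspector and executor roles can be illustrated with a small Python sketch. The instruction format, a tuple of operation, array name and subscript, and the names `inspect` and `execute` are assumptions made only for this illustration; every array is assumed to exist and to have lower bound 0.

```python
def inspect(stream, arrays):
    """Pass instructions to the executor, adding a test where needed."""
    for op, name, subscript in stream:
        a = arrays[name]
        if not 0 <= subscript < len(a):
            # This reference would be a violation, so an explicit test
            # instruction is sent immediately before the original one.
            yield ("test", name, subscript)
        yield (op, name, subscript)


def execute(instructions, arrays, out):
    """Execute instructions in order, appending loaded values to out."""
    for op, name, subscript in instructions:
        a = arrays[name]
        if op == "test":
            if not 0 <= subscript < len(a):
                raise IndexError(f"{name}[{subscript}] is out of bounds")
        elif op == "load":
            out.append(a[subscript])
```

In this sketch the executor carries no tests for valid references, and the exception is raised only after all earlier instructions have taken effect.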
The first major variant on the method works by transforming loops in the program. It essentially implements the inspector through program transformations. The transformation can either be performed at the source level by a programmer or preprocessor, or in an intermediate form of the program by a compiler or other automatic translator. For each loop, up to 5^ρ versions of the loop body are formed, where ρ is the number of array references in the loop that are subscripted by the loop control variable. Each version implements a different combination of array reference tests. A driver loop dynamically selects which version of the loop to use for each iteration. A variant of this method uses compile-time analysis to recognize that some of the versions of the loop will never execute, and therefore these versions of the loop need not be instantiated in the program. Loop nests are handled by recursively applying the transformation to each loop in the nest.
The second major variant on the method also works by transforming loops in the program. The transformation can either be performed at the source level by a programmer or preprocessor, or in an intermediate form of the program by a compiler or other automatic translator. The iteration space of the loop to be transformed is divided into three regions, each defined by one of the following properties:
1. all iterations in which every array reference using the loop control variable is valid;
2. all iterations whose value of the loop control variable is less than the values contained in the first region; and
3. all iterations whose value of the loop control variable is greater than the values contained in the first region.
We call the first region the safe region, and it is guaranteed not to generate a violation on those array references involving the loop control variable. The other two regions are unsafe as they may generate violations in those references. Appropriate versions (sections) of code are generated to implement each region. Tests are generated only for code sections implementing unsafe regions of the iteration space. The exact definition of the regions and the form of the code sections implementing them depend on many implementation options. In particular, it is possible to implement this method with only two distinct versions of code. Loop nests are handled by recursively applying the transformation to each loop in the nest. In some situations, it is possible to then hoist and coalesce generated code using standard techniques.
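The division into a safe region and two unsafe regions can be sketched in Python for one simple case. The sketch assumes a single reference A[c*i + d] with an integer coefficient c > 0, an array with lower bound 0 whose length does not change while the loop runs, and a loop body that only reads the element; the function names are hypothetical.

```python
def checked_loop(a, l, u, c, d, out):
    """Version with tests: test every reference A[c*i + d], i = l..u."""
    for i in range(l, u + 1):
        s = c * i + d
        if s < 0 or s >= len(a):
            raise IndexError(f"A[{s}] is out of bounds at i={i}")
        out.append(a[s])


def split_regions(n, l, u, c, d):
    """Return inclusive (lower, safe, upper) ranges of i; requires c > 0."""
    safe_lo = max(l, -(d // c))            # ceil(-d / c)
    safe_hi = min(u, (n - 1 - d) // c)     # floor((n - 1 - d) / c)
    lower = (l, min(u, safe_lo - 1))
    safe = (safe_lo, safe_hi)
    upper = (max(safe_lo, safe_hi + 1), u)
    return lower, safe, upper


def regioned_loop(a, l, u, c, d, out):
    """Transformed loop: tests remain only in the two unsafe regions."""
    lower, safe, upper = split_regions(len(a), l, u, c, d)
    checked_loop(a, lower[0], lower[1], c, d, out)    # unsafe, with tests
    for i in range(safe[0], safe[1] + 1):             # safe, no tests
        out.append(a[c * i + d])
    checked_loop(a, upper[0], upper[1], c, d, out)    # unsafe, with tests
```

Because the regions are executed in the original iteration order, a violation in this sketch is still raised at the same iteration, and after the same earlier effects, as in the fully tested loop.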
The third major variant of the method is more selective in applying the transformations of the second variant to loop nests. The transformations are applied always in outer to inner loop order. Within a loop that has been divided into regions, they are applied only to the regions with no tests.
The fourth major variant on the method works by transforming loops in the program. The transformation can either be performed at the source level by a programmer or preprocessor, or in an intermediate form of the program by a compiler or other automatic translator. This method can be applied to loop nests where each loop is comprised of
1. a possibly empty section of straight-line code;
2. a possibly empty loop; and
3. a possibly empty section of straight-line code.
The loop nest is divided into multiple regions, where each region either (1) has no invalid array references that use a loop control variable, or (2) has one or more invalid array references that use a loop control variable. A section of code with no array reference tests is created to implement regions of type (1). Another section of code with all necessary array reference tests is created to implement regions of type (2). A driver loop steps through the regions of the iteration space of the original loop, executing code in the generated sections as appropriate.
The fifth major variant extends the concept of versions of code to any sequence of instructions. Given a set of array references in a program, two versions of code are generated: (1) one version precedes the execution of each array reference with all the necessary tests to detect any violation, whereas (2) the other version performs no tests before array references. If any violations may occur during the execution of the set of references, then version (1) is executed. Otherwise, version (2) is executed.
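A minimal Python sketch of this two-version scheme follows. The decision of whether a violation may occur is made here by scanning the subscripts at run-time, which is only a stand-in for whatever analysis an implementation would use; the function names are hypothetical and the array is assumed to have lower bound 0.

```python
def with_tests(a, subscripts, out):
    """Version (1): a test precedes every array reference."""
    for s in subscripts:
        if s < 0 or s >= len(a):
            raise IndexError(f"A[{s}] is out of bounds")
        out.append(a[s])


def without_tests(a, subscripts, out):
    """Version (2): no tests before array references."""
    for s in subscripts:
        out.append(a[s])


def run_versions(a, subscripts, out):
    """Execute version (1) if any violation may occur, else version (2)."""
    may_violate = bool(subscripts) and (
        min(subscripts) < 0 or max(subscripts) >= len(a))
    if may_violate:
        with_tests(a, subscripts, out)
    else:
        without_tests(a, subscripts, out)
```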
Finally, the sixth major variant introduces the use of speculative execution to allow optimizations to be performed on code in which array reference tests are necessary. Given a set of array references in a program, two versions of code are generated: (1) a version which allows optimizations that do not preserve the state of the program when violations occur, and (2) a version which does not allow these optimizations. The first version is executed first, and its results are written to temporary storage. If no violations occur, then the results of the computation are saved to permanent storage. If array reference violations do occur, then the computation is performed using the second version. Therefore, the state of the program at the time of the violation is precisely preserved.
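The speculative scheme can be sketched in Python as follows. The loop body (doubling a gathered element), the reversed iteration order standing in for an optimization, and all function names are assumptions chosen for illustration; `dst` is assumed to be at least as long as `subscripts`, and `src` is assumed to have lower bound 0.

```python
def precise_version(dst, src, subscripts):
    """Version (2): test before each reference, original iteration order."""
    for i, s in enumerate(subscripts):
        if s < 0 or s >= len(src):
            raise IndexError(f"src[{s}] is out of bounds at i={i}")
        dst[i] = 2 * src[s]


def speculative_version(dst, src, subscripts):
    """Version (1): reordered iterations that write to temporary storage.

    Returns the temporary results, or None if a violation was seen.
    """
    temp = list(dst)                               # temporary storage
    for i in reversed(range(len(subscripts))):     # stand-in optimization
        s = subscripts[i]
        if s < 0 or s >= len(src):
            return None                            # discard the results
        temp[i] = 2 * src[s]
    return temp


def run(dst, src, subscripts):
    """Execute version (1) first and fall back to version (2)."""
    temp = speculative_version(dst, src, subscripts)
    if temp is not None:
        dst[:] = temp                              # save to permanent storage
    else:
        precise_version(dst, src, subscripts)      # recompute precisely
```

When a violation occurs, `dst` in this sketch holds exactly the updates made by the iterations that precede the violating one in the original order.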