1. Field of the Invention
The present invention relates generally to an improved data processing system and in particular to a computer implemented method, apparatus, and computer usable program code for improving performance in a data processing system. Still more particularly, the present invention provides a computer implemented method, apparatus, and computer usable program code for simplifying compiler-generated software code.
2. Description of the Related Art
An analysis of the expression and data relationships in a computer program requires knowledge of the control flow of the computer program. Control flow is the sequence of execution of instructions in the computer program. Control flow is determined at run time by input data and by control structures used in the computer program, such as subroutine calls, loops, and “if” statements. A subroutine is a sequence of instructions for performing a particular task. Most programming languages allow the programmer to define subroutines, allow arguments to be passed to the subroutine, and allow one or more return values to be passed back. Conditional control structures, such as “if” statements, make the control flow more complicated to understand. One of the primary reasons for conducting a control flow analysis for software code in a compiler is to produce more efficient computer programs. Compiler optimization techniques are optimization techniques that have been programmed into a compiler. Control flow analysis for software code, which analyzes the flow of control through the software code, is an example of a compiler optimization technique that has been programmed into many compilers. Code is a set of instructions for a computer in some programming language. The word code is often used to distinguish instructions from data. Software code is computer instructions stored in volatile storage, in contrast to firmware instructions, which are stored in non-volatile storage. Software code includes both source code written by humans and executable machine code produced by assemblers or compilers.
Optimization techniques are generally automatically applied by the compiler whenever they are appropriate. Because programmers no longer need to manually apply these techniques, programmers are free to write source code in a straightforward manner, expressing their intentions clearly. Then, the compiler can choose a more efficient way to handle the implementation details.
Control flow analysis for software code can become problematic for computer programs written in some programming languages when analyzing the use of arrays. An array is a collection of identically typed data items distinguished by their indices, or subscripts. The number of dimensions an array can have depends on the programming language implementing the array, but the number of dimensions is usually unlimited. In some programming languages, an array pointer is an address that can point to a contiguous array, that is, an array stored in contiguous storage, or a non-contiguous array, that is, an array stored in non-contiguous storage. Contiguous storage is storage that is physically adjacent on a disk volume or in memory, whereas non-contiguous storage is storage that is not physically adjacent on a disk volume or in memory.
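The distinction between contiguous and non-contiguous arrays can be illustrated with a minimal sketch. The sketch assumes a one-dimensional array section described by a starting offset, an element count, and a stride; the helper names `element_offsets` and `is_contiguous` are hypothetical and are offered only for illustration, not as part of any particular compiler.

```python
def element_offsets(start, count, stride):
    """Return the storage offsets occupied by a 1-D array section.

    Hypothetical model: element i of the section lives at start + i * stride.
    """
    return [start + i * stride for i in range(count)]


def is_contiguous(offsets):
    """True when every element sits directly after the previous one."""
    return all(b - a == 1 for a, b in zip(offsets, offsets[1:]))


# A whole array of four elements occupies adjacent storage locations.
whole = element_offsets(start=0, count=4, stride=1)

# A section taking every other element skips storage locations.
every_other = element_offsets(start=0, count=4, stride=2)

print(whole, is_contiguous(whole))
print(every_other, is_contiguous(every_other))
```

In this model an array pointer may legitimately refer to either kind of section, which is why the contiguity of its target is not always evident from the pointer alone.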
Because an array pointer may be passed as an argument to a subroutine that has a dummy argument which expects a reference to contiguous storage, many compilers perform a gather/scatter operation. A subroutine requiring a gather/scatter operation is a subroutine that expects to receive a reference to contiguous storage but is called with an array pointer that points to non-contiguous storage. A gather/scatter operation is an operation in which a compiler copies an array to a contiguous temporary array prior to the subroutine call, and passes a reference to the copy in the contiguous temporary array as the actual argument to the subroutine. A contiguous temporary array is a contiguous array that a stub routine, a compiler, or a subroutine allocates for temporary use. After the call to the subroutine has completed, the compiler copies the contiguous temporary array back to the original array.
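The gather/scatter operation described above can be sketched as follows. This is a simplified illustration under stated assumptions: storage is modeled as a Python list, the array section is described by a hypothetical (start, count, stride) triple, and `scale_in_place` is a hypothetical subroutine that expects a contiguous buffer.

```python
def gather(storage, start, count, stride):
    """Copy a strided section into a new contiguous temporary array."""
    return [storage[start + i * stride] for i in range(count)]


def scatter(storage, start, stride, temp):
    """Copy the contiguous temporary array back into the strided section."""
    for i, value in enumerate(temp):
        storage[start + i * stride] = value


def scale_in_place(buf, factor):
    """Hypothetical subroutine that expects a contiguous buffer."""
    for i in range(len(buf)):
        buf[i] *= factor


def call_with_gather_scatter(storage, start, count, stride, factor):
    temp = gather(storage, start, count, stride)  # gather before the call
    scale_in_place(temp, factor)                  # callee sees contiguous data
    scatter(storage, start, stride, temp)         # scatter after the call


storage = [1, 2, 3, 4, 5, 6]
# Operate on every other element: offsets 0, 2, and 4.
call_with_gather_scatter(storage, start=0, count=3, stride=2, factor=10)
print(storage)  # [10, 2, 30, 4, 50, 6]
```

Each call copies every element of the section twice, once in each direction, regardless of whether the section was already contiguous.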
Such a gather/scatter operation is frequently expensive in terms of time and resources consumed. A gather/scatter operation may be superfluous in instances when the array is already contiguous. Some compilers eliminate the gather/scatter operation when the array is already contiguous, but a compiler cannot in general determine in advance if an array is contiguous. In order to determine if an array is contiguous, a compiler must insert inline code to check for the contiguity of an array at runtime. For example, the IBM® XL FORTRAN compiler and the Sun® FORTRAN compiler insert inline code to check for the storage contiguity of an array at runtime. An IBM® XL FORTRAN compiler is a product of International Business Machines Corporation, located in Armonk, N.Y. IBM is a trademark of International Business Machines Corporation in the United States, other countries, or both. A Sun® FORTRAN compiler is a product of Sun Microsystems, Inc., located in Santa Clara, Calif. Sun is a trademark of Sun Microsystems, Inc. in the United States, other countries, or both. These compilers are offered only as illustrative examples and not to imply any limitation for embodiments of the present invention.
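A runtime contiguity check of the kind described above might look like the following sketch. It is not the code emitted by any of the named compilers; the (start, count, stride) description of the array section and the function names are assumptions made for illustration. The hypothetical subroutine `scale_contiguous` receives a base position and a length, standing in for a reference to contiguous storage.

```python
def scale_contiguous(storage, base, count, factor):
    """Hypothetical subroutine: assumes the count elements starting
    at position base are adjacent in storage."""
    for i in range(base, base + count):
        storage[i] *= factor


def call_with_runtime_check(storage, start, count, stride, factor):
    """Call the subroutine, copying only when the section is not contiguous.

    Returns True if a gather/scatter copy was performed.
    """
    # Inline check: a section with unit stride, or with at most one
    # element, is already contiguous and can be passed directly.
    if stride == 1 or count <= 1:
        scale_contiguous(storage, start, count, factor)
        return False

    # Otherwise gather, call, and scatter.
    temp = [storage[start + i * stride] for i in range(count)]
    scale_contiguous(temp, 0, count, factor)
    for i, value in enumerate(temp):
        storage[start + i * stride] = value
    return True


a = [1, 2, 3, 4]
copied_a = call_with_runtime_check(a, start=0, count=4, stride=1, factor=2)

b = [1, 2, 3, 4, 5, 6]
copied_b = call_with_runtime_check(b, start=0, count=3, stride=2, factor=10)

print(a, copied_a)
print(b, copied_b)
```

Even in this one-dimensional sketch, the call site gains a conditional branch and two copy loops; for multi-dimensional arrays the check and the loops would be correspondingly more involved.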
The insertion of inline code to check the storage contiguity of an array at runtime is a common solution to the problem of superfluous gather/scatter operations. This runtime check, however, introduces other problems. For example, the inserted inline code may contain numerous conditional control structures, such as “if” statements, to determine if a gather/scatter operation is required. The inserted inline code may also contain complicated loop nests to execute a gather/scatter operation. The insertion of inline code may produce significant code bloat, which may increase both the time required for compilation and the amount of code compiled. Furthermore, the insertion of inline code may also negatively affect the runtime performance of the compiled code. Additionally, the insertion of inline code makes the control flow much more complicated to understand, which hinders the compiler optimizations used to generate more efficient code.