1. Field of the Invention
The present invention generally relates to data processing systems. In particular, methods and systems in accordance with the present invention generally relate to optimization of executable code.
2. Background
Computers are increasingly important in today's society, and software used to control computers is typically written in a programming language. C, C++ and other similar variations are widely used programming languages. The programming language C is described, for example, in detail in Al Kelley et al., “A Book on C,” Addison-Wesley, 1997, which is incorporated herein by reference. In developing software, typically a software developer writes code, referred to as “source code,” in a programming language, and the source code is compiled by a compiler into “object code” that can be run by a machine. This code is also referred to as executable code.
Although it may be easy to generate executable code that works, it may be difficult to generate executable code that operates optimally. Code that produces correct results may do so inefficiently, use too many system resources, or take too long to execute. Generating optimal executable code from given source code is generally a difficult problem. Optimal execution may refer, for example, to obtaining peak speed, highest scalability of a parallel program, shortest time to solution, sharpest intervals in the result, smallest memory footprint, most efficient use of parallel processors, etc. Optimal execution may also refer to other performance areas.
Although the problem of optimization may be generally difficult, executable code generated directly from a specification such as a source code file, intermediate representation, or object file can often be substantially improved by compilers, binary optimizers, and other tools that produce executable code. The process of attempting to generate code approaching an optimal solution may be referred to as optimization. Optimization is also made difficult by the absence of information required for a true optimum to be reached. Such information may include, for example, the probability with which a particular decision will be made one way or another, the frequency with which particular data will be used, the size of a particular data set, and many other factors that affect optimization.
One solution to the problem of insufficient information, referred to as “profile feedback,” is to create an executable program that contains “instrumentation” and then recompile with the results of the instrumentation. The instrumentation may be additional code that gathers some of the information useful in optimization. For example, given a decision point for which it is useful to know the probability with which particular decisions are made, one form of instrumentation is counters on all possible decisions. The instrumented program is run, data from the counters is saved, and that data may be used by a recompilation of the program to optimize the code around the decision point.
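The counter-based instrumentation described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the instrumentation emitted by any particular compiler: the names branch_profile, negative_check, sum_abs, and taken_probability are hypothetical, and a real system would write the counters to a file for use by the recompilation.

```c
#include <stddef.h>

/* Hypothetical instrumentation record for one two-way decision point. */
struct branch_profile {
    unsigned long taken;      /* times the condition was true  */
    unsigned long not_taken;  /* times the condition was false */
};

/* Counters for the single decision point inside sum_abs (assumed name). */
struct branch_profile negative_check = {0, 0};

/* Instrumented routine: sums absolute values and counts each decision. */
long sum_abs(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] < 0) {
            negative_check.taken++;      /* instrumentation only */
            total -= values[i];
        } else {
            negative_check.not_taken++;  /* instrumentation only */
            total += values[i];
        }
    }
    return total;
}

/* Estimated probability that the branch is taken; 0.5 when no data exists. */
double taken_probability(const struct branch_profile *p)
{
    unsigned long runs = p->taken + p->not_taken;
    return runs == 0 ? 0.5 : (double)p->taken / (double)runs;
}
```

After the instrumented program has run on representative input, the value returned by taken_probability is the kind of data a recompilation could use to arrange the code around the decision point.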
Profile feedback has numerous substantial drawbacks that prevent its widespread adoption. For example, instrumentation may be intrusive and change the character of the program that it is measuring. As a result, thorough instrumentation gathers information about the instrumented program rather than about the program of interest. Another difficulty is choosing between heavy instrumentation, which is intrusive and can substantially slow program execution, and light instrumentation, which may fail to gather necessary information. Profile feedback may also require a second compilation phase, which may be expensive in time and system resources.
Another solution for optimization is static analysis. In this method, the program is analyzed without benefit of any information except what is expressed in the representation of the program. Algorithms are applied to try to decide various optimization-related questions. The algorithms are often quite expensive and error-prone and often fail to decide the question. Static analysis is an ordinary process that a compiler applies to try to analyze a program. Static analysis is in contrast to dynamic analysis, which is a type of analysis that occurs at run-time and uses information generated by observing the dynamic run-time environment.
Static analysis, by contrast, has only the information available to the compiler at the time the program is compiled. Consider the following example:
DO 20, I=1, N
   DO 10, J=1, M
      X=A(I)*B(J)/2.0
10 END DO
20 END DO
In this example, the static compiler can determine that it is better to multiply by 0.5 than to divide by 2.0, because multiplication is faster than division. However, it is best to have the loop with the highest iteration count as the outer loop, and because the values of N and M are not known at compile time, the static compiler cannot determine from this code segment whether it should accept the code as-is or interchange the nesting order of the loops.
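The two transformations discussed for this loop nest can be sketched in C. This is an illustrative adaptation under stated assumptions rather than the output of any compiler: the function names are hypothetical, and each product is stored in a row-major output array (instead of overwriting a single scalar X) so that the two versions can be compared element by element.

```c
#include <stddef.h>

/* As written in the source: I is the outer loop, and the product is divided by 2.0. */
void scale_products_div(size_t n, size_t m,
                        const double *a, const double *b, double *out)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < m; j++)
            out[i * m + j] = a[i] * b[j] / 2.0;
}

/* Transformed: division replaced by multiplication by 0.5, and the loops
 * interchanged so that J is the outer loop. */
void scale_products_mul_swapped(size_t n, size_t m,
                                const double *a, const double *b, double *out)
{
    for (size_t j = 0; j < m; j++)
        for (size_t i = 0; i < n; i++)
            out[i * m + j] = a[i] * b[j] * 0.5;
}
```

Replacing the division with a multiplication is safe to decide statically because 0.5 is a power of two, so both forms give the same floating-point result. Whether the interchanged version is preferable depends on the run-time values of n and m, which is exactly the information the static compiler lacks.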
Yet another optimization solution is directives. In this method, the information in the program is augmented with directives conveying information that may be useful in optimization. These directives are typically stored in the source code as comments. The strength of this approach is that it is a simple way of providing optimization-related information to the compilation system. However, directives have several drawbacks. For example, the directives are typically written by a human, which can be a time-consuming and error-prone process. Additionally, the directives typically provide only a very small subset of the information required for best optimization. The directives are typically not portable between vendors or compilers, or over time. In practice, this means that the directives are often not up-to-date for any particular target environment.
A directive may be information embedded in a comment that tells the compiler something interesting about the program. For example:
!$OMP PARALLEL DO
DO 10, I=1, N
   CALL SUBR(X(I))
10 END DO
!$OMP END DO NO WAIT
The “!” character indicates a comment, for example, in Fortran. When this code is presented to a compiler that does not understand the directives, the compiler treats them as comments and ignores them. When this code is presented to a compiler that does understand the directives, the compiler recognizes that the first directive allows it to parallelize the loop. The second directive (END DO NO WAIT) indicates that the compiler is allowed to generate code that does not force each parallel thread to wait for the others at the end of the loop. Ordinarily, the threads would finish their work and then wait for the others before proceeding. The NO WAIT clause (spelled NOWAIT in the OpenMP specification) tells them to proceed without waiting.
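The same idea can be sketched in C, where OpenMP directives are written as pragmas rather than comments. This is a minimal sketch under stated assumptions: subr and apply_subr are hypothetical names, and the body of subr (doubling its argument) is invented only to make the example observable. A compiler built without OpenMP support ignores the pragmas and runs the loop serially, which mirrors the comment-based behavior described above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the SUBR routine: doubles one element in place. */
static void subr(double *x)
{
    *x = *x * 2.0;
}

/* Applies subr to every element of x, in parallel when OpenMP is enabled. */
void apply_subr(double *x, size_t n)
{
    #pragma omp parallel
    {
        /* nowait: a thread that finishes its share of the iterations does
         * not wait for the other threads at the end of this loop. */
        #pragma omp for nowait
        for (long i = 0; i < (long)n; i++)
            subr(&x[i]);
    }   /* the parallel region itself still ends with an implicit barrier */
}
```

Each iteration touches a different element, so the loop gives the same result whether it runs serially or in parallel.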
Therefore, a need has long existed for a method and system that overcome these and related problems.