Whole program analysis enables an aggressive form of optimization that is applied on a full program basis. The goal of whole program analysis is to analyze substantially the entire program during the compilation phase to obtain the most effective optimization possible. One difficulty with whole program analysis is that the compiler used to compile the program normally does not have access to the entire program and therefore lacks some of the information it needs to optimize the program. Instead, the compiler typically only “sees” the program files that are provided to the compiler by the programmer (i.e., user). Accordingly, the compiler normally cannot take into account any information contained in, for example, previously compiled object files of a library or a separate load module. Without access to this information, the compiler cannot identify all the relationships between the various portions of the program, and therefore cannot perform the most effective optimization.
As an example, not all alias relationships can normally be determined where libraries or real object files already exist that are unknown to the compiler. Consequently, it cannot be determined with any certainty whether a given global variable may be accessed through a pointer, i.e., whether the variable's address is exposed. The global variable must therefore be reloaded from memory each time it is used after an indirect memory store instruction, thereby requiring execution time that would not be necessary if the compiler could confirm that the global variable is not so exposed.
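The reload problem can be illustrated with a minimal C sketch. The names `counter` and `sum_around_store` are hypothetical and chosen only for illustration. Because the pointer argument may or may not refer to the global, a compiler that cannot see every caller has to assume that the indirect store changes the global and must load it again.

```c
/* Hypothetical global whose address may be exposed to code the compiler cannot see. */
int counter = 10;

/* Hypothetical function: p may or may not point at counter. */
int sum_around_store(int *p)
{
    int before = counter; /* first load of the global */
    *p = 5;               /* indirect store; may alias counter */
    int after = counter;  /* must be reloaded from memory unless the compiler
                             can prove that p never points at counter */
    return before + after;
}
```

If the compiler could confirm that the address of `counter` is never taken anywhere in the program, the second load could reuse the value already held in a register.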
In addition, a compiler normally cannot determine whether a given global variable will be modified by an existing library or other program feature that the compiler cannot see. Accordingly, a global variable having a given, unchanging value may need to be loaded through its address each time it is encountered even though it could simply be replaced with a constant. Such referencing not only slows execution, but also wastes memory space by requiring storage of the instructions and the address related to the variable.
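A brief C sketch, using the hypothetical names `scale`, `scaled`, and `scaled_folded`, shows the difference. The first function is what the programmer writes. The second is an assumed picture of what a compiler could effectively produce if it knew that no unseen code ever writes to the global.

```c
/* Hypothetical global: never modified in the visible source, but not
   declared const, so unseen code could still write to it. */
int scale = 4;

/* As written: scale is loaded through its address on every call. */
int scaled(int x)
{
    return x * scale;
}

/* Assumed result with whole-program knowledge that scale is never
   written: the load is replaced by the constant. */
int scaled_folded(int x)
{
    return x * 4;
}
```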
Another piece of information relevant to global variables that normally cannot be determined by a compiler is whether an assigned variable is never used in the program. Without this information, unused variables and the instructions that pertain to them cannot be removed from the program, again slowing execution speed and wasting memory space.
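As a minimal sketch of this situation, consider the following C fragment, in which `last_error` and `parse_digit` are hypothetical names. The global is assigned but never read in the visible source, yet the store has to be kept because an unseen object file might read it.

```c
/* Hypothetical global: assigned below but never read in the visible source. */
int last_error = 0;

/* Returns the numeric value of a decimal digit, or -1 for any other character. */
int parse_digit(char c)
{
    if (c < '0' || c > '9') {
        last_error = 1; /* store is kept: unseen code might read last_error */
        return -1;
    }
    return c - '0';
}
```

With knowledge of the whole program, a compiler that found no reader of `last_error` could drop both the store and the variable itself.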
In addition to the optimization limitations pertinent to global variables, conventional systems also cannot support external function call optimization. In particular, the compiler typically cannot determine whether a given function is defined in an existing library or other program feature and, if so, whether its function call is preemptible. If it were ascertainable that a given function call is preemptible, the compiler could optimize the program by inlining the function call stubs to reduce the number of references necessary to branch to the function, thereby increasing execution speed.
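The effect of inlining a call stub can be modeled loosely in C. This is only a sketch under stated assumptions: `linkage_slot` stands in for a linkage table entry that a loader would fill in, `square_stub` stands in for an out-of-line call stub, and all names are hypothetical. Real stubs are emitted by the compiler and linker rather than written in source code.

```c
typedef int (*fn_t)(int);

/* Stands in for a function defined in a separately built library. */
static int lib_square(int x)
{
    return x * x;
}

/* Stands in for a linkage table entry that a loader would fill in. */
static fn_t linkage_slot = lib_square;

/* Out-of-line stub: fetches the target address, then branches to it. */
static int square_stub(int x)
{
    return linkage_slot(x);
}

/* Call through the stub: one branch to the stub, a second to the function. */
int call_via_stub(int x)
{
    return square_stub(x);
}

/* Stub body inlined at the call site: the address is fetched in place
   and only one branch is taken. */
int call_inlined_stub(int x)
{
    return linkage_slot(x);
}
```

Both callers return the same result. The second form simply removes the intermediate branch to the stub.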
In recognition of the limited amount of optimization that is obtainable using conventional techniques, several solutions have been proposed. In one such solution, aggressive assumptions are made as to the nature of the program that is to be compiled and are applied by the compiler during the compilation process. The problem with this approach, however, is that it is only as accurate as the assumptions that are made. Accordingly, if the assumptions are wrong, the program may not be optimized to its greatest extent or, in some cases, compilation errors will be encountered.
In another solution, attempts are made to approximate whole program analysis by manually creating a database for various libraries that contain object files. The compiler is configured to query the database for information about the object files and, presumably, uses this information to optimize the program. This approach fails to provide true whole program analysis, however, in that the database is built when the various program libraries are built and therefore can only provide information as to known system libraries. Moreover, this solution is undesirable from an efficiency standpoint in that it is labor intensive.