At present, two steps are commonly involved in constructing an application which will run on a computer. The first step is the compilation phase, which translates the source code into a set of object files written in machine language. The second step is the link phase, which combines the set of object files into an executable object code file. Almost all code generation and optimization decisions are made during the compilation phase; the link phase primarily relocates code and data, resolves branch addresses, and provides binding to run-time libraries.
Today, most modern programming languages support the concept of separate compilation, wherein a single computer source code listing is broken up into separate modules that can be fed individually to the language translator that generates the machine code. This separation allows better management of the program's source code and faster compilation of the program. The separate code modules will hereafter be referred to synonymously as either "modules" or "compilation units" (CUs).
The use of CUs during the compilation process enables substantial savings in the memory required in the computer on which the compiler executes. However, such use limits the level of application performance achieved by the compiler. For instance, the optimization actions taken by a compiler are generally restricted to procedures contained within a single CU, with the CU barrier denying the compiler access to procedures in other CUs. This limitation is significant when attempting either in-lining or cloning, because it restricts the set of call sites at which these optimizations can be performed.
In-lining replaces a call site with the called routine's code. In-line substitution serves at least two purposes: it eliminates call overhead and tailors the call to the particular set of arguments passed at a given call site. Cloning replaces a call site with a call to a specialized version of the original called procedure. Cloning allows for constant arguments to be propagated into the cloned routine. More specifically, cloning a procedure results in a version of the called procedure that has been tailored to one or more specific call sites, where certain variables are known to be constant on entry.
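The two transformations described above can be sketched by hand in C. The following is an illustration only, written under stated assumptions: the routine sum_strided, its wrappers, and the choice of a constant stride of 1 are hypothetical names and values introduced here, and the "after" versions show what a compiler might produce, not the output of any particular compiler.

```c
/* Hypothetical example: all names are illustrative, not taken from the text. */

/* Original callee: a general routine whose stride is a run-time argument. */
int sum_strided(const int *data, int count, int stride)
{
    int total = 0;
    for (int i = 0; i < count; i += stride)
        total += data[i];
    return total;
}

/* Call site as written: the stride argument is the constant 1. */
int total_original(const int *data, int count)
{
    return sum_strided(data, count, 1);
}

/* In-lining: the call is replaced by the callee's body, with the constant
 * argument substituted. The call overhead is gone and the loop is tailored
 * to this call site. */
int total_inlined(const int *data, int count)
{
    int total = 0;
    for (int i = 0; i < count; i += 1)
        total += data[i];
    return total;
}

/* Cloning: a specialized copy of the callee in which stride is known to be
 * 1 on entry. The call remains, but it now targets the tailored version,
 * which can be shared by every call site that passes stride == 1. */
int sum_strided_clone_stride1(const int *data, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += data[i];
    return total;
}

int total_cloned(const int *data, int count)
{
    return sum_strided_clone_stride1(data, count);
}
```

All three wrappers compute the same result. If sum_strided were defined in a different CU from its caller, a compiler restricted to one CU at a time would see only a declaration of the routine at the call site and could perform neither transformation.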
Importantly, modular handling of routines by the compiler creates a barrier that hides information which could be of use to the compiler.
It has been recognized in the prior art that making cross-modular information available during the compilation action will improve application performance. Thus, a compiler which can see across modular barriers can achieve significant benefits of inter-procedural optimization and achieve noticeable gains in performance of the resulting application.
Loeliger et al., in a paper entitled "Developing an Inter-procedural Optimizing Compiler", ACM SIGPLAN Notices, Vol. 29, No. 4, April 1994, pp. 41-48, describe how a compiler developed for the C-series supercomputers (marketed by the Convex Computer Corporation) enables inter-procedural optimization. Initially, a series of passes is made over a database that contains information about all of the procedures in the application. A number of analyses are performed to provide information in cases where traditional compilers make worst-case assumptions. For instance, the database is analyzed to determine which procedures are invoked by a call (call analysis); which names refer to the same location (alias analysis); which pointers point to which locations (pointer tracking); which procedures use which scalars (scalar analysis); which procedures should be in-lined at which call sites (inline analysis); and so on.
The results of these analyses, i.e., a "profile feedback", are then employed during the compile action to improve the application. Loeliger et al. provide little description of how the actual "build" process uses the profile feedback information obtained during the database analysis. Further, the Loeliger et al. process is not compatible with the widely used "make" utility, which is available in many operating systems. For instance, in the UNIX operating system, the make utility operates from a make file that describes how the program is built from its source modules, so that changes made to the program's source code can be incorporated into the application. The commands in the make file are run so as to perform as little work as possible, i.e., only the changed source code is converted to object code. The newly generated object code is then linked with the previously compiled, unchanged object code, which avoids the necessity of recompiling the entire program.
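The "as little work as possible" behavior of make rests on a simple out-of-date test: a target is rebuilt if it is missing or older than any of its prerequisites. The following C sketch illustrates that rule only. The function names, the use of plain long values for modification times, and the convention that a negative time means "file missing" are assumptions made for this illustration and are not part of make itself.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the target must be rebuilt, 0 otherwise.
 * Assumption: modification times are plain numbers, and a negative
 * target_mtime means the target does not exist yet. */
int needs_rebuild(long target_mtime, const long *prereq_mtimes,
                  size_t prereq_count)
{
    if (target_mtime < 0)
        return 1;                       /* missing target: always build */
    for (size_t i = 0; i < prereq_count; i++)
        if (prereq_mtimes[i] > target_mtime)
            return 1;                   /* a prerequisite is newer */
    return 0;                           /* target is up to date */
}

/* Hypothetical helper: counts the object files that would be recompiled
 * when object file i depends only on source file i. */
size_t count_stale_objects(const long *src_mtimes, const long *obj_mtimes,
                           size_t n)
{
    size_t stale = 0;
    for (size_t i = 0; i < n; i++)
        if (needs_rebuild(obj_mtimes[i], &src_mtimes[i], 1))
            stale++;
    return stale;
}
```

With three source files of which one was edited after its object file was produced and one has no object file yet, count_stale_objects reports two recompilations; the third object file is reused unchanged at link time.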
It is important that any new compiler be compatible with the make utility. Further, it is important that any new compile procedure be able to run at a reasonable speed, given the limited amounts of memory available on personal computers and workstation-class processors.
Accordingly, there is a need for an improved compiler which enables cross-CU optimization and is compatible with the make utility. Further, there is a need for an improved compiler which enables cross-CU optimization while keeping compile time short and minimizing the amount of memory required to execute the optimization procedure.