The process of generating a machine readable version of a software application program involves several steps. After an initial design phase, source code is written in a selected programming language which embodies the logic and processing of the application program. Typically, for anything other than the simplest application programs, the source code is organized into separate routines. Each routine embodies one or more related functions in the application. The routines are stored in various source code files. For example, there is usually a main source code file which corresponds to the main processing logic of the application. The main source code file can incorporate other routines by cross-referencing or "importing" them from one or more ancillary source code files. The ancillary source code files can also import routines by cross-referencing each other or by referencing system libraries which contain commonly used routines.
After the coding process is complete, a compiler translates (i.e., compiles) the source code files into executable images. To compile a source code file, the compiler converts the human readable source code into machine readable instructions or so-called object code. The object code can then be directly executed on a central processing unit (e.g., a microprocessor) in a computer system. Machine language instructions in the object code are organized in a precise manner to carry out the logical processing steps of the application, as expressed by the original source code.
Typically, an application is composed of a main executable image (i.e., the main program) and a group of ancillary executable images that can be invoked, if needed, during execution of the main image. In the Windows NT operating system produced by Microsoft Corporation of Redmond, Wash., the ancillary images are called "dynamically linked libraries" (DLLs). After a program has been compiled into one or more images, the main image is executed, during which time the routines in the ancillary images are executed as needed for their respective functions.
It is generally thought to be desirable for the compiler to place as few constraints as possible on the programmer, so that the programmer can be as creative as needed in designing the application. Most compilers therefore use a general set of rules for translating source code to object code, which allows them to compile any program in that language. Unfortunately, the general rules do not produce the best performing object code. The resulting programs are therefore said to be "sub-optimal", in the sense that they are not necessarily the most efficient with respect to execution speed, system resource utilization, or other performance criteria.
Another reason that most compilers produce sub-optimal code is that the compiler itself incorporates very little knowledge of how the images perform when executed. In some instances, images may, for example, use more memory than needed, or may perform certain instructions unnecessarily or too repetitively. To produce better object code, some compilers use profile feedback information that describes how the images perform when executed. This information allows the compiler to perform profile feedback optimization.
Profile feedback optimization is also used by various tools to assist the programmer in optimizing the object code. One such tool is "Etch", a system developed by researchers at the University of Washington. In Etch, an instrumentation program analyzes the object code of every executable image and inserts additional instrumentation object code into it. Typically, the instrumentation code is used to gather data about how the image executes during runtime, what branches are taken and how often, and so forth. For example, during instrumentation of an image, instrumentation code may be inserted at a specific branch point in the image object code to determine how many times that portion of the program executes. There are many other possible purposes for the instrumentation code as well. The reader should consult a paper entitled "Instrumentation and Optimization of Win32/Intel Executables using Etch", published in 1997 [on the World Wide Web at http://www.cs.washington.edu/homes/bershad/etch] by the University of Washington, for further details of Etch.
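The idea of counting executions at a branch point can be illustrated with a minimal sketch. Etch inserts its instrumentation into object code; the Python below is only a source-level analogy, and the names `probe`, `branch_counts`, and `classify` are hypothetical and do not come from Etch.

```python
from collections import Counter

# Hypothetical counter table. Real instrumentation would live inside the
# rewritten object code rather than in a Python module.
branch_counts = Counter()

def probe(branch_id):
    """Instrumentation stub: record one execution of the named branch point."""
    branch_counts[branch_id] += 1

def classify(values):
    """Original logic, with a probe inserted on each arm of the branch."""
    evens = 0
    for v in values:
        if v % 2 == 0:
            probe("classify:even")
            evens += 1
        else:
            probe("classify:odd")
    return evens

result = classify([1, 2, 3, 4, 6])
```

After the call, `branch_counts` records how often each arm of the branch was taken, which is the kind of data an instrumented image would emit at runtime.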
After the instrumentation process is complete, the instrumented version of the application is then executed on the computer. Execution of instrumented images must take place within a special shell program, which serves two purposes. First, the shell program restricts the operation of certain function calls that can be made by the instrumented image. For example, a function call might attempt to pass control to a non-instrumented program. By restricting the use of these types of function calls, the shell program ensures that only instrumented images are executed to avoid complicating the optimization process. The second use of the shell program is to gather so-called profile data produced as a result of the additional instrumentation code. The profile data indicates which functions in which images were executed and other performance information.
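The two purposes of the shell program can be sketched as follows. This is an illustration under stated assumptions, not a description of any actual shell: the class `ProfilingShell`, its `call` method, and the image names are hypothetical, and rejecting the call with an exception is only one possible way to restrict it.

```python
class ProfilingShell:
    """Hypothetical shell: runs instrumented images only and collects profile data."""

    def __init__(self, instrumented_images):
        self.instrumented_images = set(instrumented_images)
        self.profile_data = {}  # image name -> {function name -> execution count}

    def call(self, image, function):
        # Purpose 1: refuse to pass control to an image that was not instrumented.
        if image not in self.instrumented_images:
            raise PermissionError(f"{image} is not instrumented")
        # Purpose 2: record which function in which image was executed.
        per_image = self.profile_data.setdefault(image, {})
        per_image[function] = per_image.get(function, 0) + 1

shell = ProfilingShell({"main.exe", "render.dll"})
shell.call("main.exe", "main")
shell.call("render.dll", "draw")
shell.call("render.dll", "draw")

try:
    shell.call("legacy.dll", "init")  # not instrumented, so the shell blocks it
    blocked = False
except PermissionError:
    blocked = True
```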
After a single execution of an instrumented application, a profile data file will exist for each instrumented image that executed a function containing instrumentation code. That is, each time an instrumented application is "run" in the shell program, any instrumented functions that are "exercised" will output profile data to a new profile data file. Each of these profile data files must then be saved by the user for use during optimization.
The instrumented version of an application is typically executed numerous times in order to "exercise" all of its functions. After numerous executions of an instrumented application, there will be many profile data files that have been saved. By executing an instrumented application in many different ways (i.e., exercising various features in various combinations), the profile data files will contain information that provides an indication of performance bottlenecks in the application.
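Combining the profile data from many runs might look like the following sketch. The JSON layout, the function names, and the helper `merge_profiles` are assumptions made for illustration; the text does not specify a profile data file format.

```python
from collections import Counter
import json

def merge_profiles(profile_texts):
    """Sum per-function execution counts across the profile data of several runs."""
    total = Counter()
    for text in profile_texts:
        total.update(json.loads(text))  # Counter.update adds counts key by key
    return total

# Hypothetical contents of three saved profile data files, one per run.
run_files = [
    '{"open_document": 1, "render_page": 40}',
    '{"open_document": 1, "spell_check": 7}',
    '{"render_page": 65, "print_page": 2}',
]

merged = merge_profiles(run_files)
hottest = merged.most_common(1)[0]
```

In this sketch the function with the highest merged count stands in for a likely bottleneck; a real profile would typically carry richer data than execution counts.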
Once profile data has been gathered, another program called an optimizer analyzes the profile data and converts the original images into an optimized version of the application based on the analysis. To perform the optimization process, a user typically specifies which original images are to be optimized in conjunction with the various profile data files. The optimizer analyzes the selected profile data files and modifies the selected original executable images based on this profile analysis. The original image object code, for example, might be re-written for better memory efficiency. In this manner, the original application is optimized based upon the profile feedback obtained from running the instrumented version of the application.
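One profile-driven rewrite aimed at memory efficiency is to lay out frequently executed functions next to each other. The text does not say which rewrites the optimizer performs, so the sketch below is an assumed example only; `layout_by_hotness`, the function names, and the sizes and counts are all hypothetical.

```python
def layout_by_hotness(function_sizes, call_counts):
    """Return (name, offset) pairs, most frequently executed functions first."""
    # Sort by descending execution count; break ties by name for a stable result.
    order = sorted(function_sizes, key=lambda name: (-call_counts.get(name, 0), name))
    layout, offset = [], 0
    for name in order:
        layout.append((name, offset))
        offset += function_sizes[name]
    return layout

# Hypothetical function sizes in bytes and execution counts from profile data.
sizes = {"print_page": 300, "render_page": 512, "open_document": 128, "spell_check": 256}
counts = {"render_page": 105, "spell_check": 7, "open_document": 2, "print_page": 2}

layout = layout_by_hotness(sizes, counts)
```

Placing the most frequently executed functions at adjacent offsets is intended to keep them on fewer memory pages; whether that helps in practice depends on the application and the system.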