Processes to be performed by computer systems are typically defined in high-level source language programs. Compilers, linkers, and assemblers may be used to convert the source language programs to machine-executable code. The utilization of computer resources can be minimized if the machine code is optimized for execution.
Optimizing the machine-executable code has conventionally been the domain of the compiler. This is sensible; it is the compiler that generates machine-compatible object code. Furthermore, the compiler begins with the original source program written in the high-level language by the programmer. Therefore, the intent of the programmer is exposed to the compiler. The source code typically includes directions about the types of the declared variables and the use of high-level operations. The source code may also include pragmas added by the programmer to streamline the conversion from source to executable code. Nonetheless, most conventional compilers have limitations that result in non-optimal machine code. Non-optimal code degrades the performance of the program during execution and wastes valuable computer system resources such as instruction cycles and storage space in memory.
In most environments, a large program is written as many smaller source code modules using an editor. Smaller source modules are easier to understand and manage. The source modules are then compiled into object modules, typically one source module at a time. This separate compilation would be difficult to eliminate, since it greatly reduces the turn-around time in the multiple edit-compile-build-run-debug cycles that are so characteristic of computer program development.
However, since the compiler operates on only one module at a time, optimization is limited to the module being compiled. Thus, the compiler cannot view the behavior of the entire program in order to eliminate, for example, duplicate procedures in different source modules. Also, the linker may acquire object or assembly code from standard libraries. Clearly, traditional compile-time optimization cannot optimize the pre-compiled library modules.
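The duplicate-procedure case can be illustrated with a minimal sketch. It assumes a toy representation that is not part of the original text: each object module is a dictionary mapping procedure names to instruction lists, and the module names, procedure names, and the function `find_duplicate_procedures` are all hypothetical. Only a view spanning every module can group procedures with identical bodies.

```python
from collections import defaultdict


def find_duplicate_procedures(modules):
    """Group procedures whose instruction sequences are identical.

    modules: dict mapping a module name to a dict of
             {procedure name: list of instruction strings}.
    Returns a list of groups; each group is a sorted list of
    (module name, procedure name) pairs sharing one body.
    """
    owners_by_body = defaultdict(list)
    for module_name, procedures in modules.items():
        for proc_name, body in procedures.items():
            # A tuple is hashable, so it can serve as the dictionary key.
            owners_by_body[tuple(body)].append((module_name, proc_name))
    return [sorted(owners) for owners in owners_by_body.values() if len(owners) > 1]


# Hypothetical modules: "clamp" and "clip" were compiled separately
# but have the same body.
modules = {
    "a.o": {
        "clamp": ["cmp r1, r2", "movgt r1, r2", "ret"],
        "main": ["call clamp", "ret"],
    },
    "b.o": {
        "clip": ["cmp r1, r2", "movgt r1, r2", "ret"],
    },
}

duplicates = find_duplicate_procedures(modules)
```

A compiler that sees only `a.o` or only `b.o` has no way to form this grouping, because each module on its own contains a single copy of the body.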
In addition, the compiler cannot determine the actual use, or for that matter non-use, of variables declared in one module for use in another module. It is also difficult for the compiler to determine whether memory and registers will be fully utilized during execution.
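As a hedged illustration of the cross-module visibility problem, the sketch below assumes a simplified symbol table for each object module, with a list of symbols it defines and a list it references. The module names, symbol names, and the function `unreferenced_globals` are hypothetical. A symbol can be declared unused only after the references of every module have been collected.

```python
def unreferenced_globals(modules):
    """Return {symbol: defining module} for symbols no module references.

    modules: dict mapping a module name to a dict with two keys,
             "defines" and "references", each a list of symbol names.
    """
    defined = {}
    referenced = set()
    for module_name, module in modules.items():
        for symbol in module["defines"]:
            defined[symbol] = module_name
        referenced.update(module["references"])
    return {
        symbol: owner
        for symbol, owner in defined.items()
        if symbol not in referenced
    }


# Hypothetical program: "debug_level" is exported by io.o but never used.
modules = {
    "crt0.o": {"defines": [], "references": ["main"]},
    "io.o": {"defines": ["buffer", "debug_level"], "references": ["buffer"]},
    "main.o": {"defines": ["main"], "references": ["buffer"]},
}

unused = unreferenced_globals(modules)
```

When compiling `io.o` alone, the compiler must assume that some other module may reference `debug_level`, so it cannot discard the variable.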
Moreover, most compilers convert source code modules in distinct phases. For example, retargetable compilers, for generating machine executable code for different computer hardware architectures, may have machine-independent phases insulated from machine-dependent phases. The optimization methods may therefore not have a complete view of the resources that will be available during execution.
In many computer environments, the compiler produces assembly code rather than true object code. The advent of reduced instruction set computing (RISC) architectures has led to environments in which the assembly code is no longer isomorphic with the object code, and optimization below the level of assembly code is not possible by the compiler.
There are some systems which attempt post-compiler or "late" code modifications. The known modifications typically are limited to patching linked code for instrumentation. "Patching" is the insertion of specific instructions at known locations. "Instrumentation" refers to the patching of the program for the purpose of measuring and tracing the flow of the program while executing. During instrumentation, the memory addresses of relocated instructions are carefully tracked, and references to any relocated memory locations are adjusted accordingly. However, the problems associated with late code modification have generally inhibited aggressive optimization of linked code.
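The address bookkeeping described above can be sketched as follows. This is a toy model under stated assumptions, not a description of any particular instrumentation system: a program is a list of `(opcode, operand)` pairs, a `jmp` operand is an absolute index into that list, and the `count` opcode and the function `patch` are hypothetical. The sketch redirects each jump to the start of any instructions inserted before its target, so that the instrumentation also runs when the target is reached by a jump. That redirection is one possible design choice.

```python
def patch(program, insertions):
    """Insert instructions and fix up jump targets.

    program:    list of (opcode, operand) pairs; a "jmp" operand is an
                absolute index into the original program.
    insertions: dict mapping an original index to a list of
                instructions to place immediately before it.
    """
    patched = []
    entry = {}   # original index -> new index where control should arrive
    moved = []   # new indices of the relocated original instructions
    for old_index, instruction in enumerate(program):
        entry[old_index] = len(patched)
        patched.extend(insertions.get(old_index, []))
        moved.append(len(patched))
        patched.append(instruction)

    # Adjust only the original instructions; inserted ones are left as given.
    for new_index in moved:
        opcode, operand = patched[new_index]
        if opcode == "jmp":
            patched[new_index] = (opcode, entry[operand])
    return patched


program = [("load", "x"), ("jmp", 3), ("add", "y"), ("store", "z")]
# Insert a hypothetical counter before original instruction 2.
patched = patch(program, {2: [("count", "block_2")]})
```

In this example the `store` instruction moves from index 3 to index 4, so the jump that referred to it is rewritten to target index 4.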
There may be additional advantages in optimizing linked code. It is not until the object modules are linked together that the gross morphology of the entire program becomes visible. Although some of the high-level structure is missing from the linked object modules, link-time optimization may have other properties not available to compile-time optimization.
At link time, the entire program, including library modules, is available. Thus, the entire program can be scrutinized for optimization opportunities, rather than only the isolated opportunities within single, separately compiled source code modules. Also, at link time, the output of the compiler, e.g., the object code, can be examined, something the compiler itself may not have done. These link-time properties can lead to several advantages which are exploited by the invention to optimize the executable code to a level not previously achievable by compilers.