A compiler is a computer program that transforms source code written in a high-level programming language into a lower-level language, typically assembly language or machine code, ultimately to create an executable program. When transforming the high-level source code, the compiler checks the syntactic correctness of the source code, produces an efficient run-time organization of the object code, and formats the output according to the requirements of an assembler or linker.
Compilation of a software program typically involves having each of its source files, or objects in source code format, individually compiled by a compiler into processor-executable native or machine code files. These compiled source files are then processed by a linker, which combines the compiled files to produce a complete executable program. A compiler may perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis, code generation, and code optimization.
Code optimization is the process of tuning the output of the compiler to minimize or maximize some attributes of an executable computer program. The most common goals are to minimize the time taken to execute a program or to minimize the amount of memory it occupies. Compiler optimization is generally implemented as a sequence of optimizing transformations: algorithms that take a program and transform it into a semantically equivalent output program that uses fewer resources.
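By way of illustration only, one well-known optimizing transformation is constant folding, in which sub-expressions whose operands are all known at compile time are replaced by their computed value. The following minimal sketch assumes a hypothetical three-node expression tree (`Num`, `Var`, `BinOp`) that is not taken from any particular compiler; the output expression is semantically equivalent to the input but requires fewer operations at run time.

```python
from dataclasses import dataclass
from typing import Union
import operator

# Hypothetical expression-tree node types, defined here only for illustration.
@dataclass(frozen=True)
class Num:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str            # one of the keys of OPS below
    left: "Expr"
    right: "Expr"

Expr = Union[Num, Var, BinOp]

# Operators this sketch knows how to evaluate at compile time.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr: Expr) -> Expr:
    """Return an equivalent expression with constant sub-expressions evaluated."""
    if isinstance(expr, BinOp):
        left = fold(expr.left)
        right = fold(expr.right)
        # Both operands are constants: compute the result now.
        if isinstance(left, Num) and isinstance(right, Num) and expr.op in OPS:
            return Num(OPS[expr.op](left.value, right.value))
        # Otherwise keep the operation, but with simplified operands.
        return BinOp(expr.op, left, right)
    # Num and Var nodes are already as simple as they can be.
    return expr

# x + (2 * 3) becomes x + 6
folded = fold(BinOp("+", Var("x"), BinOp("*", Num(2), Num(3))))
```

A production compiler would apply many such transformations in sequence and would also have to account for overflow, floating-point semantics, and side effects, all of which this sketch ignores.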
One prior solution to optimize the executable program is to have the compiler perform all the optimizing transformations only on the object file it compiles. Known optimization operations performed by a compiler typically include base binding, function cloning, and partial evaluation.
However, one problem with the above-described prior solution is that the compiler does not have knowledge of the entire program during compilation, because object files are compiled separately. This means that many optimizing transformations depend on information that only becomes available when the object files are linked. Therefore, even though a compiler is capable of performing an optimizing transformation based on a particular piece of information, the compiler is unable to exploit the global characteristics of the program, such as the distribution of operation codes or the frequency of instruction sequences.
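The difference between per-file and whole-program knowledge can be sketched as follows. The example assumes, purely for illustration, that each object file is represented as a list of operation-code mnemonics; the file names and mnemonics are hypothetical. A compiler working on a single file can only see that file's histogram, whereas the global distribution only emerges once all files are considered together.

```python
from collections import Counter

def opcode_histograms(modules):
    """Return (per-module histograms, whole-program histogram).

    `modules` maps a hypothetical object-file name to its list of opcodes.
    """
    per_module = {name: Counter(opcodes) for name, opcodes in modules.items()}
    # The global view is the sum of all per-module views; no single
    # separately compiled module has access to it.
    whole_program = sum(per_module.values(), Counter())
    return per_module, whole_program

# Illustrative input only; these are not real object files.
modules = {
    "a.o": ["load", "add", "store", "load", "add"],
    "b.o": ["load", "mul", "store", "call"],
    "c.o": ["load", "call", "call", "ret"],
}

per_module, whole_program = opcode_histograms(modules)
```

In this illustrative data, `call` never appears in `a.o` and appears only once in `b.o`, yet it is the second most frequent operation code in the program as a whole, which is the kind of fact a separately compiling compiler cannot observe.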
Link-time optimization is a type of prior-art program optimization solution performed on a program at link time, when the global characteristics of the program are known. As the linker is in the process of merging the object files into a single file, or immediately thereafter, link-time optimization applies various forms of optimization to the newly merged file. Link-time optimization may also involve recompiling the complete program; however, this is computationally expensive.
A further prior-art optimization solution is interprocedural optimization (IPO), which also analyzes the entire program. Interprocedural optimization attempts to reduce or eliminate duplicate calculations, improve memory usage, and simplify iterative processes by using typical optimization techniques such as procedure inlining, interprocedural dead code elimination, interprocedural constant propagation, and procedure reordering. The IPO process can occur at any step in the compilation sequence and can form part of the link-time optimization.
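As a simplified illustration of one of these techniques, interprocedural dead code elimination can be approximated by a reachability analysis over the program's call graph: any function that cannot be reached from the entry point is removed. The sketch below assumes a hypothetical call graph given as a mapping from function name to the names it calls directly, and an entry point named `main`; it deliberately ignores indirect calls, exported symbols, and address-taken functions, which a real implementation would have to treat conservatively.

```python
def reachable_functions(call_graph, entry):
    """Return the set of functions reachable from `entry` via direct calls."""
    seen = set()
    worklist = [entry]
    while worklist:
        fn = worklist.pop()
        if fn in seen:
            continue
        seen.add(fn)
        # Functions with no recorded callees are treated as leaves.
        worklist.extend(call_graph.get(fn, ()))
    return seen

def eliminate_dead_functions(call_graph, entry="main"):
    """Return a copy of the call graph containing only live functions."""
    live = reachable_functions(call_graph, entry)
    return {fn: list(callees) for fn, callees in call_graph.items() if fn in live}

# Hypothetical whole-program call graph, for illustration only.
call_graph = {
    "main": ["parse", "run"],
    "parse": [],
    "run": ["parse"],
    "debug_dump": ["parse"],      # never called from main
    "old_helper": ["debug_dump"], # never called from main
}

pruned = eliminate_dead_functions(call_graph)
```

Note that `debug_dump` is called by `old_helper`, so an analysis limited to a single file containing both could not safely discard either; only the whole-program view shows that neither is reachable from `main`.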
These existing solutions provide either quick compilation and linking without aggressive global optimization, or aggressive global optimization at the expense of significant compile-time or link-time overhead.
The invention described herein provides techniques for offline static analysis of a stable code base such that global, code-base-specific knowledge can be applied earlier in the compilation process to improve optimization. The offline static analysis produces specialized compiler components that are then used to rebuild the compiler. Over time, this results in a compiler specialized to the code base, with improved optimization after every compiler release cycle and without sacrificing efficient compilation time. It is assumed that the process of building and releasing the compiler for use will happen regularly as a matter of course, so there will be regular opportunities to update the code-base-specific knowledge in the compiler as the code base changes and evolves over time.