1. Field of the Invention
The present invention relates generally to an improved data processing system, and in particular to a compiler method for exploiting data value locality for computation reuse.
2. Description of the Related Art
Modern microprocessors and software compilers employ many techniques to increase the speed with which software executes. Values produced by executing instructions have been shown to exhibit a high degree of value locality in various benchmarks, such as SPEC95 and SPEC2000. The Standard Performance Evaluation Corporation (SPEC) is a non-profit corporation formed to establish, maintain and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers. Value locality describes the likelihood that the same value recurs within a storage location. Modern processors already exploit value locality in a very restricted way, e.g., control speculation for branch prediction, hardware table lookup, and load-value prediction, which guesses the result of a load so that dependent instructions can proceed immediately without waiting for the memory access to complete. Value locality has also been exploited in compilers for code specialization, where value profiling at run-time is typically used to identify a semi-invariant variable, and the code is then specialized to enable optimizations including constant folding, partial evaluation and loop versioning.
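By way of illustration only, code specialization on a semi-invariant variable may be sketched as follows. The sketch assumes that value profiling has shown the variable `factor` to equal 1 in most calls; the function names and the choice of value are hypothetical and are not taken from any particular compiler.

```python
def scale_general(values, factor):
    # General version: performs one multiplication per element.
    return [v * factor for v in values]


def scale_specialized_for_one(values):
    # Specialized version for the assumed semi-invariant value factor == 1.
    # The constant has been folded, so the multiplication disappears.
    return list(values)


def scale(values, factor):
    # Guard inserted by the (hypothetical) specializer: take the fast path
    # only when the run-time value matches the profiled value.
    if factor == 1:
        return scale_specialized_for_one(values)
    return scale_general(values, factor)


common_case = scale([1, 2, 3], 1)   # takes the specialized path
rare_case = scale([1, 2, 3], 2)     # falls back to the general path
```

The guard keeps the specialized code correct when the value differs from the profiled one, which is why the variable needs only to be semi-invariant rather than constant.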
Furthermore, value locality exposes the opportunity of computation reuse, i.e., result memoization based on the fact that the same inputs with the same operations applied should generate the same results. For instance, software programs often include many instructions that are executed multiple times each time the program is executed, and these programs typically have logical “regions” of instructions, each of which may be executed many times. When a region is executed more than once, and the results produced by the region are the same for more than one execution, the region is a candidate for “reuse.” The term “reuse” refers to the reusing of results from a previous execution of the region. For example, a computation reuse region could be a region of software instructions that, when executed, read a first set of registers and modify a second set of registers. The data values in the first set of registers are the “inputs” to the computation reuse region, and the data values deposited into the second set of registers are the “results” of the computation reuse region. A buffer holding inputs and results can be maintained for the region. Each entry in the buffer is termed an “instance.” When the region is encountered during execution of the program, the buffer is consulted, and if an instance with matching input values is found, the results can be used without having to execute the software instructions in the computation reuse region. When reusing the results is faster than executing the software instructions in the region, performance improves.
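The buffer of instances described above may be sketched, under stated assumptions, as a small software table keyed by the input values. The class name `ReuseBuffer`, the fixed capacity, the least-recently-used eviction policy, and the example region are all hypothetical choices made for illustration; the description above does not prescribe any of them.

```python
from collections import OrderedDict


class ReuseBuffer:
    """Bounded buffer of instances (inputs -> results) for one reuse region."""

    def __init__(self, capacity=4):
        self.capacity = capacity          # assumed fixed number of instances
        self.instances = OrderedDict()    # input tuple -> result tuple
        self.hits = 0
        self.misses = 0

    def execute(self, region, inputs):
        key = tuple(inputs)
        if key in self.instances:
            # Matching instance found: reuse results, skip the region.
            self.hits += 1
            self.instances.move_to_end(key)
            return self.instances[key]
        # No matching instance: execute the region and record a new instance.
        self.misses += 1
        results = region(*key)
        self.instances[key] = results
        if len(self.instances) > self.capacity:
            self.instances.popitem(last=False)  # evict least recently used
        return results


def region(r1, r2):
    # Hypothetical region: reads two "input registers" and
    # produces two "result registers".
    return (r1 * r1 + r2, r1 - r2)


buf = ReuseBuffer(capacity=2)
first = buf.execute(region, (3, 4))    # miss: region is executed
second = buf.execute(region, (3, 4))   # hit: results come from the buffer
```

Reuse pays off only when the lookup costs less than executing the region, so in this sketch the benefit depends on the hit rate and on how expensive `region` is relative to hashing its inputs.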
Additionally, some modern compilers can operate on a program while it is being executed. This type of compiler is referred to as a dynamic compiler, and computer programming languages that are designed to support such activity may be referred to as “dynamically compiled languages”.
Some modern compilers also use a technique known as profiling to improve the quality of code generated by the compiler. An example of a profiling technique is profile directed feedback (PDF). Profiling is usually performed by adding relevant instrumentation code to the program being compiled, and then executing that program to collect profiling data. Examples of profiling data include the relative frequency of execution of one part of the program compared to others, values of expressions used in the program, and outcomes of conditional branches in the program. An optimizing compiler can use this data to perform code reordering (based on relative block execution frequencies), code specialization (based on value profiling), code block outlining, or other optimizations that boost the final program's performance.
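The three kinds of profiling data named above may be illustrated by a hand-instrumented function. This is a minimal sketch only: a compiler would insert the counters automatically, and the counter names and the example function are hypothetical.

```python
from collections import Counter

block_counts = Counter()      # relative execution frequency of code blocks
value_profile = Counter()     # values taken by an expression of interest
branch_outcomes = Counter()   # outcomes of a conditional branch


def instrumented_abs_sum(values):
    total = 0
    for v in values:
        block_counts["loop_body"] += 1   # instrumentation: block frequency
        value_profile[v] += 1            # instrumentation: value profiling
        taken = v < 0
        branch_outcomes[taken] += 1      # instrumentation: branch outcome
        if taken:
            block_counts["negate"] += 1
            v = -v
        total += v
    return total


result = instrumented_abs_sum([5, 5, -2, 5])
most_common_value = value_profile.most_common(1)[0][0]
```

After this training run, the counters indicate that the `negate` block is rarely executed and that one value dominates, which is the kind of information a later compile step could use for code reordering or code specialization.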
Traditional profile directed feedback optimizations require at least two separate steps: an instrumentation step, in which the program is compiled with instrumentation and run with representative training data to gather program behavior information (i.e., profile data), and a re-compile step to optimize the code based on the gathered profile data. This optimization approach has several limitations in usability, productivity, and adaptability. With existing profile directed feedback optimization methods, multiple runs are needed to gather the profile data, the training data must be representative so that the program behaves similarly with real input data, and any change in input characteristics may have a negative performance impact.