Modern software programs include many instructions that are executed multiple times during each run of the program. Typically, large programs have logical “regions” of instructions, each of which may be executed many times. When a region is executed more than once, and the results it produces are the same for more than one execution, the region is a candidate for “reuse.” The term “reuse” refers to the reusing of results from a previous execution of the region.
For example, a computation reuse region could be a region of software instructions that, when executed, read a first set of registers and modify a second set of registers. The data values in the first set of registers are the “inputs” to the computation reuse region, and the data values deposited into the second set of registers are the “results” of the computation reuse region. A buffer holding inputs and results can be maintained for the region. Each entry in the buffer is termed an “instance.” When the region is encountered during execution of the program, the buffer is consulted and if an instance with matching input values is found, the results can be used without having to execute the software instructions in the computation reuse region. When reusing the results is faster than executing the software instructions in the region, performance improves. Such a buffer is described in: Daniel Connors & Wen-mei Hwu, “Compiler-Directed Dynamic Computation Reuse: Rationale and Initial Results,” Proceedings of the 32nd Annual International Symposium on Microarchitecture (MICRO), November 1999.
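The buffer lookup described above can be illustrated with a minimal software sketch. It is offered only as an illustration under stated assumptions and is not the buffer described by Connors and Hwu: the names `ReuseBuffer`, `execute`, and `capacity`, the oldest-first eviction policy, and the two-input example region are all hypothetical, and register values are modeled as plain integers.

```python
from typing import Callable, Dict, Tuple

Values = Tuple[int, ...]


class ReuseBuffer:
    """Hypothetical per-region buffer; each entry is one "instance"."""

    def __init__(self, capacity: int = 8) -> None:
        self.capacity = capacity          # assumed fixed number of instances
        self.instances: Dict[Values, Values] = {}
        self.hits = 0                     # executions satisfied by reuse
        self.misses = 0                   # executions that ran the region

    def execute(self, inputs: Values, region: Callable[..., Values]) -> Values:
        # Consult the buffer: a matching instance lets us skip the region.
        if inputs in self.instances:
            self.hits += 1
            return self.instances[inputs]
        # No matching instance: execute the region and record a new instance.
        self.misses += 1
        results = region(*inputs)
        if len(self.instances) >= self.capacity:
            # Assumed policy: evict the oldest instance (dicts keep insertion order).
            self.instances.pop(next(iter(self.instances)))
        self.instances[inputs] = results
        return results


def example_region(r1: int, r2: int) -> Values:
    """Stand-in region: reads two input registers, writes two result registers."""
    return (r1 + r2, r1 * r2)


buf = ReuseBuffer()
first = buf.execute((3, 4), example_region)   # region is executed
second = buf.execute((3, 4), example_region)  # results are reused
```

In this sketch the second call returns the buffered results without running `example_region`, which is the case in which reuse improves performance, provided the lookup is cheaper than the region itself.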
Some regions make better candidates for reuse than others. For example, a region capable of producing an often-reused instance is a good candidate for reuse. In contrast, regions that produce rarely reused instances generally do not make good candidates for reuse, in part because new instances are frequently generated while buffered instances are seldom reused. Regions that are candidates for reuse are typically identified when the program is compiled. The compiler identifies candidates for reuse, and selects which of those candidates are to be computation reuse regions in the compiled program. This can be a difficult problem, in part because the compiler does not necessarily have information describing whether candidate regions have the qualities that make for good reuse regions.
Some compilers use value profiling algorithms in an attempt to identify variables with invariant behavior. One such value profiling algorithm is discussed in: Brad Calder, Peter Feller & Alan Eustace, “Value Profiling,” Proceedings of the 30th Annual International Symposium on Microarchitecture (MICRO), December 1997. Calder et al. present a technique that attempts to identify variables with invariant behavior by observing each variable accessed by instructions. Calder et al. also present a technique that observes each variable for a period of time and then tests for convergence. This approach can incur significant overhead, in part because every value generated by every instruction is profiled. Value profiling as described by Calder et al. is not directly applicable to the identification of reuse regions, in part because regions often have inputs and outputs that include multiple variables.
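The general idea of observing a single variable and testing for convergence can be sketched as follows. This is a simplified illustration and not the algorithm of Calder et al.: the functions `invariance` and `has_converged`, the `window` and `tolerance` parameters, and the choice of the most frequent value's share as the invariance measure are assumptions made for the example.

```python
from collections import Counter
from typing import List


def invariance(values: List[int]) -> float:
    """Fraction of observations equal to the most frequently observed value."""
    if not values:
        return 0.0
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values)


def has_converged(values: List[int], window: int = 4,
                  tolerance: float = 0.05) -> bool:
    """Assumed convergence test: the latest window of observations changed
    the invariance estimate by no more than `tolerance`."""
    if len(values) < 2 * window:
        return False                      # too few observations to judge
    before = invariance(values[:-window])
    after = invariance(values)
    return abs(after - before) <= tolerance


steady_trace = [7] * 8                    # variable that never changes
mixed_trace = [5, 5, 5, 9]                # mostly one value, too short to judge

steady_score = invariance(steady_trace)
mixed_score = invariance(mixed_trace)
steady_done = has_converged(steady_trace)
mixed_done = has_converged(mixed_trace)
```

The sketch tracks one variable at a time. A region whose inputs and results span several variables would need a joint measure over all of them, which is the limitation noted above.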
For the reasons stated above, and for other reasons stated below that will become apparent to those skilled in the art upon reading and understanding the present specification, there is a need in the art for an alternative method and apparatus for identifying and profiling candidate reuse regions.