1. Field of the Invention
The present invention relates to computer programming, and deals more particularly with techniques for improving how compilers generate code involving data that is not constant, but is unlikely to change except in relatively infrequent situations.
2. Description of the Related Art
When a compiler processes programming language code in an application that tests a variable and then executes different instructions depending on the variable's value, the compiler generates corresponding assembly language code comprising a number of assembly language instructions. Typically, this assembly language code first accesses a memory or storage location to retrieve the variable's current value, and may then load that value into a register. Additional access and load operations may be required if the compared-to value is also variable. The compiler also generates assembly language code to perform the comparison, as well as branching code for transferring the execution path to a different instruction, based on the result of the comparison.
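By way of illustration only, the following Java sketch shows source code of the kind discussed above. The class name `CompareAndBranch`, the field `threshold`, and the method `classify` are hypothetical names chosen for this example. The comments describe, conceptually, the access, load, compare, and branch operations a compiler may generate; the actual instructions depend on the compiler and the target processor.

```java
public class CompareAndBranch {
    // Hypothetical variable (an assumption for this example). Because it is
    // not constant, its current value generally has to be fetched from
    // memory when the comparison is evaluated.
    static int threshold = 10;

    static String classify(int value) {
        // Conceptually, the compiled code for this test may:
        //   1. access and load 'value' into a register,
        //   2. access and load 'threshold' (the compared-to value is also variable),
        //   3. compare the two values,
        //   4. branch to one of the two return paths based on the result.
        if (value > threshold) {
            return "high";
        }
        return "low";
    }

    public static void main(String[] args) {
        System.out.println(classify(5));
        System.out.println(classify(42));
    }
}
```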
A compiler may alter or manipulate the assembly language code it generates in an attempt to optimize aspects of the application, such as its run-time performance. Such compilers are referred to as optimizing compilers, and are known in the art. One example is the Java™ Just-In-Time (“JIT”) compiler. (“Java” is a trademark of Sun Microsystems, Inc.)
One type of optimization that may be performed by an optimizing compiler pertains to the run-time performance of the variable comparison and branching scenario discussed above. Modern processors typically include branch-prediction hardware that attempts to improve run-time performance, as is well known in the art, and this hardware is critical to optimal performance of these processors. During execution of an application, the processor typically tracks the branches in the actively-executing code and supplies this information to the branch-prediction hardware. Processing time can be shortened if the branch-prediction hardware correctly predicts whether or not a branch will be taken and then loads the corresponding instructions and values that will be executed next.
Run-time performance problems may arise when an application contains a number of comparisons and corresponding branching instructions. For each instance thereof, the length of the run-time path (as well as the size of the compiled image) increases due to the compiler-generated assembly language code for accessing, loading, and comparing the variable(s) and for carrying out the branching.
If an application contains a large number of compare-and-branch operations, the processor may be unable to track all the branches in the currently-executing code, and the branch-prediction hardware may then mispredict the branches that will be encountered, thus leading to run-time performance degradation.
A common solution to this problem is to use profiling information in a just-in-time compiler to programmatically reorder the blocks of assembly language code, based on a programmatic prediction of which branch is more likely to be taken in response to a comparison operation, so that the processor's default prediction when it encounters a branch is likely to be correct. When this approach succeeds, the cost of the mispredictions is avoided. However, this approach is highly dependent on the quality of the profiling information collected. At some levels of compilation, unfortunately, profiling information may not be available; as a result, the programmatic reordering cannot be performed except by using static heuristics, which are less effective.
There are some situations where compare-and-branch logic is provided in an application to be compiled, yet the vast majority of run-time behavior exercises only one of the branches. An example of this is debugging or tracing logic that is provided in an application (or portions thereof). An application developer might include instrumentation in various methods to trace when those methods have been entered and exited, for example, and perhaps to trace the values of their input parameters. Typically, this instrumentation has semantics of the form “If tracing is enabled, then write information to a file”. Accordingly, these instrumented methods will cause the compiler to generate compare-and-branch code, with the performance penalties (such as increased path length) discussed above. It is unlikely, during any performance-critical run, that tracing will be enabled. Thus, the variable dictating whether tracing is enabled will most often be set to false (or a similar value corresponding to “don't trace”). It is possible, however, that the user could turn on tracing at any time. Because the value could change, the just-in-time compiler is unable to assume that the initial value will remain constant. As a result, even though the compiler might successfully use profiling information and reorder blocks of assembly language code to match the application's actual run-time behavior, inefficiencies remain because the instructions to access and load the variable, compare it to false (or true), and branch on the result must still be executed.
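The tracing scenario described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not code taken from any particular application: the class `TracedService`, the flag `tracingEnabled`, and the method `square` are hypothetical, and an in-memory `StringBuilder` stands in for the trace file. Each call to `square` evaluates the flag twice, even though the flag is expected to be false during nearly all performance-critical runs.

```java
public class TracedService {
    // Hypothetical tracing flag (an assumption for this example). It is not
    // a constant: a user could turn tracing on at any time, so it is read
    // each time it is tested.
    static volatile boolean tracingEnabled = false;

    // Stand-in for the trace file mentioned in the text (an assumption).
    static final StringBuilder traceLog = new StringBuilder();

    static int square(int x) {
        // "If tracing is enabled, then write information": load the flag,
        // compare it, and branch. Usually the branch is not taken.
        if (tracingEnabled) {
            traceLog.append("enter square x=").append(x).append('\n');
        }
        int result = x * x;
        // The same load, compare, and branch sequence repeats on method exit.
        if (tracingEnabled) {
            traceLog.append("exit square result=").append(result).append('\n');
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(square(3)); // tracing off: nothing is logged
        tracingEnabled = true;         // the user turns tracing on later
        System.out.println(square(2));
        System.out.print(traceLog);
    }
}
```

In this sketch the flag tests remain on the execution path of every call, which is the residual cost the paragraph above describes.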
Accordingly, what is needed are techniques that avoid problems of the type described above.