1. Technical Field
The present invention relates to an improved data processing system and, in particular, to a method and system for optimizing performance in a data processing system. Still more particularly, the present invention provides a method and system for a software program development tool for detecting and counting common bytecode sequences in a set of compilable bytecodes.
2. Description of Related Art
Because Java is an interpreted language, programs written in Java are first converted into Java class files and then interpreted by a Java virtual machine (JVM). In order to improve performance, many JVMs compile Java classes into platform-specific binary code after the classes are loaded into the JVM. Then, instead of being interpreted, Java classes are executed in their compiled native code format, similar to programs written in other languages, such as C or C++. Such just-in-time (JIT) compilation of Java programs can significantly improve the speed of execution of Java programs.
The just-in-time compilation time becomes part of the execution time of a Java program. For a given Java class method, JIT compilation can be justified only if the time saved by executing the compiled method code, rather than interpreting the method, exceeds the JIT compilation time for the method. Otherwise, the method should be executed by interpreting the method's bytecodes. Typical Java applications contain many class methods that are only rarely invoked, so JIT compilation of such methods is unjustified.
In advanced JVM implementations, JIT compilers compile Java methods selectively, depending upon the satisfaction of certain criteria. This so-called “hot-spot compiling” is a hybrid of interpretation and just-in-time compilation that attempts to combine both techniques in order to yield Java programs that run as fast as natively compiled code. This type of execution may be performed by an interpreter in the execution engine called a “mixed-mode interpreter.” A mixed-mode interpreter analyzes or profiles the program in order to determine which portions of the program justify the time expense of compilation.
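The selective compilation described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the criterion is a per-method invocation count reaching a fixed threshold; the class name `MixedModeSketch`, the `compileThreshold` parameter, and the string results are hypothetical and do not correspond to any actual JVM interface.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a mixed-mode execution policy; not an actual JVM API. */
class MixedModeSketch {
    // Assumed criterion: treat a method as "hot" once it has been invoked this many times.
    private final int compileThreshold;
    private final Map<String, Integer> invocationCounts = new HashMap<>();

    MixedModeSketch(int compileThreshold) {
        this.compileThreshold = compileThreshold;
    }

    /** Records one invocation and reports how the method would be executed. */
    String invoke(String methodName) {
        int count = invocationCounts.merge(methodName, 1, Integer::sum);
        // Below the threshold the method is interpreted; at or above it,
        // the method is assumed to be compiled and run as native code.
        return count >= compileThreshold ? "compiled" : "interpreted";
    }
}
```

With a threshold of 3, for example, the first two calls to a method would be interpreted and the third and later calls would run compiled code, while a method invoked only once would never incur compilation cost.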
The usual approach to optimization is to profile the program in a temporal dimension to discover exactly where the program spends most of its time and then to spend time optimizing the portions of the program that execute most often. In this approach, the JVM begins the execution of the program by interpreting the program. As the JVM interprets the program's bytecodes, it analyzes the execution of the program to determine the program's “hot spots,” which are the parts of the program where the program spends most of its time. When it identifies a hot spot, the JVM just-in-time compiles only the portion of the code that encompasses the hot spot.
Rather than, or in addition to, profiling the program in a temporal dimension, the program may also be analyzed in a spatial dimension to discover the bytecode sequences which constitute the program. The optimization effort may then be directed to the most common bytecode sequences that appear within the program. However, analyzing a program to find common bytecode sequences is especially time intensive.
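As an illustration of spatial analysis, the following sketch counts every contiguous sequence of a fixed length in a list of opcode mnemonics. It is a naive baseline under stated assumptions, not the method of the present invention: the class name `SequenceCountSketch`, the representation of bytecodes as strings, and the single fixed sequence length are all hypothetical choices made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical baseline for counting bytecode sequences of one fixed length. */
class SequenceCountSketch {
    /** Counts every contiguous window of `length` opcodes in `opcodes`. */
    static Map<String, Integer> countSequences(String[] opcodes, int length) {
        Map<String, Integer> counts = new HashMap<>();
        if (length <= 0) {
            return counts; // no meaningful sequences to count
        }
        // Slide a window of the given length across the opcode stream.
        for (int start = 0; start + length <= opcodes.length; start++) {
            String key = String.join(" ",
                    Arrays.copyOfRange(opcodes, start, start + length));
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

For the opcode stream `iload_1 iload_2 iadd iload_1 iload_2 iadd istore_3` and a length of 2, the sequences `iload_1 iload_2` and `iload_2 iadd` are each counted twice. Repeating such a pass for every sequence length of interest, and building a string key for every window, suggests why a straightforward analysis becomes expensive for large programs.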
Designers of JIT compilers must trade off the time spent optimizing a bytecode sequence against the runtime gain from making that sequence faster. If analysis of many different Java programs reveals that certain bytecode sequences are frequently executed in many different programs, then the JIT compiler designers and implementers can program the JIT compiler to always apply their best optimizations to these sequences without any runtime profiling. However, a moderate-sized Java program may execute millions of different bytecode sequences, and there are many programs to analyze.
Therefore, it would be particularly advantageous to provide a high performance method and system for detecting and counting bytecode sequences.