1. Field of the Invention
The present invention relates, in general, to compilers, and, more particularly, to a code generation technique for optimizing execution of programs represented as bytecodes.
2. Relevant Background
Bytecode programming languages such as the Java™ programming language (a trademark of Sun Microsystems, Inc.) represent computer programs as a set of bytecodes. Each bytecode is a numeric machine code for a “virtual machine” that exists only in software on a client computer. The virtual machine is essentially an interpreter that understands the bytecodes, interprets the bytecodes into machine code, then runs the machine code on a native machine.
Bytecode programming languages such as the Java programming language are gaining popularity among software application developers because they are easy to use and highly portable. Programs represented as bytecodes can be readily ported to any computer that has a virtual machine capable of properly interpreting and translating the bytecodes. However, because bytecode programs must be interpreted at run time on the client computer, they have suffered from an inability to execute at speeds competitive with traditional compiled languages such as C or C++.
The speed limitations of bytecode languages are primarily related to the compilation process. Compilation is the process by which programs authored in a high-level (i.e., human-readable) language are translated into machine-readable code. There are four basic steps to the compilation process: tokenizing, parsing, code generation, and optimization. In traditional compiled programs, all of these steps are completed prior to run time, whereas in interpreted languages such as BASIC, all of the compilation steps are performed at run time on an instruction-by-instruction basis. Command shells like CSH are also examples of interpreters that recognize a limited number of commands. Interpreted languages result in inefficiency because there is no way to optimize the resulting code.
In a bytecode programming language, tokenizing and parsing occur prior to run time. After parsing, the program is translated into bytecodes that can be interpreted by a virtual machine. As a result, a bytecode interpreter is faster than a language interpreter such as those in some of the original implementations of the BASIC programming language. Also, programs represented in bytecode format are more compact than fully compiled programs. These features make bytecode languages a useful compromise in networked computer environments where software is transferred from one machine for execution on another machine.
In the Java programming environment, the program that performs the translation to bytecodes is called “javac” and is sometimes referred to as a Java compiler (although it performs only part of the compilation process described above). The program that interprets the bytecodes on the client computer is called a Java virtual machine (JVM). Like other interpreters, the JVM runs in a loop executing each bytecode it receives. However, there is still a time consuming translation step on the virtual machine as the bytecodes are interpreted. For each bytecode, the interpreter identifies the corresponding series of machine instructions and then executes them. The overhead involved in any single translation is trivial, but overhead accumulates for every instruction that executes. In a large program, the overhead becomes significant compared to simply executing a series of fully-compiled instructions. As a result, large applications written in the Java programming language tend to be slower than the equivalent application in a fully compiled form.
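The interpreter loop described above can be illustrated with a minimal sketch. The stack machine, its four opcodes, and the class name ToyInterpreter are hypothetical and far simpler than the real JVM instruction set; the sketch only shows where the per-bytecode fetch and decode overhead arises.

```java
// Hypothetical toy stack machine; the opcodes are illustrative, not real JVM bytecodes.
final class ToyInterpreter {
    static final int PUSH = 0; // push the next code word onto the stack
    static final int ADD = 1;  // pop two values, push their sum
    static final int MUL = 2;  // pop two values, push their product
    static final int HALT = 3; // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[64];
        int sp = 0; // number of values currently on the stack
        int pc = 0; // index of the next bytecode to execute
        while (true) {
            int op = code[pc++]; // fetch
            switch (op) {        // decode: this cost is paid for every executed bytecode
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4 on the toy machine.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program));
    }
}
```

Each pass through the loop performs a fetch and a switch before doing any useful work, which is the accumulated overhead that a fully compiled program avoids.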
To speed up execution, virtual machines have been coupled with, or include, a just-in-time compiler or JIT. The JIT improves run-time performance of bytecode interpreters by compiling the bytecodes into native machine code before executing them. The JIT translates a series of bytecodes into machine instructions the first time it sees them, and then executes the machine instructions instead of interpreting the bytecodes on subsequent invocations. The machine instructions are not saved anywhere except in memory; hence, the next time the program runs, the JIT compilation process begins anew.
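The translate-once, reuse-thereafter behavior of a JIT can be sketched as follows. This is a simplified model under stated assumptions: the class ToyJit, its string "bytecodes", and the use of a lambda as a stand-in for native machine instructions are all hypothetical and do not reflect any actual JVM implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Hypothetical model of a JIT: a lambda stands in for generated machine code.
final class ToyJit {
    // In-memory only: a new ToyJit (a new program run) starts with an empty cache.
    private final Map<String, IntBinaryOperator> codeCache = new HashMap<>();
    private int compileCount = 0;

    // Stand-in for the expensive translation step.
    private IntBinaryOperator compile(String bytecode) {
        compileCount++;
        switch (bytecode) {
            case "add":
                return (a, b) -> a + b;
            case "mul":
                return (a, b) -> a * b;
            default:
                throw new IllegalArgumentException("unknown bytecode " + bytecode);
        }
    }

    // Compiles on the first invocation of a method, then reuses the cached code.
    int invoke(String methodName, String bytecode, int a, int b) {
        IntBinaryOperator compiled = codeCache.get(methodName);
        if (compiled == null) {
            compiled = compile(bytecode);
            codeCache.put(methodName, compiled);
        }
        return compiled.applyAsInt(a, b);
    }

    int compileCount() {
        return compileCount;
    }

    public static void main(String[] args) {
        ToyJit jit = new ToyJit();
        for (int i = 0; i < 1000; i++) {
            jit.invoke("sum", "add", i, 1);
        }
        // The translation cost was paid once for 1000 invocations.
        System.out.println("compilations: " + jit.compileCount());
    }
}
```

In this model a method invoked only once pays the full compilation cost with no later reuse to recover it.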
The result is that the bytecodes are still portable and in many cases run much faster than they would in a normal interpreter. Just-in-time compiling is particularly useful when code segments are executed repeatedly as in many computational programs. Just-in-time compiling offers little performance improvement and may actually slow performance for small code sections that are executed once or a few times and then not reused.
One limitation of JIT technology is that the compiling takes place at run-time and so any computation time spent in trying to optimize the machine code is overhead that may slow execution of the program. Hence, many optimization techniques are not practical in prior JIT compilers. Also, a JIT compiler does not see a large quantity of code at one time and so cannot optimize over a large quantity of code. One result of this is that the compiler cannot determine with certainty the set of classes used by a program. Moreover, the set of classes can change each time a given program is executed so the virtual machine compiler can never assume that the set of classes is ever unambiguously known.
Because of this uncertainty, it is difficult to produce truly optimal native code at run-time. Attempts to optimize with incomplete knowledge of the class set can be inefficient or may alter program functionality. Uncertainty about the class set gives rise to a large area of potential inefficiency in Java execution. The class set is the set of all classes that will be used to execute a particular instance of a Java language program. In a traditionally batch-compiled program, the entire universe of classes that will be used by the program is known at compile time, greatly easing the optimization task.
A common program occurrence is that a first method (the caller method) calls a second method (the target method) in a process referred to as a “method call”. This method call process is advantageous from a programmer's perspective, but consumes many clock cycles. An important optimization in the compilation of object oriented program code is called “inlining”. In fully compiled programs, inlining makes these target method calls more efficient by copying the code of the target method into the calling method.
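The effect of inlining can be illustrated at the source level. In the sketch below, which uses hypothetical names (InliningSketch, Point, getX), the second method is what a compiler would effectively produce by copying the body of the target method into the caller; a real compiler performs this transformation on generated code rather than on source text.

```java
// Hypothetical source-level illustration of inlining; all names are invented for this sketch.
final class InliningSketch {
    static final class Point {
        private final int x;

        Point(int x) {
            this.x = x;
        }

        // Tiny target method: the call overhead can exceed the work done inside.
        int getX() {
            return x;
        }
    }

    // Caller as written: one method call per loop iteration.
    static long sumWithCalls(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += p.getX();
        }
        return total;
    }

    // Caller after inlining: the body of getX() replaces the call.
    static long sumInlined(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += p.x;
        }
        return total;
    }

    public static void main(String[] args) {
        Point[] points = {new Point(1), new Point(2), new Point(3)};
        System.out.println(sumWithCalls(points) == sumInlined(points));
    }
}
```

Both methods compute the same result; the inlined form simply avoids entering and exiting getX() on every iteration.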
However, because of the semantics of the Java platform, the compiler cannot ever determine the entire class set. Because classes are extensible and dynamically loadable in Java, methods can be overridden by subsequent extensions to a class, and the compiler cannot know with certainty that a virtual method call will reach any particular method. In the Java programming language, a method by default can be overridden unless it is expressly defined as “final”. All “non-final” methods are assumed to be overridable.
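A short sketch shows why this matters for inlining. The classes BaseAccount and PremiumAccount and their methods are hypothetical; the point is that code inlined under the assumption that BaseAccount.fee() is the only implementation becomes incorrect once a subclass that overrides fee() appears.

```java
// Hypothetical classes illustrating why non-final methods cannot be inlined blindly.
class BaseAccount {
    // Non-final: a subclass loaded later may override this method.
    int fee() {
        return 1;
    }

    // Final: can never be overridden, so a direct call or inlining is always safe.
    final int minimumBalance() {
        return 100;
    }
}

// Stands in for a class that is loaded after the caller was compiled.
class PremiumAccount extends BaseAccount {
    @Override
    int fee() {
        return 0;
    }
}

final class OverrideSketch {
    // Correct: virtual dispatch selects the implementation from the run-time class.
    static int totalFees(BaseAccount[] accounts) {
        int total = 0;
        for (BaseAccount account : accounts) {
            total += account.fee();
        }
        return total;
    }

    // Incorrect once PremiumAccount exists: the body of BaseAccount.fee() was
    // inlined on the assumption that no class overrides it.
    static int totalFeesWronglyInlined(BaseAccount[] accounts) {
        int total = 0;
        for (int i = 0; i < accounts.length; i++) {
            total += 1; // inlined body of BaseAccount.fee()
        }
        return total;
    }

    public static void main(String[] args) {
        BaseAccount[] accounts = {new BaseAccount(), new PremiumAccount()};
        System.out.println(totalFees(accounts) + " vs " + totalFeesWronglyInlined(accounts));
    }
}
```

A call to minimumBalance(), by contrast, could be inlined safely because the final modifier rules out any override.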
Hence, all calls to non-final leaf methods must assume that the target method may be overridden at some time in the future. As a result, only time-consuming virtual method call sequences have been used to invoke them in the past. The smaller the method, the more advantage inlining gives. A one-line method would typically spend far more time entering and exiting the routine than it does executing its contents. Tests have revealed that often as much as 85% of these non-final leaf method calls could be resolved if it were possible to know with certainty that no further classes would override the methods.
What is needed is a method and apparatus for producing more optimal native code in an environment where the set of classes for a program is not unambiguously known. A need also exists for a method and apparatus that deals with cases in which knowledge of the class set is incomplete, with tolerable impact on program execution performance.
Briefly stated, the present invention involves a method, system and apparatus for generating and optimizing native code in a runtime compiler from a group of bytecodes presented to the compiler. The compiler accesses information that indicates a likelihood that a class will be a particular type when accessed by the running program. Using the accessed information, the compiler selects a code generation method from a plurality of code generation methods. A code generator generates optimized native code according to the selected code generation method and stores the optimized native code in a code cache for reuse.
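The selection step summarized above might be sketched as follows. This is an illustrative model only: the two strategies, the 0.9 threshold, the representation of the likelihood as a single number per call site, and the class name CodeGenSelector are assumptions made for this sketch and are not taken from the description. The cache here stores the chosen strategy as a stand-in for the generated native code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of choosing a code generation method from likelihood information.
final class CodeGenSelector {
    // Assumed strategies; the description only states that a plurality exists.
    enum Strategy { GUARDED_INLINE, VIRTUAL_CALL }

    // Assumed cutoff, chosen arbitrarily for illustration.
    static final double INLINE_THRESHOLD = 0.9;

    // Stand-in for the code cache: maps a call site to the strategy used to generate its code.
    private final Map<String, Strategy> codeCache = new HashMap<>();

    // Picks a strategy from the likelihood that the receiver is the expected class.
    static Strategy select(double likelihood) {
        if (!(likelihood >= 0.0 && likelihood <= 1.0)) {
            throw new IllegalArgumentException("likelihood must be in [0, 1]");
        }
        return likelihood >= INLINE_THRESHOLD ? Strategy.GUARDED_INLINE : Strategy.VIRTUAL_CALL;
    }

    // Generates code for a call site once and reuses the cached result afterwards.
    Strategy compile(String callSite, double likelihood) {
        Strategy cached = codeCache.get(callSite);
        if (cached == null) {
            cached = select(likelihood);
            codeCache.put(callSite, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        CodeGenSelector selector = new CodeGenSelector();
        System.out.println(selector.compile("Caller.run:12", 0.97));
        System.out.println(selector.compile("Caller.run:30", 0.40));
    }
}
```

In this sketch a high likelihood leads to inlined code protected by a run-time check, while a low likelihood falls back to an ordinary virtual call sequence.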