Virtual machine (VM) environments are abstract computer environments that allow for portability of software between different underlying computer architectures. The VM is itself a complex software product that is implemented upon a particular computer hardware platform and/or operating system. The VM then provides a uniform layer of abstraction between the hardware platform and any compiled software applications that will run thereon. Virtual machines are essential for the portability of certain technologies, including Java programs. The Java Virtual Machine (JVM) allows compiled Java programs to be run on the virtual machine or JVM, independently of whatever underlying hardware or operating system is used. Examples of currently available JVM products include the Sun Java Virtual Machine from Sun Microsystems, Inc., and the JRockit Virtual Machine from BEA Systems, Inc.
A real CPU understands and executes instructions that are native to that CPU (commonly called native code). In comparison, a virtual machine understands and executes virtual machine instructions (commonly called bytecode). A virtual machine almost always runs on a real CPU executing native code. The core of a virtual machine is normally implemented in a language such as C, which is then compiled to native code using an OS/CPU-compatible compiler.
A virtual machine can implement different strategies for executing the bytecode. If the virtual machine analyzes each bytecode separately and does this every time the same bytecode is executed, then the virtual machine is said to be an interpreter. If instead the virtual machine translates the bytecode into native code once, and then the native code is used every time the same bytecode is executed, then the virtual machine is said to be a just-in-time compiler (commonly called a JIT).
Some virtual machines contain both an interpreter and a JIT. In the case of Java Virtual Machines, the Sun Java Virtual Machine will initially use the interpreter when executing Java bytecode. When the Sun JVM subsequently detects bytecode that is executed often (commonly called a hot spot in the program), it will compile that part of the bytecode into native code. In contrast, the JRockit Virtual Machine will never interpret the Java bytecode. Instead, the JRockit JVM will always compile it to native code before executing it. If JRockit detects a hot spot in the program, it will recompile that part of the bytecode, but with more code optimizations. Such compiler techniques are described in the books “Advanced Compiler Design and Implementation” by Steven S. Muchnick; “Crafting a Compiler with C” by Charles N. Fischer and Richard J. LeBlanc, Jr.; and “Compilers” by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, each of which is incorporated herein by reference.
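Hot spot detection of the kind described above can be approximated with a per-method invocation counter. The sketch below is an assumption-laden illustration: the `Method` class, the `HOT_THRESHOLD` value, and the two lambdas standing in for the interpreted and compiled versions are all hypothetical, and production JVMs use more elaborate heuristics.

```python
HOT_THRESHOLD = 3  # assumed value, chosen only to keep the demo short

class Method:
    """A method with a slow (interpreted) and a fast (compiled) version."""

    def __init__(self, interpreted, compiled):
        self.interpreted = interpreted  # stands in for the interpreter path
        self.compiled = compiled        # stands in for generated native code
        self.calls = 0
        self.use_compiled = False

    def invoke(self, *args):
        self.calls += 1
        # Once the method has been called often enough, treat it as a
        # hot spot and switch to the compiled version from then on.
        if not self.use_compiled and self.calls > HOT_THRESHOLD:
            self.use_compiled = True
        impl = self.compiled if self.use_compiled else self.interpreted
        return impl(*args)

trace = []  # records which version handled each call
square = Method(
    interpreted=lambda n: trace.append("interp") or n * n,
    compiled=lambda n: trace.append("native") or n * n,
)
results = [square.invoke(4) for _ in range(5)]
```

With the assumed threshold of 3, the first three calls take the interpreted path and the remaining calls take the compiled one. A compile-only design such as the one attributed to JRockit above would instead start on a quickly compiled version and use the counter to trigger recompilation with more optimizations.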
Java bytecode is not immediately usable as a high-level representation of the compiled application since the Java bytecode was not originally designed for this purpose. However, it is possible to transform the Java bytecode into a high-level intermediate representation (HIR) of the application suitable for a compiler because the Java bytecode is not as low-level as native machine code, and because most Java bytecode is generated with the same compiler (the javac compiler from Sun Microsystems). Unfortunately, bytecode obfuscators are sometimes used, which makes it difficult to automatically extract a proper HIR from the Java bytecode and also makes the compiled code less efficient.
The HIR contains trees with expressions that in turn contain subexpressions and which are evaluated recursively. Optimizations can be applied to the HIR, for example the use of pattern matching to detect common compiler-generated idioms and to reduce these into simpler constructs. Standard compiler techniques then transform the HIR into a medium-level intermediate representation (MIR). Unlike the HIR, the MIR cannot contain expressions within expressions. The HIR to MIR transform flattens the trees and inserts variables for storage of the results of evaluated subexpressions. Most optimizations are performed on the MIR. For example, the MIR can be transformed into SSA (Static Single Assignment) form, where a variable is only assigned once, and as a result the number of variables increases drastically. However, many optimizations are easy to perform on SSA-MIR.
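The flattening step from HIR to MIR can be sketched as a recursive walk that emits one flat instruction per tree node and introduces a fresh temporary for each subexpression result. The tuple encoding of expressions and instructions, the `flatten` helper, and the `t1`, `t2`, ... naming are assumptions made for this illustration only.

```python
from itertools import count

def flatten(expr, out, temps):
    """Lower a nested expression tree into flat three-address instructions.

    expr is either a leaf (a variable name or an integer constant) or a
    tuple (op, left, right). Instructions are appended to `out` as
    (op, src1, src2, dest). Returns the operand that holds expr's value.
    """
    if not isinstance(expr, tuple):
        return expr  # leaves need no instruction
    op, left, right = expr
    lhs = flatten(left, out, temps)
    rhs = flatten(right, out, temps)
    dest = f"t{next(temps)}"  # fresh temporary for this subexpression
    out.append((op, lhs, rhs, dest))
    return dest

# HIR: (a + b) | (c * 4), with expressions nested inside expressions.
hir = ("OR", ("ADD", "a", "b"), ("MUL", "c", 4))
mir = []
result = flatten(hir, mir, count(1))

for op, lhs, rhs, dest in mir:
    print(f"{op} {lhs}, {rhs} -> {dest}")
```

No instruction in the resulting list contains another expression as an operand, and because every temporary is assigned exactly once, this straight-line output already satisfies the single-assignment property mentioned above.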
Finally, the MIR is transformed into a platform dependent low-level intermediate representation (LIR), where the limitations of the target CPU affect the opcodes. Since the compiler source code for optimizations performed on the HIR and the MIR is reused for all platforms, implementors delay the transformation into platform dependent code as long as possible for maximum source code reuse. Once the compiler has reached the LIR, all further optimizations are supposed to be tailored for each platform. If there are optimizations that are similar between platforms, this leads to source code duplication and less effective source code development and maintenance. This abstraction barrier between platform independent and platform dependent code has been beneficial for traditional compiler design, which has focused on C compilers (and compilers for similar languages). However, when compiling virtual opcodes for a JVM to different architectures, it turns out that the watertight abstraction barrier can be a problem.
The following example is a MIR representation of a typical 64-bit OR bit operation with a variable and a constant:
OR x, 0x0000000100000000L→z
The above operation first ORs the variable x with the large constant, and then stores the result in variable z. The MIR optimizer can detect obvious cases where the constant is zero and remove the operation altogether, but this is not the case here. However, if one assumes that the system is operating on a platform which only supports 32-bit registers and operations, then the transformation from MIR to LIR will split the OR into two 32-bit OR operations, and the two variables x and z will be split into four variables x_hi, x_lo, z_hi and z_lo. The constant will also need to be split. One might also be using a platform compatible with the Intel x86 CPU, which requires that the destination is the same as the source. This also introduces the need for temporary variables, for example the variables tmp and tmp2:
1 MOV x_hi→tmp
2 OR tmp, 0x00000001→tmp
3 MOV tmp→z_hi
4 MOV x_lo→tmp2
5 OR tmp2, 0x00000000→tmp2
6 MOV tmp2→z_lo
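The lowering that produces the six-step listing can be sketched as follows. The `lower_or64` helper and the tuple encoding `(op, src, immediate, dest)` are hypothetical and exist only for this illustration; the names x_hi, x_lo, z_hi, z_lo, tmp and tmp2 follow the text.

```python
MASK32 = 0xFFFFFFFF

def lower_or64(src, const, dst):
    """Lower the 64-bit MIR `OR src, const -> dst` to 32-bit LIR.

    Assumes a target with 32-bit registers whose OR instruction writes
    its result back into its first source operand.
    """
    hi = (const >> 32) & MASK32  # upper 32 bits of the constant
    lo = const & MASK32          # lower 32 bits of the constant
    lir = []
    for half, imm, tmp in (("hi", hi, "tmp"), ("lo", lo, "tmp2")):
        lir.append(("MOV", f"{src}_{half}", None, tmp))
        lir.append(("OR", tmp, imm, tmp))  # destination equals source
        lir.append(("MOV", tmp, None, f"{dst}_{half}"))
    return lir

lir = lower_or64("x", 0x0000000100000000, "z")
for step, (op, a, imm, dest) in enumerate(lir, 1):
    operands = a if imm is None else f"{a}, {imm:#010x}"
    print(step, op, operands, "->", dest)
```

The low half of the constant is zero, so the lowering itself creates an `OR` with a zero operand that did not exist in the MIR. This is the construct the following paragraphs are concerned with.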
In the above example, because step 5 involves a zero constant, the step is redundant and can be removed by the same kind of optimization normally performed on the MIR. This type of optimization is referred to as a strength reduction optimization.
The next step in the optimization process would be to merge steps 4 and 6 into a single MOV x_lo→z_lo. This type of optimization is referred to as a copy propagation optimization.
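Both clean-up steps can be sketched as small passes over the six-step LIR. The tuple encoding `(op, src, immediate, dest)` and the helper names `remove_or_zero` and `merge_moves` are assumptions made for this illustration. The move-merging pass also assumes that the intermediate temporary is not read again later, which holds for this listing but which a real compiler would have to establish through liveness analysis.

```python
def remove_or_zero(lir):
    """Drop `OR t, 0 -> t`, which leaves t unchanged."""
    return [ins for ins in lir
            if not (ins[0] == "OR" and ins[2] == 0 and ins[1] == ins[3])]

def merge_moves(lir):
    """Merge adjacent `MOV a -> t; MOV t -> b` into `MOV a -> b`.

    Assumes t is dead after the second MOV (not checked here).
    """
    out = []
    for ins in lir:
        if (out and ins[0] == "MOV" and out[-1][0] == "MOV"
                and out[-1][3] == ins[1]):
            prev = out.pop()
            out.append(("MOV", prev[1], None, ins[3]))
        else:
            out.append(ins)
    return out

# The six-step LIR from the text.
lir = [
    ("MOV", "x_hi", None, "tmp"),
    ("OR", "tmp", 0x00000001, "tmp"),
    ("MOV", "tmp", None, "z_hi"),
    ("MOV", "x_lo", None, "tmp2"),
    ("OR", "tmp2", 0x00000000, "tmp2"),
    ("MOV", "tmp2", None, "z_lo"),
]
optimized = merge_moves(remove_or_zero(lir))
```

After both passes the low half is reduced to a single `MOV x_lo -> z_lo`, while the high half keeps its three instructions because its constant is non-zero. Neither pass refers to anything specific to a 32-bit or x86 target, which illustrates why reimplementing such passes separately for each platform amounts to duplicated effort.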
A traditional compiler design must either reimplement the strength reduction and the copy propagation optimizations in the platform dependent layer for each supported platform (which involves unnecessary code duplication), or else ignore strength reduction and copy propagation in the LIR (with the result being reduced efficiency).
In some instances it might be possible to create code that can work on any platform dependent LIR. However, this is just a workaround for the fundamental problem: generating platform dependent code introduces new constructs that are suitable for higher-level optimizations. This has not been a large problem for traditional C compilers, since their fundamental variable types are adapted to fit the platform for which they are compiling. For example, an “int” can be 32-bit on a 32-bit platform and 64-bit on a 64-bit platform. However, Java bytecode targets the JVM platform, and its bit sizes are locked regardless of the underlying hardware. As such, it would be beneficial if bytecode compilers could consider platform dependencies earlier in the compilation process to better address these issues.