The increasing complexity of software programs has led to the development of a variety of tools to aid programmers and administrators in understanding the structure and functionality of their programs. Examples of these program analysis tools include debuggers, runtime execution visualizers, development environments and software quality tools. A debugger is a program that interactively traces the logic flow of a program and the contents of memory elements to locate, analyze, and correct bugs in another computer program. Runtime execution tools like profilers use processes like sampling or direct instrumentation to obtain a variety of runtime information, such as heavy memory allocation sites, CPU usage hot-spots, unnecessary object retention, and monitor contention, for a comprehensive performance analysis. A typical integrated development environment (IDE) includes a software application that provides software components to quickly prototype a new application.
A particularly useful category of runtime execution tools is dynamic instrumentation tools. Dynamic instrumentation is a technique that instruments the running application with probes to collect runtime information. It is an efficient and flexible technique, and does not depend on a priori knowledge of where to instrument or with what probes. It has been used widely by various debugging and performance tools for applications written in compiled languages such as C and C++. FIG. 2 illustrates dynamic instrumentation in such a system. The loaded binary code is instrumented with probes during runtime by the dynamic instrumenter. For simplicity of illustration, some of the details of the runtime environment are omitted, such as the additional library code loaded and executed along with the application code.
Unlike C or C++, platform independent languages like Java have compilers that generate classfiles, not binary code, from their Java source files. These classfiles are not directly executable by the system hardware, and need to be either interpreted or converted into binary code by the JIT compiler. FIG. 3A generally shows a system for operating a Java program, including the Java compiler 302, and the Java Runtime Environment (JRE) with the Java interpreter 312. During runtime, the Java class loader 310 loads classes from the classfiles 303, and the Java interpreter 312 interprets the bytecodes of the loaded classes 311. Again, for simplicity of illustration, some details of the runtime are omitted, such as the Java Class Library (JCL) typically loaded and executed along with the application classfiles.
To improve execution speed, JVMs employ just-in-time (JIT) compilation, which generates binary code from bytecodes and executes it during runtime. FIG. 3B shows the Java runtime environment with a JIT compiler 315. Working alongside the Java interpreter 312, the JIT compiler 315 compiles some of the loaded classes 311 into binary code 316, which is then executed by the hardware 317. Note that although the C/C++ compile-and-run structure in FIG. 2 and the JIT runtime structure in FIG. 3B look similar, there is a major difference. Since the compiler 202 in FIG. 2 runs before execution, any transformation by the compiler 202 is static, i.e., done before runtime. The JIT compiler 315 in FIG. 3B, however, runs during execution, and any transformation by the JIT compiler 315 is dynamic, i.e., done during runtime.
This program instrumentation is an example of a broader class of tools known as program transformation (PT) tools. Runtime program transformation tools such as dynamic instrumentation insert probes into an application in such a manner that during execution, each probe generates dynamic information on the execution state or an event of the application, and transforms the object code of the application dynamically while the application is running. Another example of program transformation is Aspect Oriented Programming, which restructures the program based on different aspects of the same application.
Runtime PT tools for virtual-machine (VM) applications, such as Java applications or .NET applications, transform their intermediate code, e.g., Java's bytecode or .NET's intermediate code. (For simplicity of discussion, the term “bytecode” is used below in connection with the preferred embodiment, but this should not be understood restrictively as applying only to Java's bytecode; it applies to all forms of VM intermediate code, such as the intermediate code used with .NET.) The VM executes the (transformed) bytecode by interpreting it, or by first converting it into object code and then executing the object code. In Java, the unit of bytecode processed by the VM for execution is a class file. Following the terminology of Java, class loading here means the process by which the VM reads a unit of bytecode and readies it for execution.
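By way of illustration only, the following minimal sketch shows the first thing a tool that reads a class file as a unit of bytecode might inspect: the fixed header with which every Java class file begins (a 4-byte magic number 0xCAFEBABE, followed by 2-byte minor and major version numbers, all big-endian). The class name ClassFileHeader and its parse method are hypothetical names introduced for this example and are not part of any VM or tool discussed above.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not from the text above): reads the fixed 8-byte
// header that begins every Java class file.
final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE; // magic number of the class file format

    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Parses the header from raw class file bytes.
    // ByteBuffer is big-endian by default, matching the class file format.
    static ClassFileHeader parse(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.remaining() < 8 || buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("not a Java class file");
        }
        int minor = buf.getShort() & 0xFFFF; // u2 minor_version
        int major = buf.getShort() & 0xFFFF; // u2 major_version
        return new ClassFileHeader(minor, major);
    }
}
```

A bytecode transformation tool would go on to parse the constant pool and method bodies that follow this header; the sketch stops at the header to stay short.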
However, bytecode transformation (BT) as a means of runtime program transformation has several critical problems. The first of these is the loss of adaptability of the transformation to the execution behavior, which changes constantly during execution. To adapt to the changing execution behavior, runtime tools sometimes require multiple (re-)transformations of the application during execution. In Java, however, a loaded class becomes non-writable (i.e., read-only or immutable), which makes multiple re-transformations of bytecode during execution very costly (albeit not impossible). Carrying out multiple re-transformations during execution generally involves unloading and reloading the class; a less commonly used possibility is to use an interface of the VM called JVMDI. Both of these approaches incur very high runtime overhead. Because of the read-only requirement of a loaded class and the high runtime overhead of re-transformations, bytecode transformation tools typically transform a class only once, either statically before it is loaded (e.g., ShrikeBT) or dynamically while it is being loaded (e.g., JiTi), and disallow re-transformations of the bytecode. Further, this approach only allows for transformations that can be expressed in bytecode, and is inadequate for general instrumentations. For example, it does not allow for instrumentations for gathering runtime information on object identities, synchronization states, or garbage collection because probes for them are not expressible in bytecode.
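The transform-once, load-time pattern described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: it uses the standard java.lang.instrument.ClassFileTransformer interface, which the text above does not mention and which is not claimed to be what ShrikeBT or JiTi use. The class name LoadTimeOnlyTransformer and its rewrite, install, and transformedCount methods are hypothetical, and rewrite is an identity placeholder standing in for a real bytecode rewriter.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical transformer: transforms each class at most once, while it is
// being loaded, and declines any later re-transformation.
final class LoadTimeOnlyTransformer implements ClassFileTransformer {
    // Internal names (e.g. "demo/Foo") of classes already transformed.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // A non-null classBeingRedefined means the class is already loaded:
        // returning null tells the VM to keep its bytecode unchanged.
        if (classBeingRedefined != null) {
            return null;
        }
        // className may be null for some VM-generated classes; skip those,
        // and skip any class this transformer has already handled.
        if (className == null || !seen.add(className)) {
            return null;
        }
        return rewrite(classfileBuffer);
    }

    // Placeholder for a real bytecode rewriter (assumption): returns an
    // unmodified copy, so no probes are actually inserted here.
    byte[] rewrite(byte[] original) {
        return original.clone();
    }

    int transformedCount() {
        return seen.size();
    }

    // Registration with the VM, e.g. from a Java agent's premain method.
    void install(Instrumentation inst) {
        inst.addTransformer(this);
    }
}
```

Because the transformer never revisits a class, whatever probes the rewriter inserts at load time remain fixed for the rest of the execution, which is the loss of adaptability noted above.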
The second problem is the limited expressiveness of the bytecode in collecting information on the states and events of the application during execution. Bytecode transformation manipulates a class by restructuring, inserting, or deleting part of the bytecode of the class. The result of a correctly performed bytecode transformation is a class with transformed bytecode, whose execution behavior should be expressible by a program written in the source language. This property of bytecode transformation may be desirable for certain transformations, but may be too much of a restriction on runtime instrumentations. Bytecode transformation instrumentations can observe and collect information only on the states and events that can be expressed in the source language. For example, in Java, they cannot directly collect information on states or events related to garbage collection or meta information on objects, which are not expressible in Java. They can indirectly collect such information with help from native code inside and outside the VM, but only through an interface called JNI, which incurs very high execution overhead. Furthermore, information that can be collected without JNI still needs additional execution of bytecode for the purpose of collecting, maintaining, and reporting the information. The additional execution can perturb the original execution of the application through libraries shared by the original application and the instrumentation bytecode. The perturbation by the instrumentation code can be large enough to render the collected information useless. Minimizing the perturbation would require using native code for the collection and reporting of the runtime information, which would require use of the expensive JNI.
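To make the restriction concrete, the following sketch shows, at source level, what a bytecode-inserted method-entry probe amounts to. It is an illustration under assumptions, not the output of any particular tool: the classes Probes and Workload and their methods are hypothetical. The probe can count method entries because that is expressible in Java, and it does so by executing additional bytecode that relies on the same class library (here java.util.concurrent) that the application itself may be using.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical probe runtime: everything here is ordinary Java, so it can
// record only events that Java source code could itself observe.
final class Probes {
    private static final Map<String, LongAdder> COUNTS = new ConcurrentHashMap<>();

    // Called on method entry; runs extra bytecode and allocates objects
    // through the same libraries the application uses.
    static void enter(String method) {
        COUNTS.computeIfAbsent(method, k -> new LongAdder()).increment();
    }

    static long count(String method) {
        LongAdder adder = COUNTS.get(method);
        return adder == null ? 0L : adder.sum();
    }
}

// Hypothetical application class, shown as it would behave after a
// bytecode transformation inserted an entry probe into fib.
final class Workload {
    static int fib(int n) {
        Probes.enter("Workload.fib"); // source-level equivalent of the inserted probe
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }
}
```

Calling Workload.fib(5) returns 5 and records 15 entries under "Workload.fib". Nothing in this style of probe can report, for example, when the garbage collector ran during those calls, since that event has no counterpart in the source language.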
The third problem is that some VM implementations depend on the specific bytecode construction of some classes in the libraries provided along with the VM. For example, some Java VM implementations depend on the specific bytecode construction of some Java class files in the Java runtime libraries. Some of these JVM implementations are known to crash when bytecode transformation is applied to such classes in the Java Class Library (JCL).
Thus, there is a need for a better way to dynamically transform intermediate code during runtime.