This disclosure relates generally to the field of computer software. More particularly, but not by way of limitation, it relates to a technique for injecting code into an application to be run prior to execution of the application.
Software products that monitor software applications at runtime often need to execute code before the actual application starts to execute. The primary reason is enabling instrumentation of the application's code, but other pre-application processes may also be required. A technique that enables running code prior to software application code is often referred to as “hooking” or “placing a hook.”
JAVA® software applications (JAVA is a registered trademark of Oracle America, Inc.) are required by the JAVA Virtual Machine (JVM) specifications to provide an entry point, which is a class that declares and implements the method “public static void main(String args[ ]).” This class is typically known as the “Main Class.” Its fully qualified name is provided to the JVM executable (e.g., java.exe) as a command line argument. Consequently, the JVM, which provides an object-oriented environment for JAVA programs, looks up a class file that contains the Main Class's bytecode, loads it and runs its “main( )” method, thus starting the application. Normally, the Main Class class-file will be located in the “application classpath”—a JVM variable that specifies locations on the disk where user-defined resources are located.
One hooking technique that has been used is placing a new “Imposter Main Class” with the same fully qualified name as the actual application's Main class (referred to here as the “Original Main Class”) in one of the directories specified for the classpath. As a result, the Imposter Main class will execute first, delegating execution to the Original Main class only after running its own hook code.
Class loading is the process of obtaining class bytes and transforming them into a “viable” entity in the JVM. This process is performed by subclasses of java.lang.ClassLoader. According to what is known as the “class loading delegation model,” a ClassLoader in the JVM will first ask its parent ClassLoader (if one exists) to load the requested class before trying to load the class itself. This procedure is recursive, which means that a class will be loaded at the highest possible point in the ClassLoader hierarchy.
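By way of illustration only, the delegation model can be observed with a small loader that has no class sources of its own and merely records when it is finally asked to find a class. The names DelegationDemo, RecordingLoader, and resolvedByParent are hypothetical and appear nowhere in the techniques described here; the sketch relies only on the default parent-first behavior of java.lang.ClassLoader.loadClass( ).

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationDemo {
    /** Loader with no class sources of its own; it records each request that reaches it. */
    static final class RecordingLoader extends ClassLoader {
        final List<String> askedSelf = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        // Called by the inherited loadClass() only after the parent chain has failed.
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            askedSelf.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns true when some loader in the chain (here always a parent) supplies the class. */
    static boolean resolvedByParent(RecordingLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        RecordingLoader child = new RecordingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(resolvedByParent(child, "java.lang.String")); // true
        System.out.println(child.askedSelf);                             // []
        System.out.println(resolvedByParent(child, "no.such.Type"));     // false
        System.out.println(child.askedSelf);                             // [no.such.Type]
    }
}
```

The child loader is consulted only for the class that no parent could supply, which is the behavior that causes a same-named class higher in the hierarchy to take precedence.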
“AppClassLoader” is the JVM ClassLoader “familiar” with the application classpath. Accordingly, it is the one requested to load the Main class. In compliance with the class loading delegation model, it delegates the request to its parent, “ExtClassLoader.” The latter is another JVM-internal ClassLoader whose classpath is composed of all Java ARchive (jar) files located in all directories specified by the JVM argument “java.ext.dirs.” Only after ExtClassLoader fails to load the Main class (since it is, normally, not part of its classpath) does the AppClassLoader load the Main class itself.
There are problems with the “naïve” Main class replacement described above. These problems include (a) problems with loading the original Main class, (b) problems with classes accessed by the original Main class, (c) problems accessing members (methods and fields) of the original Main class, and (d) problems with different Main classes in different processes.
Due to the delegation model described above, the Original Main class cannot be loaded in a conventional way (i.e., by invocation of any loadClass( ) overload), because that will return the previously loaded Imposter Main class. Instead, the Original Main class's bytes are typically obtained and the class is explicitly defined via a defineClass( ) method. Since the process of loading the first Main class (i.e., the Imposter Main class) is initiated using AppClassLoader, AppClassLoader is recorded in the JVM as the initiating loader of the Main class, although the defining loader is actually ExtClassLoader. Trying to use AppClassLoader for defining the Original Main class typically causes a problem in the JVM during the linkage process, and a LinkageError is thrown as a result. That is why the Original Main class is typically loaded using a new ClassLoader specially allocated for that purpose. That causes another problem, as the original ClassLoader hierarchy is completely altered.
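A minimal sketch of explicitly defining a class from its bytes in a dedicated ClassLoader is given below. It is an illustration under stated assumptions, not the implementation of any particular product: IsolatedDefineDemo, DefiningLoader, and the nested OriginalMain stand-in are hypothetical names, and the class bytes are assumed to be readable as a resource from the loader that originally loaded the class.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class IsolatedDefineDemo {
    /** Hypothetical stand-in for the Original Main class. */
    public static final class OriginalMain {
        public static void main(String[] args) {
            System.out.println("original main");
        }
    }

    /** Dedicated loader whose only job is to define a class from raw bytes. */
    static final class DefiningLoader extends ClassLoader {
        DefiningLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> define(String name, byte[] bytes) {
            // Bypasses loadClass(), so no parent is asked for this name.
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Reads the class file of the given type (assumed available as a resource). */
    static byte[] classBytes(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines a second runtime class with the same name in a freshly allocated loader. */
    static Class<?> defineInFreshLoader(Class<?> type) {
        DefiningLoader loader = new DefiningLoader(type.getClassLoader());
        return loader.define(type.getName(), classBytes(type));
    }

    public static void main(String[] args) {
        Class<?> copy = defineInFreshLoader(OriginalMain.class);
        System.out.println(copy.getName().equals(OriginalMain.class.getName()));
        System.out.println(copy == OriginalMain.class);
        System.out.println(copy.getClassLoader());
    }
}
```

The resulting class shares its name with the class already loaded but has a different defining loader, so the JVM treats the two as distinct classes in distinct runtime packages.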
All classes and class-members with package-private access that are accessed by the Original Main class (e.g., the Original Main class's inner classes) must be loaded by the same ClassLoader that loaded the Original Main class. Otherwise, the two classes will not be seen in the same runtime package (which is defined by the package name and the defining loader) causing an IllegalAccessError. This means that all of those classes and class-members must be known in advance.
No Original Main class members (methods and fields), even public ones, are accessible to the rest of the application: any request for the Original Main class will yield the Imposter Main class (again, due to the delegation model). This can be worked around for methods by holding a reference to the Original Main class in the Imposter Main class, overloading required methods and delegating execution to the Original Main class. However, this means that the Imposter Main class needs to be familiar with the Original Main class. There is no generic solution for the Original Main class's fields.
In many cases, different Main classes can be used in different processes running different parts of the same application's code. If a process invoked by a Main class that is not overridden by an Imposter Main class uses a Main class that is overridden by an Imposter Main class, the process typically receives the Imposter Main class instead of the Original Main class.
In addition to the class replacement technique described above, other techniques have been developed for injecting code into an application to run prior to the original Main class.
In one conventional technique, the command line is changed to run a different Main class. The new Main class then executes hook code and then runs the “real” application's Main class. In this technique, instead of transferring the Main class name to the executable as a JVM parameter, it is transferred as a program parameter to the Main class that performs the hook.
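The command-line substitution just described might be sketched as follows. HookLauncher, runHook, and SampleApp are hypothetical names introduced only for illustration; the first program parameter is assumed to carry the fully qualified name of the real Main class, and the remaining parameters are passed through to it unchanged.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HookLauncher {
    /** Order of events, recorded so that the example can be checked. */
    static final List<String> events = new ArrayList<>();

    /** Hypothetical stand-in for the real application's Main class. */
    public static final class SampleApp {
        public static void main(String[] args) {
            events.add("app" + Arrays.toString(args));
        }
    }

    /** Placeholder for whatever the hook must do before the application starts. */
    static void runHook() {
        events.add("hook");
    }

    /** args[0] names the real Main class; the remaining entries are its own arguments. */
    static void launch(String[] args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("expected: <main-class> [args...]");
        }
        runHook();
        String[] appArgs = Arrays.copyOfRange(args, 1, args.length);
        try {
            Method main = Class.forName(args[0]).getMethod("main", String[].class);
            // The cast keeps the array from being spread as varargs.
            main.invoke(null, (Object) appArgs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot start " + args[0], e);
        }
    }

    public static void main(String[] args) {
        launch(args);
    }
}
```

Under these assumptions, a command line such as "java com.example.RealMain a b" would have to be rewritten as "java HookLauncher com.example.RealMain a b", where com.example.RealMain is a placeholder for the application's own Main class.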
In another conventional technique, a JAVA Agent is employed to run code prior to any class loading. This JVM-inherent hooking mechanism is not available in all versions of the JAVA development kit (JDK), and it requires modifying the command line that runs the JAVA executable.
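For illustration, an agent of this kind could take roughly the following form, using the standard java.lang.instrument API. The names HookAgent and RecordingTransformer are hypothetical, and the transformer shown only records class names and leaves the bytecode unchanged; a real agent class is usually declared public in its own source file and packaged in a jar whose manifest names it in a Premain-Class entry.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class HookAgent {
    /** Records the internal name of every class offered for transformation. */
    static final class RecordingTransformer implements ClassFileTransformer {
        final Set<String> seen = ConcurrentHashMap.newKeySet();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null) {
                seen.add(className); // e.g. "com/example/Foo" (slash-separated)
            }
            return null; // null tells the JVM that the class bytes are unchanged
        }
    }

    static final RecordingTransformer TRANSFORMER = new RecordingTransformer();

    /** Invoked by the JVM before the application's main method when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(TRANSFORMER);
    }
}
```

Such an agent is activated by adding an option of the form "-javaagent:hook.jar" (the jar name being a placeholder) to the command line that starts the JVM, which is the command line modification noted above.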
Some JVMs expose a JVM Profiler Interface (JVMPI), intended for vendors to develop profilers that work in conjunction with that JVM. Using this application-programming interface (API) requires specifying the name of the profiler agent and the options to the profiler agent through a command line option to the JVM. Using this API opens a window in which to execute code, not necessarily written in the JAVA language, prior to application execution.
These techniques require editing the command line for the application, which can be a very limiting restriction for products that are designed for use in unknown and extremely variable environments. For example, in order to place the hook, one needs to locate every script that runs the application and edit each of them correctly. The hook will not be active when running any script that was created after the hook was installed. In production environments, changing the command line may require privileges that are highly restricted and unavailable to normal users. In addition, modifying the command line may be impossible when the JAVA environment is invoked by binary code such as an .exe file. Finally, a user may choose to run the application explicitly from the command line, bypassing the hook.
Thus, a better technique for hooking an application to provide code to run prior to execution of the Main class has been desired.