Programs written in the JAVA programming language (JAVA is a trademark of Sun Microsystems Inc) are generally run in a virtual machine environment, rather than directly on hardware. Thus a JAVA program is typically compiled into byte-code form, and then interpreted by a JAVA virtual machine (JVM) into hardware commands for the platform on which the JVM is executing. The JVM itself is an application running on the underlying operating system. An important advantage of this approach is that JAVA applications can run on a very wide range of platforms, providing of course that a JVM is available for each platform.
JAVA is an object-oriented language. Thus a JAVA program is formed from a set of class files having methods that represent sequences of instructions (somewhat akin to subroutines). A hierarchy of classes can be defined, with each class inheriting properties (including methods) from those classes which are above it in the hierarchy. For any given class in the hierarchy, its descendants (i.e. below it) are called subclasses, whilst its ancestors (i.e. above it) are called superclasses.
At run-time, classes are loaded into the JVM by one or more class loaders, which themselves are organised into a hierarchy. Objects can then be created as instantiations of these class files. One JAVA object can call a method in another JAVA object. In recent years JAVA has become very popular, and is described in many books, for example “Exploring Java” by Niemeyer and Peck, O'Reilly & Associates, 1996, USA, and “The Java Virtual Machine Specification” by Lindholm and Yellin, Addison-Wesley, 1997, USA.
The standard JVM architecture is generally designed to run only a single application, although this can be multi-threaded. In a server environment used for database transactions and such-like, each transaction is typically performed as a separate application, rather than as different threads within an application. This is to ensure that every transaction starts with the JVM in a clean state. In other words, a new JVM is started for each transaction (i.e. for each new JAVA application). Unfortunately however this results in an initial delay in running the application (the reasons for this will be described in more detail later). The overhead due to this frequent starting and then stopping a JVM as successive transactions are processed is significant, and seriously degrades the scalability of JAVA server solutions.
Various attempts have been made to mitigate this problem. EP-962860-A describes a process whereby one JVM can fork into a parent and a child process, this being quicker than setting up a fresh JVM. The ability to run multiple processes in a JAVA-like system, thereby reducing overhead per application, is described in “Processes in KaffeOS: Isolation, Resource Management, and Sharing in Java” by G Back, W Hsieh, and J Lepreau (see: http://www.cs.utah.edu/flux/papers/kaffeos-osdi00/main.html). Another approach is described in “Oracle JServer Scalability and Performance” by Jeremy Litzt, July 1999 (see: http://www.oracle.com/database/documents/jserver_scalability_and_performance_twp.pdf). The JServer product available from Oracle Corporation, USA, supports the concept of multiple sessions (a session effectively representing a transaction or application). Resources such as read-only bytecode information are shared between the various sessions, but each individual session appears to its client to be a dedicated conventional JVM.
U.S. patent application Ser. No. 09/304,160, filed 30 Apr. 1999 (“A long Running Reusable Extendible Virtual Machine”), assigned to IBM Corporation (IBM docket YOR9-1999-0170), discloses a virtual machine (VM) having two types of heap, a private heap and a shared heap. The former is intended primarily for storing application classes, whilst the latter is intended primarily for storing system classes and, as its name implies, is accessible to multiple VMs. A related idea is described in “Building a Java virtual machine for server applications: the JVM on OS/390” by Dillenberger et al, IBM Systems Journal, Vol 39/1, January 2000. Again this implementation uses a shared heap to share system and potentially application classes for reuse by multiple workers, with each worker JVM also maintaining a private or local heap to store data private to that particular JVM process.
One of the complications with extending the JVM in this manner relates to class loader type constraints. These type constraints identify type-safety relationships and are used to guarantee that a given named class will resolve to the same class object whenever there exists an interface dependency between two classes loaded by different class loaders. This is important in terms of the security of the JVM to prevent class spoofing (in which for example an attacker might try to supplant a system class with a rogue version to gain improper control of the JVM or application).
Thus according to the formal JAVA specification, it is possible that when two different class loaders L1, L2 initiate loading of a class or interface denoted by N, the name N may denote a different class or interface in each loader. In other words, the general rule in JAVA is that class name is unique to a given class loader, but not across the set of class loaders (or put another way, the combination of class name and class loader is unique).
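The rule that a class is identified by the combination of class name and class loader can be sketched in a few lines. The following is only an illustrative model, not JVM code: the class `LoadedClassKey` and its fields are hypothetical names introduced here, and two empty `URLClassLoader` instances merely stand in for the loaders L1 and L2.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Objects;

/** Hypothetical key modelling a run-time class identity: (class name, loader). */
final class LoadedClassKey {
    private final String name;
    private final ClassLoader loader;

    LoadedClassKey(String name, ClassLoader loader) {
        this.name = Objects.requireNonNull(name);
        this.loader = loader; // null would stand for the bootstrap loader
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LoadedClassKey)) {
            return false;
        }
        LoadedClassKey other = (LoadedClassKey) o;
        // Loaders are compared by identity: two distinct loader objects never match.
        return name.equals(other.name) && loader == other.loader;
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + System.identityHashCode(loader);
    }
}

class LoadedClassKeyDemo {
    public static void main(String[] args) {
        // Two unrelated loaders standing in for L1 and L2 (assumption for illustration).
        ClassLoader l1 = new URLClassLoader(new URL[0]);
        ClassLoader l2 = new URLClassLoader(new URL[0]);

        // Same name, same loader: the same class.
        System.out.println(new LoadedClassKey("N", l1).equals(new LoadedClassKey("N", l1))); // true
        // Same name, different loaders: potentially different classes.
        System.out.println(new LoadedClassKey("N", l1).equals(new LoadedClassKey("N", l2))); // false
    }
}
```

Comparing loaders by identity rather than by any name or path reflects the point made above: the name N alone says nothing about which class is meant until a loader is fixed.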
However, in the situation where a class C (loaded by L1, say) makes a symbolic reference to a field or method of another class D (loaded by L2, say), the symbolic reference includes a descriptor specifying the type of the field, or the return and argument types of the method. (The type of a field can either be primitive, such as an integer, or else represent a class structure). Any type name N mentioned in the field or method descriptor must denote the same class or interface when loaded by L1 and when loaded by L2, otherwise the expected processing will not occur. In this situation, the same class implies the same class loader/class combination.
As an example of this, consider the following JAVA code (schematic only):

    class N1 {
        public N f;
        public void test(N t) { ....... }
    }

    class N2 {
        public void fred() {
            N t;
            N1 a;
            a = new N1();
            a.test(t);
            a.f.run();
        }
    }

    class N {
        public void run() { .... }
    }

In this example, Class N1 defines a variable f which is an object, and has type N (i.e. its format is effectively defined by class N). Class N1 further defines a method “test”, which takes an argument also of type N. This method is public (i.e. accessible to other classes). Class N2 has a method “fred” which defines a variable t, again of type N, and a variable a of type N1. A new instance of the N1 class is then created and assigned to a, and then the test method in object a is called, passing variable t as a parameter. This is an example of one class making a symbolic reference to a method of another class. Note that it is important that the type definition of variable t as type N in class N1 has the same effect as the definition of variable t as type N in class N2, otherwise the argument passed into the method will not have the correct format.
Class N2 further contains an instruction to call a method “run” in class N, by referencing variable f in object a, where object a has type N1, and in N1 variable f has type N. This is an example where a class makes a symbolic reference to a field in another class, and again, it is important that in this situation class N1 and N2 both agree on the identity of class N, otherwise the field will not be properly structured. In other words, if N1 and N2 do not locate the same class N as loaded by the same class loader, then within the context of the JVM they are dealing with different classes.
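The reason agreement cannot simply be read off the symbolic reference is that a descriptor carries type names only, with no record of which loader is meant. As a small illustration, the standard `java.lang.invoke.MethodType` class can print the descriptor of a method shaped like `test` above. The helper `descriptor` is a name introduced here, and `Runnable` is used purely as a stand-in for the argument type N.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    /** Builds the JVM method descriptor string for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // A void method taking one object argument, analogous to test(N t),
        // with Runnable standing in for N.
        System.out.println(descriptor(void.class, Runnable.class)); // (Ljava/lang/Runnable;)V
    }
}
```

The printed string names the argument type but not its loader, so the referencing class and the referenced class each resolve that name through their own loader.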
To ensure that such agreement is achieved, loading constraints of the form N^L1 = N^L2 (where N^L denotes the class named N as seen by initiating loader L) are imposed by the JVM during the class loading procedure. Such a loading constraint is violated if, and only if, all four of the following conditions hold:
(a) a loader L1 has been recorded by the JVM as an initiating loader of a class C named N^L1;
(b) a loader L2 has been recorded by the JVM as an initiating loader of a class C′ named N^L2;
(c) the set of imposed constraints implies N^L1 = N^L2; and
(d) class C is not identical to class C′.
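The four conditions combine as a simple conjunction, which can be written out as a predicate. This is a minimal sketch under stated assumptions: the method `violated` and its parameters are hypothetical, a `null` argument models "no class recorded for that loader", and `String.class` and `Integer.class` merely stand in for C and C′.

```java
class LoadingConstraintPredicate {
    /**
     * Returns true only when all four conditions hold.
     * c1: class recorded for initiating loader L1, or null if none is recorded (condition a)
     * c2: class recorded for initiating loader L2, or null if none is recorded (condition b)
     * constrained: whether the imposed constraints imply that both loaders must agree (condition c)
     */
    static boolean violated(Class<?> c1, Class<?> c2, boolean constrained) {
        // The final term is condition (d): the two classes are not identical.
        return c1 != null && c2 != null && constrained && c1 != c2;
    }

    public static void main(String[] args) {
        System.out.println(violated(String.class, Integer.class, true));  // true
        System.out.println(violated(String.class, String.class, true));   // false
        System.out.println(violated(null, Integer.class, true));          // false
        System.out.println(violated(String.class, Integer.class, false)); // false
    }
}
```

In particular, a constraint between two loaders is not violated while either loader has yet to resolve the name; it only becomes checkable once both have.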
(There is an important distinction between the “initiating” loader and the “loading” loader. The “loading” loader is the one that actually creates a class. The “initiating” loader is the loader that was asked to resolve a class, and will typically first ask a parent loader if it is able to load the class before attempting to do so itself. For example, in a simple model with an application loader and the system loader, let us assume that a class Fred loaded by the application loader contains a reference to the class “java.lang.String”. The class Fred requests that the class String be resolved by the loader that created it (i.e. the application loader). The application loader first asks the system loader if it is able to resolve the class “java.lang.String”, which generally it will be able to do, as it has most likely been loaded already by this loader. In this scenario, the application loader is the “initiating” loader, as it initiated the load request. The system loader is the “loading” loader, since it physically loaded the String class. For class C to be equal to class C′, we must have the same class name loaded by the same “loading” loader, since only this will guarantee uniqueness within the JVM.)
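The String example can be observed directly with the standard `ClassLoader` API. The sketch below assumes a conventional JVM; the helper `initiateLoad` is a name introduced here. Note a terminology mismatch: the loader called the "application loader" in the text is the one returned by `ClassLoader.getSystemClassLoader()`, while the loader that actually creates `java.lang.String` is the bootstrap loader, which `Class.getClassLoader()` reports as `null`.

```java
class DelegationDemo {
    /** Asks the given loader to initiate loading of the named class. */
    static Class<?> initiateLoad(ClassLoader initiating, String name) {
        try {
            return initiating.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class not found: " + name, e);
        }
    }

    public static void main(String[] args) {
        // The loader that loads application classes acts as the initiating loader.
        ClassLoader app = ClassLoader.getSystemClassLoader();
        Class<?> viaApp = initiateLoad(app, "java.lang.String");

        // Delegation yields the one and only String class...
        System.out.println(viaApp == String.class);          // true
        // ...whose "loading" loader is the bootstrap loader, reported as null.
        System.out.println(viaApp.getClassLoader() == null); // true
    }
}
```

The request went in through one loader, but the class that came back belongs to another, which is exactly the initiating versus "loading" distinction.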
In traditional JVMs, the constraint table is managed as a large global table, and constraints between class loaders are managed as if they were peer class loaders instead of having a parent-child relationship. A constraint entry in the table consists of the class name, a resolved class object, and a list of 2 or more class loader objects associated with the constraint. The constraint table may have multiple entries for a given class name, and occasionally, two constraint entries may need to be merged as the class loading hierarchy becomes more complicated.
When a new constraint is entered into the table, the class object field is initially unresolved (null). Then, when a class loader updates its internal class cache with a new class object, it first verifies the resolved class against the constraint table (the process of resolution is discussed in more detail later). If the constraint class is still unresolved, the loader will update the constraint entry with the resolved class object. If the constraint entry has already been resolved, then the resolved constraint class must match the class about to be added to the class cache.
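The table behaviour described above can be sketched as follows. This is a simplified model and not the code of any actual JVM: the names `ConstraintEntry`, `GlobalConstraintTable`, `addConstraint` and `checkAndRecord` are hypothetical, loaders and classes are represented by arbitrary objects compared by identity, and the lookup in each loader's own class cache is omitted. A single lock guards the whole table, mirroring the single-table design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** One entry: a class name, the resolved class (null until resolved), and the loaders it binds. */
final class ConstraintEntry {
    final String className;
    Object resolvedClass; // null means "not yet resolved"
    final Set<Object> loaders =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    ConstraintEntry(String className) {
        this.className = className;
    }
}

/** Hypothetical single global constraint table; every operation takes the same lock. */
final class GlobalConstraintTable {
    private final List<ConstraintEntry> entries = new ArrayList<>();

    /** Records that both loaders must agree on the named class, merging entries if needed. */
    synchronized void addConstraint(String name, Object l1, Object l2) {
        ConstraintEntry e1 = find(name, l1);
        ConstraintEntry e2 = find(name, l2);
        if (e1 == null && e2 == null) {
            ConstraintEntry e = new ConstraintEntry(name); // class field starts unresolved
            e.loaders.add(l1);
            e.loaders.add(l2);
            entries.add(e);
        } else if (e1 == null) {
            e2.loaders.add(l1);
        } else if (e2 == null) {
            e1.loaders.add(l2);
        } else if (e1 != e2) {
            // Two existing entries for the same name now become one.
            if (e1.resolvedClass != null && e2.resolvedClass != null
                    && e1.resolvedClass != e2.resolvedClass) {
                throw new LinkageError("loading constraint violated for " + name);
            }
            if (e1.resolvedClass == null) {
                e1.resolvedClass = e2.resolvedClass;
            }
            e1.loaders.addAll(e2.loaders);
            entries.remove(e2);
        }
    }

    /** Called before a loader caches a class; returns false if that would break a constraint. */
    synchronized boolean checkAndRecord(String name, Object loader, Object loadedClass) {
        ConstraintEntry e = find(name, loader);
        if (e == null) {
            return true; // no constraint mentions this loader and name
        }
        if (e.resolvedClass == null) {
            e.resolvedClass = loadedClass; // first resolution fixes the class for the entry
            return true;
        }
        return e.resolvedClass == loadedClass; // later resolutions must agree
    }

    /** Linear search of the table, as in the single-table approach. */
    private ConstraintEntry find(String name, Object loader) {
        for (ConstraintEntry e : entries) {
            if (e.className.equals(name) && e.loaders.contains(loader)) {
                return e;
            }
        }
        return null;
    }
}
```

A possible usage: after `addConstraint("N", l1, l2)`, the first `checkAndRecord("N", l1, c)` resolves the entry to `c`, and a later `checkAndRecord("N", l2, d)` succeeds only if `d` is the same object as `c`.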
This single table approach has a number of deficiencies. Creating and checking the constraints requires a lot of table searching. Adding or checking a constraint between two class loaders requires a search for the class in the class cache of each class loader, plus a search of the constraint table for each of the loaders. This search is performed every time a class name is encountered in a constraint-sensitive situation. For example, every method or field reference to the class java/lang/String in a class loaded by the application loader will perform this search for every occurrence of the class name. Because the loader constraint table entries are subject to merging and reallocation, it is not possible to maintain direct references to the loader cache entries.
In addition, constraints are tracked and maintained for non-dependent situations. If the two class loaders are peers in the class loading hierarchy, they cannot have cross loading relationships. However, the single table approach creates a constraint between the loaders whenever they have a common loading relationship, which complicates the management process.
The algorithms do not scale very well on a multi-processor system. The complex updates required to the single constraint table require monitor protection for both reading and writing of the constraint information. Since there is a single table used for the entire JVM, this lock must lock out all other constraint resolution activity in the JVM. Likewise, the algorithms do not translate well to a shared classes environment as discussed above. In a shared classes environment, type constraint relationships must hold true across all members of the JVM set. However, the type constraint tables resolve relationships between instances of class loaders, which may only exist in the local client JVM context. An algorithm that can efficiently incorporate the global constraint information into the local JVM context is desired, preferably one that involves as little cross-JVM locking as possible.