The present invention relates to sharing classes between first and second virtual machines, and in particular to a way in which a class loaded into a first virtual machine can be shared with a second virtual machine.
Programs written in the Java programming language (Java is a trademark of Sun Microsystems Inc) are generally run in a virtual machine environment, rather than directly on hardware. Thus a Java program is typically compiled into byte-code form, and then interpreted by the Java virtual machine (JVM) into hardware commands for the platform on which the JVM is executing. The JVM itself is an application running on the underlying operating system. An important advantage of this approach is that Java applications can run on a very wide range of platforms, providing of course that a JVM is available for each platform.
Java is an object-oriented language. Thus a Java program is formed from a set of class files having methods that represent sequences of instructions (somewhat akin to subroutines). A hierarchy of classes can be defined, with each class inheriting properties (including methods) from those classes (termed superclasses) which are above it in the hierarchy. At run-time objects are created as instantiations of these class files, and indeed the class files themselves are effectively loaded as objects. One Java object can call a method in another Java object. In recent years Java has become very popular, and is described in many books, for example "Exploring Java" by Niemeyer and Peck, O'Reilly and Associates, 1996, USA, and "The Java Virtual Machine Specification" by Lindholm and Yellin, Addison-Wesley, 1997, USA.
The standard JVM architecture generally supports only a single application, although this can be multi-threaded. In a server environment used for database transactions and such-like, each transaction is typically performed as a separate application, rather than as different threads within an application. This is to ensure that every transaction starts with the JVM in a clean state. In other words, a new JVM is started for each transaction (i.e. for each new Java application). Unfortunately however this results in an initial delay in running the application (the reasons for this will be described in more detail later). The overhead due to this frequent starting and then stopping of JVMs as successive transactions are processed is therefore significant, and seriously degrades the scalability of Java server solutions.
Various attempts have been made to mitigate this problem. EP-962860-A describes a process whereby one JVM can fork into a parent and a child process, this being quicker than setting up a fresh JVM. Another approach is described in "Oracle JServer Scalability and Performance" by Jeremy Litzt, July 1999 (see: oracle.com/database/documents/jserver_scalability_and_performance_twp.pdf). Thus the JServer product available from Oracle Corporation, USA, supports the concept of multiple sessions (a session effectively representing a transaction or application), each session including a JServer session. Resources such as read-only bytecode information are shared between the various sessions, but each individual session appears to its JServer client to be a dedicated conventional JVM.
U.S. patent application Ser. No. 09/304160, filed Apr. 30, 1999 ("A Long Running Reusable Extendible Virtual Machine"), assigned to IBM Corporation, discloses a virtual machine (VM) having two types of heap, a private heap and a shared heap. The former is intended primarily for storing application classes, whilst the latter is intended primarily for storing system classes and, as its name implies, is accessible to multiple VMs. The idea is that as each new VM is launched, it can access system classes already in the shared heap, without having to reload them, relink them, and so on, thereby saving significantly on start-up time. The shared memory can also be used for storing application classes that will be used by multiple VMs, with the private heap then being used for object instances specific to a particular application running on a VM. At termination of a VM its private heap is deleted, but the shared heap persists and remains available to other VMs.
A related idea is described in "Building a Java virtual machine for server applications: the JVM on OS/390" by Dillenberger et al, IBM Systems Journal, Vol 39/1, January 2000. This describes two types of JVM, a resource-owning JVM which loads and resolves necessary system classes, and subsequent "worker" JVMs which can reuse the resolved classes. Again this implementation uses a shared heap to share system and potentially application classes for reuse by multiple workers, with each worker JVM also maintaining a private or local heap to store data private to that particular JVM process. It is suggested that worker JVMs use a common class loader to share name spaces across a set of JVMs.
However, one problem here is that a worker JVM may potentially delete all references to a class loader, which is the owner of some or all of the commonly loaded classes. This makes the loader unreachable, so that it and its classes may be garbage collected, for subsequent reloading and initialisation. Unfortunately, this is not compatible with the fact that the class loader and the classes it has loaded are effectively shared between a number of JVMs.
Accordingly, the present invention provides a method of operating a system including first and second virtual machines and having a shared memory accessible to both said first and second virtual machines, the method comprising the steps of:
loading a class within the first virtual machine into said shared memory, said class having sharable and non-sharable data associated therewith; loading the class into the second virtual machine by locating the class within said shared memory; forming a mirror of the class within a private memory in the second virtual machine; and completing the non-sharable data associated with the class in said mirror; and
utilising the class in the second virtual machine on the basis of the sharable data from the shared memory of the first virtual machine, and the non-sharable data from the private memory of the second virtual machine.
The invention provides a master (first) virtual machine and at least one client (second) virtual machine running in parallel on the same computer system. Certain resources from the master virtual machine can be shared with the client virtual machine, thereby greatly improving efficiency. In particular the client virtual machine does not generally have to load classes already loaded by the master virtual machine. However, in accordance with the present invention it is realised that this sharing cannot be complete because certain class properties (e.g. initialisation status, see below) may need to be set individually on each virtual machine. Thus for each class to be shared, a mirror version is created on a client virtual machine. The mirror version of the class is effectively a composite in that certain class data is stored locally on the client virtual machine, whilst other class data is actually shared with the copy of the class loaded into shared memory by the first virtual machine. It will be appreciated that overall the mirror version contains essentially the same data elements as the original class in the first virtual machine (since both are derived ultimately from the same class file). Those elements for which the master and client will contain identical data can be shared, and only a single copy in the master need be maintained. Those elements which either do contain or are liable to contain different data in the client from the master must be present locally in the mirror itself. It will be appreciated therefore that the mirror is not a strict duplicate of the class loaded into the master. Rather it is the same class loaded into a different virtual machine (and so may have different data element values, etc), but which reuses certain data from the class as loaded into the master virtual machine.
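The composite nature of the mirror can be illustrated with a minimal sketch. It is not the implementation of the invention: the names SharedClassData and ClassMirror are hypothetical, a plain byte array stands in for method code, and ordinary Java objects stand in for shared and private memory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: data identical in every VM, held once in shared memory.
final class SharedClassData {
    final String className;
    final byte[] methodCode; // stands in for (non-native) method code

    SharedClassData(String className, byte[] methodCode) {
        this.className = className;
        this.methodCode = methodCode;
    }
}

// Hypothetical model: per-VM mirror holding only data that may differ between VMs.
final class ClassMirror {
    final SharedClassData shared; // a reference into shared memory, not a copy
    final Map<String, Object> staticFields = new HashMap<>(); // private to this VM

    ClassMirror(SharedClassData shared) {
        this.shared = shared;
    }
}

final class MirrorCompositeDemo {
    public static void main(String[] args) {
        // The master loads the class once into shared memory.
        SharedClassData master =
                new SharedClassData("Account", new byte[] {0x2a, (byte) 0xb1});

        // Two clients each form their own mirror over the same shared data.
        ClassMirror clientA = new ClassMirror(master);
        ClassMirror clientB = new ClassMirror(master);
        clientA.staticFields.put("counter", 1);
        clientB.staticFields.put("counter", 7);

        // Both mirrors refer to the one copy of the method code...
        System.out.println(clientA.shared.methodCode == clientB.shared.methodCode);
        // ...while each keeps its own field values.
        System.out.println(clientA.staticFields.get("counter"));
        System.out.println(clientB.staticFields.get("counter"));
    }
}
```

In this sketch the two mirrors differ only in their locally held field values, which corresponds to the distinction drawn above between elements that can be shared and elements that must be present in the mirror itself.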
The use of the mirror or shell version means that a class which has previously been loaded into the master virtual machine can be loaded much more quickly into the client virtual machine than if it had to be loaded completely afresh. Thus firstly only the non-shared data needs to be stored into the second virtual machine, and secondly even this non-shared data can be derived from the class in the first virtual machine or produced internally by the second virtual machine (in other words, there is no need to go back to the class file somewhere in external storage). At the same time the creation and presence of the mirror class makes it transparent that the class has not in fact been loaded conventionally.
A typical configuration for the above approach is to have a single master and multiple clients. In such an arrangement, the clients generally perform most of the work, to minimise any risk of the master falling over (which would then render the shared data unavailable, and essentially invalidate all the mirror classes). It is also convenient with this approach for garbage collection to be disabled on the master virtual machine.
In the preferred embodiment the step of loading the class into the second virtual machine further includes updating a class loader cache in the second virtual machine. This further emulates conventional class loading.
Generally the first and second virtual machines both include a hierarchy of class loaders. At least one class loader can be designated a shared class loader. Each shared class loader in the class loader hierarchy in the second virtual machine has a corresponding shared class loader in the equivalent position in the class loader hierarchy in the first virtual machine. This helps to ensure conformity, for example in terms of security, between the original version of the class in the first virtual machine, and the mirror version in the second virtual machine. In order to maintain this situation, in the preferred embodiment, before creating a shared class loader in the second virtual machine, it is checked that an instance of the shared class loader does not already exist on the second virtual machine, and that there is a corresponding shared class loader in the first virtual machine.
Note that a shared class loader in the first virtual machine does not necessarily have to have a counterpart in the second virtual machine, although in this case classes loaded by this shared class loader will not be available for sharing with the second virtual machine. In addition, the master and client virtual machines can have non-shared class loaders which operate in conventional fashion.
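The two checks made before a shared class loader is created in the second virtual machine might be sketched as follows. This is an illustrative model only: SharedLoaderRegistry and SharedLoaderFactory are hypothetical names, and a string identifier is assumed to stand for a loader's position in the class loader hierarchy.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the shared class loaders known to one virtual machine,
// keyed by an identifier assumed to denote the loader's position in the hierarchy.
final class SharedLoaderRegistry {
    private final Set<String> loaders = new HashSet<>();

    boolean contains(String loaderId) {
        return loaders.contains(loaderId);
    }

    void register(String loaderId) {
        loaders.add(loaderId);
    }
}

final class SharedLoaderFactory {
    // Creates the shared loader on the client only if both checks pass.
    static boolean createOnClient(String loaderId,
                                  SharedLoaderRegistry master,
                                  SharedLoaderRegistry client) {
        if (client.contains(loaderId)) {
            return false; // an instance already exists on the client
        }
        if (!master.contains(loaderId)) {
            return false; // no corresponding shared loader on the master
        }
        client.register(loaderId);
        return true;
    }
}

final class SharedLoaderDemo {
    public static void main(String[] args) {
        SharedLoaderRegistry master = new SharedLoaderRegistry();
        SharedLoaderRegistry client = new SharedLoaderRegistry();
        master.register("application");

        System.out.println(SharedLoaderFactory.createOnClient("application", master, client));
        System.out.println(SharedLoaderFactory.createOnClient("application", master, client));
        System.out.println(SharedLoaderFactory.createOnClient("extension", master, client));
    }
}
```

Only the first call in the demonstration succeeds: the second fails because the loader already exists on the client, and the third because the master has no counterpart.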
In the preferred embodiment a shared class is loaded into the second virtual machine by walking the class loader hierarchy on the second virtual machine to determine for each class loader in the hierarchy whether it has previously loaded the class. This determination is performed (as conventionally) on the basis of said class loader cache in the second virtual machine. If the class has not been previously loaded and is to be loaded by a shared class loader, it is then determined whether the class has been loaded into shared memory on the first virtual machine (this can be done most simply by checking the class loader cache on the master). If it is determined that the class has not been loaded into shared memory on the first virtual machine, the client causes the class to be loaded into the shared memory of the first virtual machine. Once this has been performed, a mirror version of this class can then be created on the second virtual machine.
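The loading sequence just described can be sketched as below. The sketch makes several simplifying assumptions that are not taken from the text: MasterVm and ClientLoader are hypothetical names, strings stand in for class data and mirrors, the requesting loader is always the one that defines the class, and conventional loading by non-shared loaders is not modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the master and its class loader cache over shared memory.
final class MasterVm {
    private final Map<String, String> sharedCache = new HashMap<>();
    int loadCount = 0; // counts full loads, for illustration only

    String find(String loaderId, String className) {
        return sharedCache.get(loaderId + "/" + className);
    }

    String load(String loaderId, String className) {
        loadCount++;
        String data = "shared:" + className;
        sharedCache.put(loaderId + "/" + className, data);
        return data;
    }
}

// Hypothetical stand-in for a class loader in the client's hierarchy.
final class ClientLoader {
    final String id;
    final ClientLoader parent;
    final boolean shared;
    final Map<String, String> cache = new HashMap<>(); // class name -> mirror

    ClientLoader(String id, ClientLoader parent, boolean shared) {
        this.id = id;
        this.parent = parent;
        this.shared = shared;
    }

    String loadClass(String className, MasterVm master) {
        // 1. Walk the hierarchy, consulting each loader's cache on the client.
        for (ClientLoader l = this; l != null; l = l.parent) {
            String cached = l.cache.get(className);
            if (cached != null) {
                return cached;
            }
        }
        if (!shared) {
            throw new IllegalStateException("conventional loading is not modelled");
        }
        // 2. Check whether the master has already loaded the class into shared memory.
        String data = master.find(id, className);
        if (data == null) {
            // 3. If not, cause the master to load it.
            data = master.load(id, className);
        }
        // 4. Create the mirror and update the client's class loader cache.
        String mirror = "mirror-of(" + data + ")";
        cache.put(className, mirror);
        return mirror;
    }
}

final class SharedLoadDemo {
    public static void main(String[] args) {
        MasterVm master = new MasterVm();
        ClientLoader clientA = new ClientLoader("application", null, true);
        ClientLoader clientB = new ClientLoader("application", null, true);
        clientA.loadClass("Account", master);
        clientB.loadClass("Account", master);
        System.out.println(master.loadCount); // prints 1
    }
}
```

The second client finds the class already present in the master's cache, so the class is loaded in full only once however many clients subsequently request it.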
In the preferred embodiment the non-sharable data associated with the mirror version of the class includes an identifier of the class loader which loaded the class (into the second virtual machine) and an initialisation flag. The initialisation flag in the non-sharable data is set to the non-initialised state when the class is first loaded into the second virtual machine. This provides separate initialisation of the class in each virtual machine in accordance with standard environment requirements. The non-sharable data associated with the mirror version of the class further includes at least part of the method block and field data associated with the class. The exact amount of data which needs to be mirrored is dependent on precise virtual machine implementation and platform considerations. In the preferred embodiment for simplicity all of the method block and field data are mirrored, even though typically this is not strictly required for all fields. In contrast the (non-native) method code is included in the sharable data associated with the class and so only stored in the first virtual machine (i.e. no mirrored version in the second virtual machine). This is because the second virtual machine can use this method code directly from the shared memory, and does not need its own copy.
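The role of the per-mirror initialisation flag can be shown with a small sketch. InitMirror is a hypothetical name, a Runnable is assumed to stand in for the class's static initialiser, and the sketch ignores the locking and in-progress states that a real virtual machine needs during class initialisation.

```java
// Hypothetical model of the non-sharable part of a mirrored class.
final class InitMirror {
    final String className;
    final String loaderId; // loader which loaded the class into this virtual machine
    private boolean initialised = false; // starts in the non-initialised state

    InitMirror(String className, String loaderId) {
        this.className = className;
        this.loaderId = loaderId;
    }

    boolean isInitialised() {
        return initialised;
    }

    // Runs the initialiser at most once for this mirror, i.e. once per virtual machine.
    boolean initialiseIfNeeded(Runnable staticInitialiser) {
        if (initialised) {
            return false;
        }
        staticInitialiser.run();
        initialised = true;
        return true;
    }
}

final class InitMirrorDemo {
    public static void main(String[] args) {
        InitMirror inClientA = new InitMirror("Account", "application");
        InitMirror inClientB = new InitMirror("Account", "application");

        inClientA.initialiseIfNeeded(() -> System.out.println("initialising in client A"));
        // Initialisation in client A leaves the mirror in client B untouched.
        System.out.println(inClientB.isInitialised());
        inClientB.initialiseIfNeeded(() -> System.out.println("initialising in client B"));
    }
}
```

Because the flag lives in the mirror rather than in shared memory, each virtual machine initialises the class for itself even though the method code is held only once.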
Preferably the method further comprises the steps of de-referencing all the mirrors of classes loaded into the second virtual machine by a particular shared class loader on that machine; and allowing the particular shared class loader to be garbage collected on the second virtual machine. In other words a class loader and associated classes can be effectively removed from the second virtual machine, even whilst the original version of the classes may still be present in the first virtual machine.
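The independence of the client's mirrors from the originals held by the master can be modelled as follows. SharedMemoryModel and ClientVmModel are hypothetical names, and removing map entries is assumed to stand in for de-referencing; actual reclamation by a garbage collector is not simulated.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the originals owned by the master in shared memory.
final class SharedMemoryModel {
    final Set<String> loadedClasses = new HashSet<>();
}

// Hypothetical model of the mirrors held by one client virtual machine.
final class ClientVmModel {
    // shared loader identifier -> names of mirrors loaded by that loader
    private final Map<String, Set<String>> mirrorsByLoader = new HashMap<>();

    void addMirror(String loaderId, String className) {
        mirrorsByLoader.computeIfAbsent(loaderId, k -> new HashSet<>()).add(className);
    }

    // Drops the loader together with every mirror it loaded on this client.
    void releaseLoader(String loaderId) {
        mirrorsByLoader.remove(loaderId);
    }

    boolean hasMirror(String loaderId, String className) {
        Set<String> mirrors = mirrorsByLoader.get(loaderId);
        return mirrors != null && mirrors.contains(className);
    }
}

final class ReleaseLoaderDemo {
    public static void main(String[] args) {
        SharedMemoryModel master = new SharedMemoryModel();
        master.loadedClasses.add("Account");

        ClientVmModel client = new ClientVmModel();
        client.addMirror("application", "Account");
        client.releaseLoader("application");

        System.out.println(client.hasMirror("application", "Account"));
        System.out.println(master.loadedClasses.contains("Account"));
    }
}
```

After the release the client no longer holds the mirror, while the master's copy of the class is unchanged and remains available to other clients.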
It will be appreciated that all the above operations, such as for locating and creating the mirror class, for monitoring shared class loaders, and so on, are essentially transparent to the application and the class loaders (at the application level). Rather they are performed by the underlying virtual machine implementation. Thus from the perspective of the application, the dependence of the client on the master is largely hidden, and the second virtual machine behaves effectively as a standalone system.
The invention further provides a computer program product, comprising computer program instructions typically recorded onto a storage medium or transmitted over a network, for implementing the above methods.
The invention further provides a computing system including first and second virtual machines and having a shared memory accessible to both said first and second virtual machines, the system further comprising:
means for loading a class within the first virtual machine into said shared memory, said class having sharable and non-sharable data associated therewith;
means for loading the class into the second virtual machine by locating the class within said shared memory; means for forming a mirror of the class within a private memory in the second virtual machine; and means for completing the non-sharable data associated with the class in said mirror; and
means for utilising the class in the second virtual machine on the basis of the sharable data from the shared memory of the first virtual machine, and the non-sharable data from the private memory of the second virtual machine.
The invention further provides a computing system including:
a first virtual machine;
a second virtual machine;
a shared memory accessible to both said first and second virtual machines;
a private memory accessible to the second virtual machine;
at least one class loaded in the first virtual machine into said shared memory, said class having sharable and non-sharable data associated therewith; a mirror of said at least one class loaded in the second virtual machine into said private memory, said mirror including the non-sharable data associated with the at least one class;
wherein the at least one class is utilised by the second virtual machine on the basis of the sharable data from the shared memory of the first virtual machine, and the non-sharable data from the private memory of the second virtual machine.