The invention relates to a computer system supporting an object-oriented environment having storage, at least a portion of which is divided into multiple heaps.
Programs written in the Java programming language (Java is a trademark of Sun Microsystems Inc) are generally run in a virtual machine environment, rather than directly on hardware. Thus a Java program is typically compiled into byte-code form, and then interpreted by a Java virtual machine (JVM) into hardware commands for the platform on which the JVM is executing. The JVM itself is an application running on the underlying operating system. An important advantage of this approach is that Java applications can run on a very wide range of platforms, providing of course that a JVM is available for each platform.
Java is an object-oriented language. Thus a Java program is formed from a set of class files having methods that represent sequences of instructions (somewhat akin to subroutines). A hierarchy of classes can be defined, with each class inheriting properties (including methods) from those classes which are above it in the hierarchy. For any given class in the hierarchy, its descendants (i.e. below it) are called subclasses, whilst its ancestors (i.e. above it) are called superclasses. At run-time objects are created as instantiations of these class files, and indeed the class files themselves are effectively loaded as objects. One Java object can call a method in another Java object. In recent years Java has become very popular, and is described in many books, for example “Exploring Java” by Niemeyer and Peck, O'Reilly and Associates, 1996, USA, and “The Java Virtual Machine Specification” by Lindholm and Yellin, Addison-Wesley, 1997, USA.
The standard JVM architecture is generally designed to run only a single application, although this can be multi-threaded. In a server environment used for database transactions and such-like, each transaction is typically performed as a separate application, rather than as different threads within an application. This is to ensure that every transaction starts with the JVM in a clean state. In other words, a new JVM is started for each transaction (i.e. for each new Java application). Unfortunately, however, this results in an initial delay in running the application (the reasons for this will be described in more detail later). The overhead due to this frequent starting and then stopping of a JVM as successive transactions are processed is significant, and seriously degrades the scalability of Java server solutions.
Various attempts have been made to mitigate this problem. EP-962860-A describes a process whereby one JVM can fork into a parent and a child process, this being quicker than setting up a fresh JVM. The ability to run multiple processes in a Java-like system, thereby reducing overhead per application, is described in “Processes in KaffeOS: Isolation, Resource Management, and Sharing in Java” by G Back, W Hsieh, and J Lepreau.
Another approach is described in “Oracle JServer Scalability and Performance” by Jeremy Litzt, July 1999. The JServer product available from Oracle Corporation, USA, supports the concept of multiple sessions (a session effectively representing a transaction or application), each session including a JServer session. Resources such as read-only bytecode information are shared between the various sessions, but each individual session appears to its JServer client to be a dedicated conventional JVM.
U.S. patent application Ser. No. 09/304,160, filed 30 Apr. 1999, now U.S. Pat. No. 6,694,346 (“A long Running Reusable Extendible Virtual Machine”), assigned to IBM Corporation (IBM docket YOR9-1999-0170), discloses a virtual machine (VM) having two types of heap, a private heap and a shared heap. The former is intended primarily for storing application classes, whilst the latter is intended primarily for storing system classes and, as its name implies, is accessible to multiple VMs. A related idea is described in “Building a Java virtual machine for server applications: the JVM on OS/390” by Dillenberger et al., IBM Systems Journal, Vol 39/1, January 2000. Again this implementation uses a shared heap to share system and potentially application classes for reuse by multiple workers, with each worker JVM also maintaining a private or local heap to store data private to that particular JVM process.
The above documents are focused primarily on the ability to easily run multiple JVMs in parallel. A different (and potentially complementary) approach is based on a serial rather than parallel configuration. Thus it is desirable to run repeated transactions (i.e. applications) on the same JVM, since this could avoid having to reload all the system classes at the start of each application. However, one difficulty with this is that each application expects to run on a fresh, clean, JVM. There is a danger with serial re-use of a JVM that the state left from a previous transaction somehow influences the outcome of a new transaction. This unpredictability is unacceptable in most circumstances.
U.S. patent application Ser. No. 09/584,641, filed 31 May 2000 (pending) in the name of IBM Corporation (IBM docket number GB9-2000-0061), discloses an approach for providing a JVM with a reset capability. U.S. provisional application No. 60/208,268, also filed 31 May 2000 in the name of IBM Corporation (IBM docket number YOR9-2000-0359), discloses the idea of having two heaps in a JVM. One of these is a transient heap, which is used to store transaction objects that will not persist into the next transaction, whilst a second heap is used for storing objects, such as system objects, that will persist. This approach provides potentially an ongoing process throughout the running of a program.
Accordingly the invention provides a computer system providing an object-based environment, said computer system including storage, at least a portion of which is logically divided into two or more heaps in which objects can be stored, each heap being subdivided into slices of memory, said system including a two-level lookup structure for determining whether a given storage address corresponds to a particular heap, said lookup structure comprising:
a first level having one or more lookup substructures, each corresponding to a unit of memory representing a predetermined number of slices, and indicating for each of these slices the particular heap, if any, that the slice belongs to; and
a second level providing means for determining for a given memory address the first level lookup substructure that includes the slice containing that address.
In the preferred embodiment, a given memory address is processed by firstly using the second level lookup means to determine the relevant first level lookup substructure, and secondly using the determined first level lookup substructure to identify the heap, if any, that the slice containing the given memory address belongs to. This provides a very quick mechanism for determining, for any given memory address, which heap that address is in (if any). This is particularly useful, for example, in a situation where the properties of the heaps, such as their garbage collection characteristics, are different.
Preferably, storage is only added to or removed from a heap in terms of an integral number of slices, to avoid the complexity of handling part slices. Typically the size of a slice is significantly greater than the minimum size of an object on the heap, thereby reducing the overall size of the lookup structure.
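By way of illustration only, the effect of slice granularity on the size of the lookup structure can be shown with a little arithmetic, sketched here in C. The 64 KB slice size and the helper names `slices_for` and `lookup_entries` are assumptions made for this example; no particular values are fixed by the text above.

```c
#include <stdint.h>

/* Assumed slice size of 64 KB, chosen only for illustration. */
#define SLICE_SIZE 65536u

/* Number of whole slices needed to hold `bytes` of storage.
 * Storage is added to a heap only in whole slices, so the
 * request is rounded up to the next slice boundary. */
static uint64_t slices_for(uint64_t bytes)
{
    return (bytes + SLICE_SIZE - 1u) / SLICE_SIZE;
}

/* Number of lookup entries needed to cover `range` bytes of address
 * space when heap membership is recorded once per `granule` bytes. */
static uint64_t lookup_entries(uint64_t range, uint64_t granule)
{
    return range / granule;
}
```

Under these assumptions, a 4 GB address range needs 65,536 entries when membership is recorded per slice, compared with 268,435,456 entries if it were recorded per 16 bytes (a hypothetical minimum object size).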
In the preferred embodiment, each first level lookup substructure comprises a linear array of bytes, with the Nth byte in the array corresponding to the Nth slice in the first level lookup substructure, and identifying which particular heap it belongs to. Similarly the second level lookup means comprises a linear array of pointers, in which the Nth pointer in the array references the first level lookup substructure corresponding to the Nth memory unit. This configuration ensures that the relevant portion of the lookup structure can be accessed very quickly indeed.
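A minimal sketch of such a pair of arrays, and of the two-step lookup through them, is given below in C. It is not taken from the specification: the slice size (64 KB), the number of slices per memory unit (256), the 32-bit address width, the heap identifiers, and the names `unit_table`, `second_level` and `heap_of` are all assumptions chosen to make the example concrete.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes, for illustration only: 64 KB slices, 256 slices per
 * memory unit (16 MB per unit), 32-bit addresses (256 units in all). */
#define SLICE_SHIFT      16u
#define SLICES_PER_UNIT  256u
#define UNIT_SHIFT       24u
#define UNIT_COUNT       256u

/* Hypothetical heap identifiers; 0 means "not in any heap". */
enum { HEAP_NONE = 0, HEAP_A = 1, HEAP_B = 2 };

/* First level: a linear array of bytes, where the Nth byte identifies
 * the heap that owns the Nth slice of the memory unit. */
typedef struct {
    uint8_t heap_id[SLICES_PER_UNIT];
} unit_table;

/* Second level: a linear array of pointers, where the Nth pointer
 * references the first level table for the Nth memory unit. */
static unit_table *second_level[UNIT_COUNT];

/* Return the identifier of the heap containing addr, or HEAP_NONE. */
static int heap_of(uint32_t addr)
{
    /* Step 1: the high-order bits select the memory unit. */
    const unit_table *t = second_level[addr >> UNIT_SHIFT];
    if (t == NULL)
        return HEAP_NONE;   /* no table exists for this unit */

    /* Step 2: the next bits select the slice within that unit. */
    return t->heap_id[(addr >> SLICE_SHIFT) % SLICES_PER_UNIT];
}
```

With power-of-two sizes such as these, each lookup reduces to two shifts, a mask and two array reads, which is consistent with the fast access described above.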
Also in the preferred embodiment, a first level lookup substructure is only created if at least one slice corresponding thereto belongs to a heap, thereby avoiding unnecessary work and demands on storage space. A pointer in the second level array can then be set to a null value if none of the corresponding slices belongs to a heap. Note that in practice most of the second level array is expected to comprise null pointers; in other words, relatively few first level lookup substructures will be created.
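The on-demand creation of first level substructures might be sketched in C as follows. This is an illustrative example under stated assumptions rather than the implementation of the preferred embodiment: the sizes (64 KB slices, 256 slices per unit, 32-bit addresses) and the names `units`, `assign_slice` and `tables_created` are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed sizes, for illustration only. */
#define SLICE_SHIFT      16u
#define SLICES_PER_UNIT  256u
#define UNIT_SHIFT       24u
#define UNIT_COUNT       256u

/* Second level array. Static storage starts out as all null pointers,
 * meaning that no memory unit yet contains a slice owned by a heap. */
static uint8_t *units[UNIT_COUNT];

/* Record that the slice containing addr belongs to heap_id (non-zero).
 * The first level array for the unit is created only on first use.
 * Returns 0 on success, or -1 if the array cannot be allocated. */
static int assign_slice(uint32_t addr, uint8_t heap_id)
{
    uint32_t u = addr >> UNIT_SHIFT;

    if (units[u] == NULL) {
        /* calloc zero-fills, so every slice starts as "no heap". */
        units[u] = calloc(SLICES_PER_UNIT, 1);
        if (units[u] == NULL)
            return -1;
    }
    units[u][(addr >> SLICE_SHIFT) % SLICES_PER_UNIT] = heap_id;
    return 0;
}

/* Count the first level arrays that have actually been created. */
static unsigned tables_created(void)
{
    unsigned n = 0;
    for (uint32_t u = 0; u < UNIT_COUNT; u++)
        if (units[u] != NULL)
            n++;
    return n;
}
```

In this sketch, assigning any number of slices within one 16 MB unit creates a single 256-byte array, and every other entry of the second level array remains null.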
The invention further provides a method of operating a computer system providing an object-based environment, said computer system including storage, at least a portion of which is logically divided into two or more heaps in which objects can be stored, each heap being subdivided into slices of memory, said system including a two-level lookup structure for determining whether a given storage address corresponds to a particular heap, said method comprising the steps of:
providing a first lookup level having one or more lookup substructures, each corresponding to a unit of memory representing a predetermined number of slices, and indicating for each of these slices the particular heap, if any, that the slice belongs to; and
providing a second lookup level for determining for a given memory address the first level lookup substructure that includes the slice containing that address.
The invention further provides a computer program product comprising instructions encoded on a computer readable medium for causing a computer to perform the method described above. A suitable computer readable medium may be a DVD or computer disk, or the instructions may be encoded in a signal transmitted over a network from a server.
It will be appreciated that the method and computer program product of the invention will benefit from the same preferred features as the system of the invention.