1. Field of the Invention
The present invention relates to the field of virtual machines, and more particularly to a system and method for providing process persistence in a virtual machine.
2. Description of the Related Art
The problem of migrating a running process, for example, an application, from one machine to another on a network has been attempted for years, and there is much research literature on the subject of “process migration,” but not much success in actually solving this difficult problem.
Currently, with the world moving towards a network-centric model of computing, with unprecedented connectivity, there is a growing need to run an application (editor, email, browser, etc.) on one computer, and to be able to later resume running that same application from another machine in another location. Such a need can only be fulfilled via application migration. At the same time, modern operating systems have become very complex, and tend to have multiple applications running on a very thick client, and this complexity has resulted in much unreliability. It's thus desirable to be able to separate an application from the rest of the complex operating system, and persist it somewhere on the net, where it is protected from the complex, thick client system. This need, as well, can only be fulfilled via persistent application migration.
Java™
The computer world currently has many platforms, among them Microsoft Windows®, Apple Macintosh®, OS/2, UNIX®, Linux and NetWare®. Software must be compiled separately to run on each platform. The binary file for an application that runs on one platform cannot run on another platform, because the binary file is platform-specific.
A “virtual machine” may be defined as an operating environment that sits on top of one or more other computer platforms, and provides the capability to run one binary file on the virtual machine on the one or more other computer platforms. Thus, an application is written and compiled to run on the virtual machine, and thus does not need to be compiled separately to run on the one or more other computer platforms.
The Java Platform is a software platform for delivering and running applets and applications on networked computer systems. What sets the Java Platform apart is that it sits on top of other platforms, and executes bytecodes, which are not specific to any physical machine, but are machine instructions for a virtual machine. A program written in the Java Language compiles to a bytecode file that can run wherever the Java Platform is present, on any underlying operating system. In other words, the same file can run on any operating system that is running the Java Platform. The Java Platform has two basic parts, the Java Virtual Machine and the Java Application Programming Interface (Java API).
The Sun Java technologies are grouped into three editions: Java 2 Micro (J2ME), Standard (J2SE), and Enterprise (J2EE) Editions. Each edition includes a Java Virtual Machine (JVM) that fits inside a range of consumer devices such as set-top, screenphone, wireless, car, and digital assistant devices. J2ME specifically addresses the consumer space, which covers the range of small devices from smart cards and pagers up to the set-top box, an appliance almost as powerful as a computer. The consumer devices targeted by J2ME, such as set-top boxes, printers, copiers, and cellular phones, typically have fewer resources and more specialized functionality than a typical Network Computer. Such devices may have special constraints such as small memory footprint, no display, or no connection to a network. The J2ME API provides the smallest Java API one of these limited devices can have and still run. A Java-powered application written for one particular device may operate on a wide range of similar devices. Applications written with J2ME are upwardly scalable to work with J2SE and J2EE.
Java Remote Method Invocation (RMI)
RMI is a Java programming language-enabled extension to traditional remote procedure call mechanisms. RMI allows not only data to be passed from object to object around the network but full objects, including code.
K Virtual Machine (KVM)
The K Virtual Machine (KVM) is a Java runtime environment that is an extremely lean implementation of the Java virtual machine for use in devices that have a small memory footprint. The KVM is the core of the Java 2 Micro Edition (J2ME). The KVM is suitable for 16/32-bit RISC/CISC microcontrollers with a total memory of no more than a few hundreds of kilobytes (Kbytes) and sometimes less than 128 Kbytes of RAM. This typically applies to small-footprint memory devices, including digital cellular phones, pagers, mainstream personal digital assistants, low-end analog set-top boxes, and small retail payment terminals.
Application Migration and Java
By writing an application in Java, the application is not tied to a particular machine, but is rather written to run on an abstract or “virtual” machine, the Java Virtual Machine (JVM). Consequently, it is possible for the application to run on any machine on the network that implements the JVM specification. This aids in process migration, because past attempts at this problem have been largely foiled by differences, even slight ones, among the various machines on a network where an application is intended to migrate and run. By itself, though, an application written in Java cannot migrate from one machine on a net to another, because once the application starts running, it runs only in the heap of the JVM on which it initially started.
The Java language provides the programmer with an object model, a strong type system, automatic main memory storage management and concurrency through lightweight threads. However, the Java platform provides no satisfactory way of maintaining these properties beyond the single execution of a JVM. Instead, the programmer must deal explicitly with saving the state of an application, using one of a variety of persistence mechanisms, for example, file input/output, object serialization or relational database connectivity, none of which approach complete support for the full computational model. This lack of completeness, while only a minor nuisance for simple applications, becomes a serious problem as application complexity increases.
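As an illustration only, the following sketch shows what explicit state saving with standard Java object serialization may look like, and why it falls short of the full computational model. The classes EditorState and ExplicitPersistence and their fields are hypothetical names introduced for this example. The data fields survive a save and restore, while the thread reference, which is not serializable and is therefore marked transient, is dropped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical application state; names are illustrative only.
class EditorState implements Serializable {
    private static final long serialVersionUID = 1L;
    String text;                 // plain data: saved by serialization
    int cursor;                  // plain data: saved by serialization
    transient Thread autosaver;  // computation state: not serializable, so dropped

    EditorState(String text, int cursor) {
        this.text = text;
        this.cursor = cursor;
    }
}

class ExplicitPersistence {
    // The programmer must explicitly write the state out.
    static byte[] save(EditorState state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // The programmer must explicitly read the state back in.
    static EditorState restore(byte[] saved) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            return (EditorState) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip through save and restore, the text and cursor fields are intact but the autosaver field is null, so the application itself would have to recreate the thread.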
Orthogonal Persistence for Java
Orthogonal persistence for the Java platform (OPJ) addresses some of the limitations of application migration with Java with no changes to the source language and minor modifications to the specification of the Java Virtual Machine life cycle. In effect, orthogonal persistence extends the automatic memory management of the Java platform to encompass stable memory.
OPJ allows a running Java application to persist with no change to the application or to Java (thus orthogonal). This is achieved by enhancements to the JVM that implement a persistent heap that parallels the heap that Java code runs in. It is possible to suspend a running application and have a checkpoint result in the persistent heap that can later be reactivated on that same JVM. However, migrating to another JVM on another machine is not supported.
Another limitation of the persistent heap and checkpointing as implemented in OPJ is that any portions of a process that are dependent upon external state and not transient may be invalid when the code runs again, because the actual external state may have changed. An example of an external state is a socket for a network connection.
Yet another limitation of the persistent heap and checkpointing as implemented in OPJ is that it supports one large persistent heap for all Java code running on the system, making it difficult to separate out one particular application to migrate to another node. The persistent heap may include system Java objects and application Java objects. System Java objects are those Java objects tied to the platform (machine and operating system) on which the JVM is executing with the Java Native Interface (JNI). System Java objects may include native methods for the platform on which the JVM is executing. The application Java objects for the particular application would have to be separated from the application Java objects from any other running process and from the system Java objects.
Still yet another limitation of the OPJ model is that it requires two separate garbage collectors, one for the “in-memory” heap and one for the persistent heap.
JVM Separation Models
In a system providing application migration, it would be desirable to separate an application so that only it runs in a heap (and is persisted in a persistent heap). One way to do this is to start a separate JVM on the machine for each application. Although simple, the approach may not be practical. For one thing, this solution uses many system resources. Other approaches for application separation are hierarchical, with one “real” JVM and many “virtual” JVMs multiplexed on top. It would be desirable to provide a virtual machine separation model that separates applications into discrete persistent stores, permits the running of applications one at a time in an in-memory heap, and that does so without requiring the running of multiple copies (real or virtual) of the JVM.
The problems outlined above may be solved in large part by a system and method for persistent application migration that provides application separation and a method of maintaining the properties of a process beyond the single execution of a virtual machine such as a Java Virtual Machine (JVM) while preserving the external state of the process.
In one embodiment, an application on a system may be separated from other applications and from system code and data, and thus migratable separately from the other applications. In one embodiment, one or more applications on a system may each have an in-memory heap serving as “physical” memory that is being used for the current execution of the application, a virtual heap that may include the entire heap of the application including at least a portion of the runtime environment, and a persistent heap or store where the virtual heap can be checkpointed. The virtual heap and the persistent heap may be combined in one memory (the virtual heap may serve as the persistent heap). Alternatively, the virtual heap may be checkpointed to a separate, distinct persistent heap. The combination of the in-memory heap, the virtual heap, and the persistent store may be referred to as the “virtual persistent heap.”
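As a minimal sketch only, the embodiment in which the virtual heap is checkpointed to a separate persistent store might be modeled as follows. The class VirtualPersistentHeapSketch, its methods, and the representation of the heap as numbered sections of bytes are assumptions made for illustration. The in-memory heap and its caching are not modeled here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: section number -> section contents.
class VirtualPersistentHeapSketch {
    // Virtual heap: the entire heap of one application.
    private final Map<Integer, byte[]> virtualHeap = new HashMap<>();
    // Persistent store: the most recent checkpoint of the virtual heap.
    private final Map<Integer, byte[]> persistentStore = new HashMap<>();

    void write(int section, byte[] contents) {
        virtualHeap.put(section, contents.clone());
    }

    byte[] read(int section) {
        byte[] contents = virtualHeap.get(section);
        return contents == null ? null : contents.clone();
    }

    // Checkpoint: copy the whole virtual heap into the persistent store.
    void checkpoint() {
        persistentStore.clear();
        virtualHeap.forEach((section, contents) -> persistentStore.put(section, contents.clone()));
    }

    // Resume: rebuild the virtual heap from the last checkpoint.
    void resumeFromCheckpoint() {
        virtualHeap.clear();
        persistentStore.forEach((section, contents) -> virtualHeap.put(section, contents.clone()));
    }
}
```

In this sketch, changes made after checkpoint() are discarded by resumeFromCheckpoint(), which corresponds to resuming the computation at the point of the checkpoint.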
A heap may include code and data for use by the application. In object-oriented programming languages such as Java, at least some of the code and data in the heap for the application may be encapsulated in objects. Objects may be defined as structures that are instances of a particular class or subclass of objects. Objects may include instances of the class's methods or procedures (code) and/or data related to the object. An object is what actually “runs” in an object-oriented program in the computer.
A heap may also include structures for managing the application's code and data in the heap. For example, a heap may be divided into sections, for example pages or cache lines. The sections of the heap may be grouped into sets of two or more sections for some heap processing functions such as garbage collection. Sections of the heap may include structures for managing code and data (objects) in the section. For example, one or more structures for tracking internal and external references to objects in a section may be kept in the sections of memory. An internal reference to an object may be defined as a reference to an object from another object in the same section of the heap. An external reference may be defined as a reference to an object from another object in another section of the heap.
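The distinction between internal and external references may be sketched as follows. This is an illustration under stated assumptions: the class HeapSection, the fixed section size of 4096 bytes, and the use of numeric addresses to identify objects are all hypothetical and are not taken from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-section bookkeeping for references to objects in the section.
class HeapSection {
    static final long SECTION_SIZE = 4096; // assumed section (page) size in bytes

    final long index;
    // Each entry is a {fromAddress, toAddress} pair.
    final List<long[]> internalReferences = new ArrayList<>();
    final List<long[]> externalReferences = new ArrayList<>();

    HeapSection(long index) {
        this.index = index;
    }

    static long sectionOf(long address) {
        return address / SECTION_SIZE;
    }

    // Record a reference to an object in this section from the object at 'from'.
    void recordReference(long from, long to) {
        if (sectionOf(to) != index) {
            throw new IllegalArgumentException("target object is not in this section");
        }
        long[] reference = {from, to};
        if (sectionOf(from) == index) {
            internalReferences.add(reference);  // referrer is in the same section
        } else {
            externalReferences.add(reference);  // referrer is in another section
        }
    }
}
```

Keeping the two kinds of references apart would let a collector that processes one section at a time treat objects with external references as reachable without loading the other sections.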
In one embodiment, an application may establish one or more leases to local and/or remote services external to the application. In one embodiment, an application may establish one or more leases to system code that give the application access to resources external to the application such as system resources. System code for accessing an external resource may be referred to as a system service. A lease on system code for accessing an external resource may be referred to as a leased system service. For example, an application may establish leases to system services that give the application access to system drivers for accessing communications ports in the system.
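A lease on a system service might be represented as in the sketch below. The description above does not specify how a lease is represented, so the class ServiceLease, the time-limited expiry, and the renew operation are assumptions introduced for illustration. Time is passed in explicitly so that the example is deterministic.

```java
// Hypothetical time-limited lease on a system service.
class ServiceLease {
    private final String serviceName;
    private long expiresAtMillis;

    ServiceLease(String serviceName, long nowMillis, long durationMillis) {
        this.serviceName = serviceName;
        this.expiresAtMillis = nowMillis + durationMillis;
    }

    String serviceName() {
        return serviceName;
    }

    // The application may use the service only while the lease is active.
    boolean isActive(long nowMillis) {
        return nowMillis < expiresAtMillis;
    }

    // Renewing extends the lease from the current time.
    void renew(long nowMillis, long durationMillis) {
        expiresAtMillis = nowMillis + durationMillis;
    }
}
```

Under this assumption, an application resumed from a checkpoint could test each lease and re-establish any that is no longer active, instead of relying on stale external state.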
In a virtual persistent heap, the entire heap may be made persistent. The virtual persistent heap may enable the checkpointing of the state of the computation of the virtual machine to a persistent storage such as a disk or flash device for future resumption of the computation at the point of the checkpoint. The virtual persistent heap may also enable the migration of the virtual machine computation states from one machine to another. Both the data and computation state may be migrated. One embodiment may also provide for the suspension and resumption of an application, such as upon restarting a device after an intentional or unintentional shutdown of the device.
The virtual persistent heap may enable the saving of the entire state of the virtual machine heap for possible future resumption of the computation at the point the save was performed, thus enabling the migration of the computation to a different system. The saved state of the virtual machine heap may also provide the ability to restart the virtual machine after a system crash or shutdown to the last saved persistent state. This persistent feature is important for small consumer and appliance devices including Java-enabled devices, such as cellular phones and Personal Digital Assistants (PDAs), as these appliances may be shut down and restarted often. In one embodiment, the virtual persistent heap may include the entire address space of the virtual machine heap an application is using.
Embodiments of a virtual persistent heap may include a method for caching portions of the virtual persistent heap into the physical heap. In one embodiment, the virtual persistent heap may include a caching mechanism that is effective with small consumer and appliance devices that typically have a small amount of memory and that may be using flash devices as persistent storage. The caching mechanism may provide a reduced amount of caching and may help to improve locality among elements of the virtual persistent heap that are cached in the physical heap, thus minimizing caching overhead. In one embodiment, the virtual persistent heap may be divided into cache lines. A cache line may be the smallest amount of virtual persistent heap space that can be loaded or flushed at one time. Caching in and caching out operations may be used to load cache lines into the heap or to flush dirty cache lines to the store.
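The caching in and caching out operations may be sketched as follows. This is a simplified illustration, not the described mechanism itself: the class CacheLineCache, the 64-byte line size, and the least-recently-used eviction policy are assumptions. The sketch loads a cache line on first access, marks it dirty on write, and flushes a dirty line to the store when the line is evicted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class CacheLineCache {
    static final int LINE_SIZE = 64; // assumed cache line size in bytes

    private final int capacity; // maximum lines held in the in-memory heap
    private final Map<Integer, byte[]> store = new HashMap<>(); // simulated persistent store
    // Access-ordered map: iteration starts at the least recently used line.
    private final LinkedHashMap<Integer, byte[]> cached = new LinkedHashMap<>(16, 0.75f, true);
    private final Set<Integer> dirty = new HashSet<>();
    int loads;   // number of cache-in operations
    int flushes; // number of cache-out operations of dirty lines

    CacheLineCache(int capacity) {
        this.capacity = capacity;
    }

    // Cache in: return the line, loading it from the store if needed.
    private byte[] line(int lineNumber) {
        byte[] data = cached.get(lineNumber);
        if (data == null) {
            if (cached.size() >= capacity) {
                evictLeastRecentlyUsed();
            }
            byte[] stored = store.get(lineNumber);
            data = stored == null ? new byte[LINE_SIZE] : stored.clone();
            cached.put(lineNumber, data);
            loads++;
        }
        return data;
    }

    // Cache out: flush the evicted line to the store only if it is dirty.
    private void evictLeastRecentlyUsed() {
        Iterator<Map.Entry<Integer, byte[]>> it = cached.entrySet().iterator();
        Map.Entry<Integer, byte[]> eldest = it.next();
        if (dirty.remove(eldest.getKey())) {
            store.put(eldest.getKey(), eldest.getValue());
            flushes++;
        }
        it.remove();
    }

    byte read(int address) {
        return line(address / LINE_SIZE)[address % LINE_SIZE];
    }

    void write(int address, byte value) {
        int lineNumber = address / LINE_SIZE;
        line(lineNumber)[address % LINE_SIZE] = value;
        dirty.add(lineNumber);
    }
}
```

Because whole lines are the unit of transfer, objects that are used together and share a line cost one load instead of several, which is why locality among cached elements reduces caching overhead.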
Garbage Collection Method for a Virtual Persistent Heap
A garbage collection method may be provided for the virtual persistent heap. In one embodiment, the garbage collection method may be used with small consumer and appliance devices, for example, Java-enabled devices, which may have a small amount of memory and may be using flash devices as persistent storage. In one embodiment, the garbage collection method may be implemented to provide good performance where only a portion of the virtual persistent heap may be cached in the physical heap.
In one embodiment, running low on heap space may trigger garbage collection. In one embodiment, the garbage collection method may start at the root of the heap and flag objects that are referenced (i.e. need to be kept in the heap). Then, objects not flagged may be removed from the heap. Alternatively, the garbage collection method may flag objects that are not referenced, and then may remove the flagged objects. Garbage collection may cause the heap to become fragmented so that a large object may not fit in available free space. The garbage collection method thus may include a compaction phase to reduce or substantially eliminate fragmentation and to improve object locality.
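The first variant described above, flagging referenced objects from the root and then removing unflagged objects, followed by compaction, may be sketched as follows. The classes HeapObject and MarkSweepSketch are hypothetical, and the heap is modeled as a simple list, so compaction here means only that surviving objects are packed together in their original order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical heap object with outgoing references and a flag bit.
class HeapObject {
    final String name;
    final List<HeapObject> references = new ArrayList<>();
    boolean marked;

    HeapObject(String name) {
        this.name = name;
    }
}

class MarkSweepSketch {
    // Mark phase: flag every object reachable from the root.
    // An explicit stack is used so that long reference chains do not overflow the call stack.
    static void mark(HeapObject root) {
        Deque<HeapObject> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            HeapObject current = pending.pop();
            if (current.marked) {
                continue; // already visited; this also handles reference cycles
            }
            current.marked = true;
            for (HeapObject target : current.references) {
                pending.push(target);
            }
        }
    }

    // Sweep and compaction: keep only flagged objects, packed contiguously,
    // and clear the flags for the next cycle.
    static List<HeapObject> sweepAndCompact(List<HeapObject> heap) {
        List<HeapObject> compacted = new ArrayList<>();
        for (HeapObject object : heap) {
            if (object.marked) {
                object.marked = false;
                compacted.add(object);
            }
        }
        return compacted;
    }
}
```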
In one embodiment, the virtual persistent heap may use a single address space for objects in the store and objects in the in-memory heap. In one embodiment, a single garbage collector may be provided to run on the entire virtual heap address space. The virtual persistent heap may use a single garbage collector that may be advantageous in light of memory and CPU device constraints on small appliance and consumer devices. In one embodiment, object collections performed on the virtual persistent heap may be propagated to the store when the corresponding cache lines are flushed to the store. Cache line evictions may allow the freeing of heap space when required.
In one embodiment, the virtual heap may enable the running of applications that require a larger in-memory heap than is available. In one embodiment, the amount of caching may be tracked, and a garbage collection cycle may be induced in response to the tracking of the amount of caching. A garbage collection cycle may help reduce the amount of caching by removing currently unused objects from the virtual heap space and allowing a compaction phase to improve object locality by compacting correlated objects into the same cacheable section(s) of the heap.
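One simple way to induce a garbage collection cycle in response to the tracked amount of caching is a counter with a threshold, as sketched below. The class CachingTracker, the threshold policy, and the use of a Runnable to stand in for the collector are assumptions for illustration; the description above does not state how the amount of caching is measured.

```java
// Hypothetical tracker: counts cache line loads and induces a collection at a threshold.
class CachingTracker {
    private final int loadThreshold;          // assumed tunable limit
    private final Runnable garbageCollector;  // stands in for a garbage collection cycle
    private int loadsSinceCollection;

    CachingTracker(int loadThreshold, Runnable garbageCollector) {
        this.loadThreshold = loadThreshold;
        this.garbageCollector = garbageCollector;
    }

    // Called by the caching mechanism each time a cache line is loaded.
    void onCacheLineLoaded() {
        loadsSinceCollection++;
        if (loadsSinceCollection >= loadThreshold) {
            garbageCollector.run();
            loadsSinceCollection = 0;
        }
    }
}
```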
Small appliance and consumer devices may use flash devices for non-volatile memory storage. Flash devices typically have special characteristics, such as large write I/O blocks (for example, 128 Kbytes) and destructive writes. In one embodiment, the number of writes performed to a flash device by the garbage collector is minimized to increase the life of the flash device. The garbage collector for the virtual persistent heap may be implemented using working sets and/or object nurseries for short life objects.
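One possible way to reduce the number of writes to a flash device is to coalesce updates in memory and write each affected block once, as in the sketch below. This technique is an assumption introduced for illustration and is not stated in the description above. The class FlashWriteBuffer is hypothetical, the flash device is simulated by a map, and the 128 Kbyte block size is taken from the example figure in the text.

```java
import java.util.HashMap;
import java.util.Map;

class FlashWriteBuffer {
    static final int BLOCK_SIZE = 128 * 1024; // write I/O block size from the example above

    private final Map<Integer, byte[]> flash = new HashMap<>();       // simulated flash device
    private final Map<Integer, byte[]> dirtyBlocks = new HashMap<>(); // block images held in RAM
    int blockWrites; // number of whole-block writes issued to the device

    // Update a byte in the RAM image of its block; nothing is written to flash yet.
    void write(int address, byte value) {
        int block = address / BLOCK_SIZE;
        byte[] image = dirtyBlocks.get(block);
        if (image == null) {
            byte[] onFlash = flash.get(block);
            image = onFlash == null ? new byte[BLOCK_SIZE] : onFlash.clone();
            dirtyBlocks.put(block, image);
        }
        image[address % BLOCK_SIZE] = value;
    }

    // Write each dirty block to flash exactly once, however many bytes changed in it.
    void flush() {
        for (Map.Entry<Integer, byte[]> entry : dirtyBlocks.entrySet()) {
            flash.put(entry.getKey(), entry.getValue());
            blockWrites++;
        }
        dirtyBlocks.clear();
    }

    byte readFlash(int address) {
        byte[] image = flash.get(address / BLOCK_SIZE);
        return image == null ? 0 : image[address % BLOCK_SIZE];
    }
}
```

In this sketch, a thousand byte updates that fall in one block cost a single block write at flush time.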
If a garbage collection method walks through the entire virtual heap address space in a cycle, a large burst of cache load and flushing requests may be generated, particularly when the in-memory heap is much smaller than the virtual heap. In one embodiment, a generational garbage collector may be used, where each generation is confined to a portion (working set) of the heap. The garbage collection cycle for the entire virtual persistent heap may comprise generations of garbage collection on the generational working sets. Each garbage collection generation may touch disjoint areas of the heap. In one embodiment, a portion of the heap may be shared to store inter-working set dependencies. In one embodiment, most inter-object references may be confined to the working set region. In one embodiment, the garbage collection generations may run at fixed intervals. Alternatively, the garbage collection generations may run at varying intervals.
In one embodiment, a heap allocator may combine related objects in the same working set region. In one embodiment, the generational garbage collector may allow the flushing of changes after each garbage collection generation for each working set region, and thus may avoid the caching burst of a garbage collection method that walks the entire virtual heap in one cycle. In one embodiment, the cache load and eviction may be spread across multiple garbage collection generations.
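The working set based collection may be sketched as follows. This is a simplified illustration: the class WorkingSetGc is hypothetical, objects are represented by names, the set of referenced objects is assumed to have been computed already, and a counter stands in for flushing a working set region to the store. Each generation touches only one working set, and a full cycle is a sequence of generations.

```java
import java.util.List;
import java.util.Set;

class WorkingSetGc {
    private final List<List<String>> workingSets; // disjoint regions of the heap
    private int nextSet;                          // region the next generation will collect
    int flushes;                                  // one flush per generation

    WorkingSetGc(List<List<String>> workingSets) {
        this.workingSets = workingSets;
    }

    // One generation: collect a single working set, then flush only that region.
    // Returns the number of objects removed.
    int collectNextGeneration(Set<String> referenced) {
        List<String> region = workingSets.get(nextSet);
        int before = region.size();
        region.removeIf(name -> !referenced.contains(name));
        flushes++; // changes for this region may be flushed now, not at the end of the cycle
        nextSet = (nextSet + 1) % workingSets.size();
        return before - region.size();
    }

    // A full cycle over the heap is made up of one generation per working set.
    int collectFullCycle(Set<String> referenced) {
        int removed = 0;
        for (int i = 0; i < workingSets.size(); i++) {
            removed += collectNextGeneration(referenced);
        }
        return removed;
    }
}
```

Because each generation loads and flushes at most one region, the cache traffic of a full cycle is spread over several smaller steps instead of arriving in one burst.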
In one embodiment, in conjunction with the working set based garbage collector, heap regions with different flushing policies may be used. In one embodiment, one or more object nursery regions where objects may be initially created for use by the application may be used. In one embodiment, a number of short-lived objects may be created during the execution of an application. A relatively small number of these objects may end up in the persistent store. Using a nursery region, which also may be referred to as an object nursery region, to hold short-lived objects may avoid unnecessary flushing of the short-lived objects. When garbage collection is performed, referenced objects in the object nursery region may be copied into in-memory heap regions to be flushed from the in-memory heap to the store heap. In one embodiment, the nursery region may be comprised in the in-memory heap. In another embodiment, the nursery region may be outside the in-memory heap.
In some embodiments, there may be a plurality of nursery regions. In one embodiment with a plurality of nursery regions, the nursery regions may be a hierarchy, with new objects created in a first region, and, as an object persists, it may be moved up in the hierarchy of nursery regions. Moving objects up the hierarchy of nursery regions may be performed in a garbage collection cycle. When garbage collection is performed, referenced objects in the highest object nursery region in the hierarchy may be copied into in-memory heap regions to be flushed from the in-memory heap to the store heap.
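A hierarchy of nursery regions may be sketched as follows. The class NurseryHierarchy is hypothetical, objects are represented by names, and the set of referenced objects is assumed to be supplied by the garbage collector. In each cycle a referenced object moves up one level, referenced objects in the highest level are copied into the in-memory heap, and unreferenced objects are dropped without ever being flushed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class NurseryHierarchy {
    private final List<List<String>> levels = new ArrayList<>();
    final List<String> inMemoryHeap = new ArrayList<>(); // regions eligible for flushing to the store

    NurseryHierarchy(int levelCount) {
        for (int i = 0; i < levelCount; i++) {
            levels.add(new ArrayList<>());
        }
    }

    // New objects are created in the first nursery region.
    void allocate(String name) {
        levels.get(0).add(name);
    }

    List<String> level(int index) {
        return levels.get(index);
    }

    // One garbage collection cycle over the nursery regions.
    // Levels are processed from the highest down so that an object moves up only one level per cycle.
    void collect(Set<String> referenced) {
        int top = levels.size() - 1;
        for (int level = top; level >= 0; level--) {
            List<String> region = levels.get(level);
            List<String> target = (level == top) ? inMemoryHeap : levels.get(level + 1);
            for (String name : region) {
                if (referenced.contains(name)) {
                    target.add(name);
                }
            }
            region.clear(); // unreferenced objects are discarded here
        }
    }
}
```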
During a garbage collection cycle, the default flushing mechanism of the virtual persistent heap may be disabled until the garbage collection cycle is completed. Since a garbage collection cycle is likely to change the heap state many times in the process (updating heap structures), there may be no advantage to generating a store checkpoint in the middle of a garbage collection cycle. For instance, a cache line may be updated many times during a cycle. In one embodiment, the method may wait until the garbage collection cycle is completed to commit a new store checkpoint.
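Deferring checkpoints during a garbage collection cycle may be sketched as follows. The class DeferredCheckpointer and its counters are hypothetical and stand in for the default flushing mechanism. Checkpoint requests that arrive during a cycle are not committed, and a single new store checkpoint is committed when the cycle completes.

```java
class DeferredCheckpointer {
    private boolean collecting;
    int deferredRequests;     // requests suppressed during the current cycle
    int checkpointsCommitted; // store checkpoints actually committed

    void beginCollection() {
        collecting = true;
    }

    // Default flushing path: commit immediately unless a collection is in progress.
    void requestCheckpoint() {
        if (collecting) {
            deferredRequests++;
        } else {
            checkpointsCommitted++;
        }
    }

    // At the end of the cycle, commit one checkpoint that covers all deferred changes.
    void endCollection() {
        collecting = false;
        deferredRequests = 0;
        checkpointsCommitted++;
    }
}
```

A cache line that is updated many times during the cycle is therefore written to the store once, in the checkpoint committed by endCollection().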