1. Technical Field
This invention generally relates to virtual storage and more specifically relates to a method and apparatus for providing shared persistent virtual storage on 32 bit operating systems.
2. Background Art
The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices, and computer systems may be found in many different settings. The widespread proliferation of computers prompted the development of computer networks that allow computers to communicate with each other. With the introduction of the personal computer (PC), computing became accessible to large numbers of people. Networks for personal computers were developed that allow individual users to communicate with each other.
Computer systems typically include operating system software that controls the basic function of the computer, and one or more software applications that run under the control of the operating system to perform desired tasks. For example, a typical IBM Personal Computer may run the OS/2 operating system, and under the control of the OS/2 operating system, a user may execute an application program, such as a word processor. As the capabilities of computer systems have increased, the software applications designed for high performance computer systems have become extremely powerful.
Object-oriented programming based on an object model is a new way of creating computer programs that has become very popular over the past several years. The goal of using object-oriented programming is to create small, reusable sections of program code known as objects that can be quickly and easily combined and re-used to create new programs. By creating and re-using a group of well-tested objects, a more stable, uniform, and consistent approach to developing new computer programs can be achieved.
One particular object-oriented programming language is Java which is specifically designed to create distributed object systems. Java offers many features and advantages that makes it a desirable programming language to use. First, Java is specifically designed to create small programs, commonly called applets, that can reside on the network in centralized servers, and delivered to the client machine only when needed. Second, Java is an interpreted language. A Java program can be written once and ran on any type of platform that contains a Java Virtual Machine (JVM). Thus, Java is completely platform independent. And third, Java is an object-oriented language, meaning that software written in Java can take advantage of the benefits of object-oriented programming.
One issue in object oriented programming, and Java programing in particular, is object persistence. Typically, when a process ends, all of the references to objects created by the process are lost, and the objects themselves are destroyed. These objects are referred to as transient objects. Persistent objects, as opposed to transient objects, have a lifetime that transcends the lifetime of the process that created them. To make an object persistent, mechanisms must be put in place to allow a persistent object to survive beyond the lifetime of the process from which the object was created so that other processes can access the object. This typically involves the storing of the objects onto permanent storage devices, such as hard disks, optical disks, tape drives, etc.
Many common computer systems use an addressing scheme referred to as the two level storage (TLS) model. The TLS model stores and manipulates data using two systems: a file manager and a virtual memory system. The virtual memory includes the actual memory and a specialized data file called a swap file. The virtual memory system controls the allocation of address space to different processes. The file manager stores and retrieves data from the permanent storage devices in the form of files.
One benefit to using the TLS model is that it provides large amounts of address space to fixed size address machines. This large address space is useful for storing large amounts of information. However, the TLS models does not provide a way for storing persistent objects in a way that can be efficiently used by a computer system.
In particular, in a TLS system persistent data, such as persistent objects, must be stored in files on a disk or other storage medium by the file manager. When a process needs to access a persistent object, the process must contact the file manager which locates the persistent object data in a file on backing store and move a copy of the persistent object data into a memory buffer. The persistent object data must then be reconstructed into a persistent object in memory.
When the data being stored by the TLS model is persistent, several other problems are introduced. For example, persistent objects typically contain many pointers used for sharing data and methods with other objects. When a persistent object is retrieved from backing store and a new runtime representation of the object is created, any internal pointers contained within the persistent object must be converted and rebuilt. Rebuilding persistent objects, and more specifically, converting pointer references contained within persistent objects results in significant overhead on the CPU.
Some TLS systems use externalization techniques to store persistent objects. These techniques pull data from the object and write the externalized data to a data file. When the object is needed again, the data must be read in by the file system and the persistent object recreated using state data. This processes requires reading from the file system to a memory buffer, and copying data from the memory buffer to the object. This also creates significant unwanted CPU overhead.
Another addressing scheme is the single level storage (SLS) model. The SLS system maps all of the storage mediums, including persistent storage mediums such as hard drives, into a single address space. This makes the entire data storage system into a single xe2x80x9cvirtual memoryxe2x80x9d that is accessed directly using a single, process independent, address space. In an SLS system, persistent data from the permanent storage mediums can be easily copied to real memory using the virtual memory addressing.
The SLS model is very efficient means of making and manipulating persistent objects because it reduces the amount of system overhead to load persistent objects into memory. As mentioned, when a process needs a persistent object, the persistent object is copied from persistent storage and into real memory. This involves creating only one copy rather than the two copies created by the TLS model. Because pointer references are process independent, SLS system allows persistent objects, including pointer references contained within persistent objects, to be moved into and from persistent storage without requiring any rebuilding of the persistent object. When a process needs to access a persistent object, the SLS model simply moves this persistent object from the backing store into memory. Stated another way, the SLS model eliminates the need to create a runtime representation of a persistent object since pointer references contained within the persistent object always remain valid. Another advantage to SLS systems is that pointers to objects are process independent. This allows easy sharing of persistent objects between multiple processes.
The SLS model has been successfully implemented using 48 and 64 bit memory addresses since 1980 (IBM System/38 and AS/400). However, many business applications today run on desktop PC""s which have 32 bit memory addresses. Because a SLS system maps all data storage into a single virtual memory, a very large address space is required. A 32 bit SLS system would simply not be large enough to store the enormous amounts of data generated by large business applications. Thus, programmers and computer users of commodity 32-bit systems continue to have no easy way to provide object persistence.
Without a more efficient mechanism for providing efficient creation, access and sharing of persistent objects in interpreted computer systems such as Java, the computer industry will continue to not realize the full potential of the these systems.
A preferred embodiment of the present invention provides an intelligent reference object (IRO), which is used to encapsulate address translation between shared address space (SAS) addresses and native system addresses. The IRO works with a shared persistent virtual storage system that provides address translation between SAS addresses and the underlying system. By encapsulating these addresses translations in an IRO, the preferred embodiment provides the ability to create and share persistent objects using single level store semantics. When a client accesses the data in a persistent object; the IRO corresponding to the persistent object provides any address translation and indirection needed to perform the access.
The foregoing and other features and advantages of the present invention will be apparent from the following more particular description of the preferred embodiment of the invention, as illustrated in the accompanying drawings.