Some computer applications, such as CAD/CAM applications, typically construct, maintain, access, and modify large sets of data objects ("objects") over a substantial period of time using a single computer system, or using a number of connected computer systems. It is common for such objects to contain references to other objects in the object set. Such applications often use persistent object systems to maintain these objects and make them available for access and modification on any of the connected computer systems. Persistent object systems ensure the continuing availability of persistent objects by storing them in a non-volatile manner in an object server, such as a database or a filesystem, while allowing persistent objects to be moved into a computer system's main memory to be accessed and manipulated by programs executing on the computer system.
When a program executing on a computer system finishes accessing and modifying an object in its main memory, the persistent object system transfers the object to the object server to store the object in a non-volatile manner. The transferred object may contain references to other objects in the object set. At the time of transfer, these references generally each comprise a pointer to an address in the main memory of the same computer system into which the referenced object has been loaded. Such references depend on both the identity of the computer system, which is not reflected by the pointer, and the specific contents of the main memory of the computer system, which may be completely different the next time a program retrieves the object from the object server. If the object server later provides the version of the object containing main memory pointers to a program on another computer system, the main memory pointers in the object will be invalid. As part of the process of transferring the object to the object server, therefore, the persistent object system performs a process called "passivation." Passivation involves replacing the main memory pointers in the passivated object, which are used to locate the referenced objects in the main memory of the current computer system, with the persistent pointers used by the persistent object system to locate the referenced objects in the object server. (Persistent pointers are also called "object identifiers" (OIDs), and may be represented using globally unique identifier (GUID) data structures.) Replacing a main memory pointer with a persistent pointer in this manner is called "unswizzling" the main memory pointer.
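The passivation step described above can be illustrated with a minimal Python sketch. It is only an analogy: ordinary Python object references stand in for main memory pointers, and the names `PersistentObject` and `passivate` are hypothetical, not part of any particular persistent object system.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class PersistentObject:
    oid: uuid.UUID                             # persistent pointer (OID) of this object
    payload: str
    refs: list = field(default_factory=list)   # in-memory references to other objects


def passivate(obj):
    """Build a storable record in which each in-memory reference
    is unswizzled into the referenced object's OID."""
    return {
        "oid": str(obj.oid),
        "payload": obj.payload,
        # Unswizzling: keep only the OID, not the machine-specific reference.
        "refs": [str(target.oid) for target in obj.refs],
    }


wheel = PersistentObject(uuid.uuid4(), "wheel")
car = PersistentObject(uuid.uuid4(), "car", refs=[wheel])
record = passivate(car)
```

The resulting record contains only values that remain meaningful on any computer system, which is the property that passivation is meant to guarantee.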
When a program later uses the persistent object system to access or modify the object transferred to the object server, the persistent object system transfers the object from the object server to the main memory of the computer system on which the program is executing and performs a "depassivation" process. Depassivation involves replacing the persistent pointers in the transferred object, which the program generally cannot use directly to access and modify the referenced objects, with main memory pointers that the program can use for that purpose. Replacing a persistent pointer with a main memory pointer in this manner is called "swizzling" the persistent pointer.
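A minimal Python sketch of depassivation follows. It is a simplification under stated assumptions: the object server is modeled as a hypothetical in-memory dictionary named `SERVER`, Python references stand in for main memory pointers, and referenced objects are loaded eagerly for brevity, whereas practical swizzling techniques commonly defer loading until a reference is used.

```python
from dataclasses import dataclass, field


@dataclass
class LoadedObject:
    oid: str
    payload: str
    refs: list = field(default_factory=list)   # swizzled (in-memory) references


# Hypothetical object server: OID -> passivated record holding OID references.
SERVER = {
    "oid-car": {"payload": "car", "refs": ["oid-wheel"]},
    "oid-wheel": {"payload": "wheel", "refs": []},
}


def depassivate(oid, loaded=None):
    """Load the object with the given OID and swizzle its persistent
    pointers into direct in-memory references."""
    loaded = {} if loaded is None else loaded
    if oid in loaded:                  # already in memory: reuse it
        return loaded[oid]
    record = SERVER[oid]
    obj = LoadedObject(oid, record["payload"])
    loaded[oid] = obj                  # register first so reference cycles terminate
    obj.refs = [depassivate(ref, loaded) for ref in record["refs"]]
    return obj


car = depassivate("oid-car")
```

After the call, `car.refs[0]` is a usable in-memory object rather than an OID string.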
Conventional swizzling techniques fall into three categories, each of which has significant disadvantages. Hardware swizzling uses the paging system of an existing virtual memory management system to load referenced objects when main memory pointers to them are resolved by the program using the depassivated object. Virtual memory managers divide a main memory address space that is larger than actual main memory into pages of a fixed length. Some of the pages in the main memory space are actually represented in the main memory, while others are "paged out." When a program attempts to dereference a pointer to a page that is paged out, the attempt generates a hardware interrupt called a "page fault." Page faults are handled by an interrupt service routine that "pages in" the faulted page by finding room for it in main memory (in most cases by paging out another page), loading the faulted page from disk into main memory, marking the faulted page as paged in, and allowing the dereferencing operation to proceed.
According to the hardware swizzling technique, when the depassivated object is loaded, each persistent pointer is replaced with a main memory pointer to a "ghost page," which is marked as paged out. When a program attempts to dereference a main memory pointer to a ghost page, a page fault is generated, and a modified page fault interrupt handling routine loads the referenced object from the object server using its persistent pointer, marks the ghost page as paged in, and allows the dereferencing operation to proceed. Subsequent attempts to dereference the main memory pointer proceed without further delay. Hardware swizzling has the advantage that referenced objects are not loaded until they are actually accessed. Hardware swizzling also has two important disadvantages, however: (1) Page faulting and paging in are expensive operations, taking roughly as much time as executing 2000 instructions on some processors. (2) The fixed page size used by virtual memory management systems is ill-suited for storing variable-size objects--it causes a section of main memory larger than the loaded object to be devoted to the loaded object, and cannot accommodate objects that grow in size over time to exceed the size of the allocated pages.
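The second disadvantage can be made concrete with a short arithmetic sketch in Python. The 4096-byte page size and the function name `page_footprint` are assumptions chosen for illustration; actual page sizes depend on the processor and operating system.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def page_footprint(object_size, page_size=PAGE_SIZE):
    """Return (pages needed, bytes wasted) when an object is given
    whole pages of a fixed size."""
    pages = -(-object_size // page_size)   # ceiling division
    wasted = pages * page_size - object_size
    return pages, wasted


small = page_footprint(100)     # a 100-byte object still occupies a full page
grown = page_footprint(4097)    # one byte past a page boundary needs a second page
```

Under these assumptions, a 100-byte object leaves 3996 bytes of its page unused, and an object that grows from 4096 to 4097 bytes no longer fits in the single page originally allocated to it.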
Indirect software swizzling uses a memory location, called a "resident object descriptor," to provide a level of indirection between a pointer in the depassivated object and the referenced object. After a depassivated object has been loaded, each persistent pointer in the depassivated object is moved to a resident object descriptor and replaced with a pointer to the resident object descriptor. Dereferencing a pointer to a resident object descriptor for the first time causes the referenced object to be loaded, and stores a main memory pointer to the loaded referenced object in the resident object descriptor. The main memory pointer to the loaded referenced object in the resident object descriptor is then dereferenced to provide access to the referenced object. When the pointer to a resident object descriptor is subsequently dereferenced, the main memory pointer to the loaded referenced object in the resident object descriptor is used to access the referenced object. Like hardware swizzling, indirect software swizzling has the advantage that referenced objects are not loaded until they are actually accessed. Further, referenced objects may be unloaded or relocated without invalidating pointers to the resident object descriptor stored in depassivated objects. Indirect software swizzling also has a significant disadvantage, however: dereferencing two main memory pointers (the first in the depassivated object, the second in the resident object descriptor), which is required every time the program accesses the referenced object, has double the time cost of dereferencing a single main memory pointer, or about the same amount of time it takes to execute 90 instructions on some processors.
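The descriptor mechanism can be sketched in Python as follows. This is an illustrative analogy only: the class `ResidentObjectDescriptor`, the helpers `load_from_server` and `deref`, and the dictionary `SERVER` are hypothetical names, and Python references stand in for main memory pointers.

```python
class ResidentObjectDescriptor:
    def __init__(self, oid):
        self.oid = oid       # persistent pointer moved out of the referencing object
        self.target = None   # in-memory reference, filled in on first dereference


SERVER = {"oid-wheel": {"payload": "wheel"}}   # hypothetical object server
LOAD_COUNT = 0                                 # counts loads from the server


def load_from_server(oid):
    global LOAD_COUNT
    LOAD_COUNT += 1
    return dict(SERVER[oid])


def deref(descriptor):
    """Second level of indirection: load on first use, then reuse."""
    if descriptor.target is None:
        descriptor.target = load_from_server(descriptor.oid)
    return descriptor.target


# The depassivated object holds a reference to the descriptor, not to the wheel.
car = {"payload": "car", "wheel": ResidentObjectDescriptor("oid-wheel")}
first = deref(car["wheel"])     # triggers the load
second = deref(car["wheel"])    # served from the descriptor

# Relocation touches only the descriptor; the referencing object is unchanged.
car["wheel"].target = dict(first)
after_move = deref(car["wheel"])
```

Every access goes through `deref`, which mirrors the double dereference whose cost the text identifies as the drawback of this technique.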
Direct software swizzling involves replacing each persistent pointer with a main memory pointer directly to the referenced object. While this approach overcomes the extra time cost of double-indirection incurred by indirect software swizzling, it has the disadvantage that unloading or relocating referenced objects invalidates the direct main memory pointers stored in depassivated objects. This can make efforts to relocate or reclaim memory from objects that are referenced by depassivated objects difficult or impossible.
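The invalidation problem can be illustrated with a small Python analogy. Python does not relocate objects in the way a memory manager does, so "relocation" is modeled here by creating a replacement object; the names `Wheel`, `patch_references`, and the holder dictionaries are hypothetical.

```python
class Wheel:
    def __init__(self, size):
        self.size = size


wheel = Wheel(17)
car_a = {"wheel": wheel}   # direct references, no intermediate descriptor
car_b = {"wheel": wheel}

# Model relocation: the object now lives in a new place and is updated there.
relocated = Wheel(wheel.size)
relocated.size = 18

# Holders of the direct reference still see the old location.
stale_size = car_a["wheel"].size


def patch_references(holders, old, new):
    """Repair direct references by visiting every object that may hold one."""
    patched = 0
    for holder in holders:
        for key, value in holder.items():
            if value is old:
                holder[key] = new
                patched += 1
    return patched


patched = patch_references([car_a, car_b], wheel, relocated)
```

The repair requires knowing and visiting every referencing object, which suggests why relocating or reclaiming directly referenced objects can be difficult in practice.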
Given the disadvantages of conventional swizzling techniques, a swizzling technique having a low time cost that facilitates unloading and relocating referenced objects would have significant utility.