The invention relates to computer systems for persistently storing data.
Computers typically store information persistently on a persistent storage medium such as a hard disk. Such a medium is called "persistent" because the data stored on it do not change unless and until they are explicitly modified by the computer. To modify persistently stored data, computers typically transfer the data from the persistent storage medium to a transient storage medium, such as a Random Access Memory (RAM). After changes are made to the data in the transient storage medium, the computer may replace the original data on the persistent storage medium with the modified data.
Application programs use a variety of mechanisms to store data persistently. Database systems, for example, typically provide interfaces that allow application programs to store, retrieve, and modify information in databases maintained by the database system. Each database system typically has an interface specially designed for communicating with that database system, and different database systems typically provide different interfaces. Applications may also store information persistently in flat files using an interface provided by a file system.
Object-oriented application programs typically model a problem domain using an "object model" that defines classes of objects representing elements of the problem domain. A class definition defines the class in terms of (1) the relationship of the class to other classes, (2) the data associated with objects in the class, and (3) the operations that can be performed on objects in the class. During execution of an object-oriented application program, instances of the classes in the object model, referred to as "objects," are produced and manipulated. For example, software used for biotechnology research might model individual genomes, genes, markers, chromosomes, genotypes, and alleles as objects of different classes. It is often desirable to persistently store representations of such objects.
The invention provides a persistence architecture that allows application programs to transparently access multiple persistent storage mechanisms through a single interface. The persistence architecture may be used, for example, by object-oriented application programs to persistently store objects. To carry out persistent storage transactions (e.g., store, retrieve, and modify), application programs make calls to methods provided by the persistence architecture rather than to routines provided by the interfaces of the underlying persistent storage mechanisms. Prior to running an object-oriented application program that uses a particular object model, the persistence architecture is configured to map object classes in the object model to particular persistent storage mechanisms. When the application program executes and makes calls to persistence architecture methods, the persistence architecture carries out the necessary transactions with the appropriate persistent storage mechanisms. In this way, the application program remains independent of the underlying persistent storage mechanisms used to store the application's objects.
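The arrangement described above can be illustrated with a minimal sketch. It is not the implementation of the invention; the names `InMemoryStore`, `PersistenceArchitecture`, `Gene`, and `Marker`, and the use of in-memory dictionaries in place of real database systems or flat files, are assumptions made purely for illustration.

```python
class InMemoryStore:
    """Hypothetical stand-in for one persistent storage mechanism."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def store(self, key, state):
        self.rows[key] = dict(state)

    def retrieve(self, key):
        return dict(self.rows[key])


class PersistenceArchitecture:
    """Single interface; the class-to-mechanism mapping is configured up front."""

    def __init__(self, class_map):
        # class_map: class name -> storage mechanism (assumed configuration shape)
        self._class_map = dict(class_map)

    def store(self, key, obj):
        mechanism = self._class_map[type(obj).__name__]
        mechanism.store(key, vars(obj))

    def retrieve(self, cls, key):
        state = self._class_map[cls.__name__].retrieve(key)
        obj = cls.__new__(cls)       # rebuild without re-running __init__
        obj.__dict__.update(state)
        return obj


class Gene:
    def __init__(self, symbol, chromosome):
        self.symbol = symbol
        self.chromosome = chromosome


class Marker:
    def __init__(self, name, position):
        self.name = name
        self.position = position


# Configuration step, performed before the application runs.
relational = InMemoryStore("relational")
flat_file = InMemoryStore("flat_file")
pa = PersistenceArchitecture({"Gene": relational, "Marker": flat_file})

# Application code calls only the persistence architecture.
pa.store("g1", Gene("geneA", "1"))
pa.store("m1", Marker("markerA", 10))
restored = pa.retrieve(Gene, "g1")
```

In this sketch the application never refers to `relational` or `flat_file` after configuration, which is the independence the paragraph describes.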
According to an aspect of the invention, a method for processing a storage mechanism-independent query includes identifying at least one persistent storage mechanism, from among at least two persistent storage mechanisms, that is capable of providing data for satisfying the query and deriving, from the identified persistent storage mechanisms, data satisfying the query.
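The two steps of this method, identifying capable mechanisms and then deriving data from them, might be sketched as follows. The `Mechanism` class, its `can_satisfy` and `fetch` methods, and the representation of a query as a class name plus a predicate are hypothetical choices for illustration only.

```python
class Mechanism:
    """Hypothetical storage mechanism holding records grouped by class name."""

    def __init__(self, name, tables):
        self.name = name
        self._tables = tables

    def can_satisfy(self, class_name):
        return class_name in self._tables

    def fetch(self, class_name, predicate):
        return [dict(r) for r in self._tables[class_name] if predicate(r)]


def process_query(class_name, predicate, mechanisms):
    # Step 1: identify the mechanisms capable of providing data for the query.
    capable = [m for m in mechanisms if m.can_satisfy(class_name)]
    if not capable:
        raise LookupError(f"no storage mechanism can satisfy {class_name!r}")
    # Step 2: derive the data satisfying the query from those mechanisms.
    results = []
    for mechanism in capable:
        results.extend(mechanism.fetch(class_name, predicate))
    return results


db_a = Mechanism("db_a", {"Gene": [{"symbol": "geneA", "chromosome": "1"}]})
db_b = Mechanism("db_b", {
    "Gene": [{"symbol": "geneB", "chromosome": "1"},
             {"symbol": "geneC", "chromosome": "2"}],
    "Marker": [{"name": "markerA"}],
})

# The query itself names no storage mechanism.
hits = process_query("Gene", lambda r: r["chromosome"] == "1", [db_a, db_b])
markers = process_query("Marker", lambda r: True, [db_a, db_b])
```

The caller states only what it wants; which of the two mechanisms contribute results is decided inside `process_query`.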
According to an additional aspect of the invention, a method for reflecting a change to the state of a storage mechanism-independent data structure in persistent storage media comprises identifying at least one persistent storage mechanism, from among at least two persistent storage mechanisms for storing information on the persistent storage media, that is designated for reflecting the state of the data structure and engaging in physical transactions with the identified persistent storage mechanisms to reflect the change to the state of the data structure.
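A minimal sketch of this aspect, under stated assumptions, is given below. The `Store` class, the `designations` table, and the function `reflect_change` are hypothetical, and each "physical transaction" is reduced to a dictionary write; a real system would also need to address atomicity across mechanisms, which this sketch does not attempt.

```python
class Store:
    """Hypothetical storage mechanism that records the writes it receives."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def write(self, key, state):
        self.rows[key] = dict(state)


def reflect_change(class_name, key, new_state, designations):
    # Identify the mechanisms designated for this kind of data structure.
    targets = designations.get(class_name, [])
    if not targets:
        raise LookupError(f"no storage mechanism designated for {class_name!r}")
    # Engage in a transaction with each identified mechanism.
    for store in targets:
        store.write(key, new_state)
    return [store.name for store in targets]


primary = Store("primary_db")
archive = Store("archive_files")
other = Store("other_db")
designations = {"Genotype": [primary, archive], "Allele": [other]}

touched = reflect_change(
    "Genotype", "gt1", {"allele_1": "A", "allele_2": "G"}, designations
)
```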
According to an additional aspect of the invention, a computer program product residing on a computer readable medium for processing a storage mechanism-independent query comprises instructions for causing a computer to identify at least one persistent storage mechanism, from among at least two persistent storage mechanisms, that is capable of providing data for satisfying the query and derive, from the identified persistent storage mechanisms, data satisfying the query.
The invention provides several advantages. One advantage is that the persistence architecture de-couples an application's object model from the persistent storage mechanisms used to persistently store objects in the object model. A number of benefits result from this de-coupling. Development of object models can take place independently of the establishment of mappings between objects in the object models and persistent storage mechanisms. As a result, application programmers can design object models without knowing which persistent storage mechanisms will be used to store objects in the object model. Mappings between object classes and persistent storage mechanisms may, for example, be established after object models have been designed. Furthermore, the tasks of object model design and persistent storage mapping can be assigned to different programmers or organizational units.
Similarly, an established mapping between object classes and persistent storage mechanisms can be changed without requiring changes to be made to object models or to the application programs that deploy them. Because mappings may be changed at runtime, changes to mappings need not require re-compilation or re-linking of application programs.
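One way a mapping could be changed at runtime without touching application code is to hold the mapping as data that the persistence layer reloads. The sketch below assumes a JSON mapping format and hypothetical names (`Persistence`, `load_mapping`, `application_code`); none of these come from the text, and the sketch does not migrate data already stored under the old mapping.

```python
import json


class Persistence:
    """Hypothetical persistence layer whose class mapping is plain data."""

    def __init__(self, stores):
        self._stores = stores      # mechanism name -> dict acting as a store
        self._mapping = {}         # class name -> mechanism name

    def load_mapping(self, config_text):
        # Replacing the mapping requires no change to application code.
        self._mapping = json.loads(config_text)

    def store(self, class_name, key, state):
        mechanism_name = self._mapping[class_name]
        self._stores[mechanism_name][key] = dict(state)


def application_code(persistence, key):
    # Identical before and after the remapping below.
    persistence.store("Marker", key, {"position": 10})


stores = {"relational": {}, "flat_file": {}}
persistence = Persistence(stores)

persistence.load_mapping('{"Marker": "relational"}')
application_code(persistence, "m1")

# The mapping is changed while the program is running.
persistence.load_mapping('{"Marker": "flat_file"}')
application_code(persistence, "m2")
```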
Another advantage of the persistence architecture is that it allows objects to be distributed across persistent storage mechanisms in a way that is transparent to application programmers. For example, different object classes in an object model may be persistently stored using different database systems. Similarly, data contained in a single object may be spread across multiple database systems. The application programmer implements persistent storage capabilities in an application using the interface provided by the persistence architecture, without regard to the way in which objects and object classes are distributed among persistence mechanisms. As a result, the distribution of objects and object classes can change without requiring changes to application programs using the persistence architecture.
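The case in which data contained in a single object is spread across multiple mechanisms can be sketched with a per-field mapping. The `FIELD_MAP` table, the store names, and the `save` and `load` functions are assumptions for illustration; dictionaries again stand in for real database systems.

```python
# Hypothetical per-field mapping: class name -> field -> mechanism name.
FIELD_MAP = {
    "Gene": {
        "symbol": "annotations_db",
        "chromosome": "annotations_db",
        "sequence": "sequence_store",
    }
}


def save(class_name, key, state, stores):
    # Route each field of the object to the mechanism mapped to it.
    for field, value in state.items():
        store_name = FIELD_MAP[class_name][field]
        stores[store_name].setdefault(key, {})[field] = value


def load(class_name, key, stores):
    # Reassemble the object state from every mechanism that holds a part.
    state = {}
    for field, store_name in FIELD_MAP[class_name].items():
        state[field] = stores[store_name][key][field]
    return state


stores = {"annotations_db": {}, "sequence_store": {}}
gene_state = {"symbol": "geneA", "chromosome": "1", "sequence": "ACGT"}

save("Gene", "g1", gene_state, stores)
round_trip = load("Gene", "g1", stores)
```

Code calling `save` and `load` sees one object; the split between the two stores is visible only in `FIELD_MAP`, so the distribution can change without changing the callers.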
The ability to distribute data across multiple persistent storage mechanisms and to change the distribution of data is particularly advantageous when different persistent storage mechanisms are best-suited for different kinds of data or for different applications. In such cases, the persistence architecture may be configured to persistently store each kind of data using an optimal persistent storage mechanism. This allows applications to take advantage of the strengths of different persistent storage mechanisms without being tightly coupled to the particular interfaces provided by the different persistent storage mechanisms.
By providing a common interface between application objects and multiple persistent storage mechanisms, the persistence architecture reduces overall application development time by allowing application developers to focus on designing object models, rather than on the details of persistent storage. Furthermore, the persistence architecture reduces training time because it does not require application programmers to be familiar with multiple persistent storage mechanism interfaces. If the persistence architecture is used for all persistent storage within an application or within a suite of applications, application programmers need only learn the protocol of the persistence architecture, regardless of which or how many underlying persistent storage mechanisms are used for persistent storage. Specific object-to-storage medium mappings are left to the persistence architecture and are hidden from the application programmer. After an application programmer has learned how to use the persistence architecture to persistently store objects, the programmer can use the same knowledge to incorporate persistent storage capabilities into many applications without additional training.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The techniques described below may be implemented in computer hardware or software, or a combination of the two. However, the techniques are not limited to any particular hardware or software configuration; they may find applicability in any computing or processing environment that may be used for persistent storage of data. Preferably, the techniques are implemented in computer programs executing on programmable computers that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and one or more output devices. Program code is applied to data entered using the input device to perform the functions described and to generate output information. The output information is applied to the one or more output devices.
The techniques described below are preferably implemented in a high level procedural or object oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language.
Each such computer program is preferably stored on a storage medium or device (e.g., CD-ROM, hard disk or magnetic diskette) that is readable by a general or special purpose programmable computer for configuring and operating the computer when the storage medium or device is read by the computer to perform the procedures described in this document. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner.