1. Field of the Invention
The present invention relates generally to a system and method for managing a dual-level data storage system in a transaction processing system. More particularly, the present invention relates to synchronization of a volatile data cache with a persistent data store. Still more particularly, the invention relates to the management of data in a distributed object oriented transaction processing system where object data is maintained in a volatile cache and periodically committed to a persistent data store.
2. Background and Related Art
Computer implemented transaction processing systems are used for critical business tasks in a number of industries. A transaction defines a single unit of work that must be either fully completed or fully purged without action. For example, in the case of a bank automated teller machine (ATM) from which a customer seeks to withdraw money, the actions of issuing the money, reducing the balance of money on hand in the machine, and reducing the customer's bank balance must either all occur or none of them may occur. Failure of one of the subordinate actions would lead to inconsistency between the records and actual occurrences.
Distributed transaction processing involves a transaction that affects resources at more than one physical or logical location. In the above example, an ATM transaction affects resources managed at the local ATM device as well as bank balances managed by a bank's main computer. A distributed transaction may not be physically distributed but may involve cooperating tasks that must be completed in synchrony for successful transaction completion.
The X/Open Company Limited has promulgated a guide that describes one model for implementing distributed transaction processing. The X/Open Guide, Distributed Transaction Processing Reference Model, October 1991, discusses the components of a distributed transaction system and the interrelationships between them. The X/Open Distributed Transaction Processing Model (the DTP Model) describes three main components: an Application Program (AP), a Transaction Manager (TM), and one or more Resource Managers (RMs). The Application Program uses and modifies the resources controlled by one or more of the Resource Managers. The Transaction Manager is responsible for global transactions and coordinates the decision whether to commit or roll back the actions taken by the Resource Managers. (Commit causes the resources to be updated, while rollback causes all work to be discarded, returning the resources to the state they were in upon transaction initiation.) The Resource Managers manage specific resources. Resource Managers may include a database management system (DBMS), a file system, or a similar resource.
Object oriented programming systems are designed to increase the efficiency of program development by enabling object reuse and simplifying system maintenance through clear separation of function. Each object in an object oriented system encapsulates the data for that object and the procedures or methods for operating on that data. Encapsulation means that the data for an object can be manipulated only by that object using the defined methods. Object oriented systems also implement object inheritance. Inheritance allows a more specific object to be derived from a general object. The more specific object can "inherit" all of the data and methods of the parent object, but can override selected data and methods and add others to implement its unique function.
The application of object oriented techniques to transaction processing systems raises many new issues but offers opportunities to increase system efficiency through the use of object oriented principles. The Object Management Group, Inc. (OMG) has established standards for interoperable object oriented systems. The overall architecture defined by OMG is the Common Object Request Broker Architecture (CORBA). CORBA defines the interactions between objects, and in particular, between distributed objects in different computer systems. OMG has accepted submission of a proposal to standardize transaction processing in object oriented systems. This submission, entitled the Object Transaction Service (OTS), sets forth the requirements for object services necessary to implement a transaction processing system. The OTS specification uses many of the unique capabilities of object oriented systems. The OTS model, however, is designed to be interoperable with the X/Open DTP model and with existing procedural transaction processing systems.
The X/Open DTP model describes, and many commercial transaction processing systems use, a "two phase commit" to decide whether or not to commit the changes made by a transaction. In the first phase, the transaction manager determines whether each of the resource managers believes it is able to successfully commit the changes made by the transaction. If any resource manager indicates that it cannot, or fails to respond, the transaction manager causes the changes to be rolled back in each resource manager. If all of the responses are positive, then in the second phase the transaction manager orders all of the resource managers to commit the transaction.
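The two phase commit decision described above can be illustrated with a minimal sketch. The class InMemoryResourceManager and the function two_phase_commit are hypothetical names chosen for illustration; they are not taken from the X/Open DTP model or from any product. The sketch assumes that a resource manager that fails to respond surfaces as an exception in the transaction manager, which is then counted as a negative vote.

```python
class InMemoryResourceManager:
    """Hypothetical resource manager that only records its final state."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase one vote: True means "I believe I can commit".
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(resource_managers):
    """Return True if every resource manager was committed, else False."""
    managers = list(resource_managers)

    # Phase one: collect a vote from each resource manager.
    all_prepared = True
    for rm in managers:
        try:
            vote = rm.prepare()
        except Exception:
            # Assumption: a failure to respond shows up as an exception
            # and is treated as a negative vote.
            vote = False
        if not vote:
            all_prepared = False
            break

    # Phase two: commit everywhere on unanimous success, otherwise
    # roll back in every resource manager.
    for rm in managers:
        if all_prepared:
            rm.commit()
        else:
            rm.rollback()
    return all_prepared
```

In this sketch a single negative vote is enough to roll back every participant, including those that voted positively.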
Object oriented systems frequently are implemented using a dual storage model for storing data. The dual storage model uses a first level of persistent storage, such as hard disk, non-volatile memory, or read/write CD-ROM, to maintain permanent data. The permanent data in a transaction processing system is maintained by a resource manager such as a database management system (DBMS). A second level of volatile data storage exists in association with each of the objects. This second-level, volatile data is accessed more rapidly by the application through the available object methods. The data at this second level is frequently referred to as "cached data."
Object data that is added or changed in the system by changing the cached data must eventually be migrated to persistent storage. Data migration can occur through various algorithms, such as periodic flushing of the data cache or explicit program instruction to write the data to persistent storage. Flushing the data to persistent storage consumes considerable system resources, and efficient system design requires that object data cache flushes be minimized.
Distributed transaction processing systems, such as that discussed in the present application, present unique problems to a dual storage system. Flushing of the data to the persistent storage results in an update of that storage. The persistent data storage frequently is managed in a transaction processing system by a resource manager as discussed above. The resource manager controls updates to data including the commit/rollback processing. Once data for a particular transaction has completed termination ("commit" or "rollback") processing for that resource manager, the resource manager will not accept additional data from that transaction for updating the persistent store. If volatile data remains in an object data store, that data will be lost because normal cache migration to persistent storage will be refused by the resource manager.
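The loss of cached data described above can be made concrete with a small sketch. PersistentStore, CachedObject, and TransactionEndedError are hypothetical names introduced only for illustration. As a simplification, the store applies writes immediately and models only the refusal of updates after a transaction has ended; rollback is omitted.

```python
class TransactionEndedError(Exception):
    """Raised when an update arrives for a transaction that has ended."""


class PersistentStore:
    """Hypothetical resource manager guarding a persistent data store."""

    def __init__(self):
        self.data = {}
        self.ended = set()

    def write(self, txn_id, key, value):
        # Updates are refused once the transaction has completed
        # termination processing for this resource manager.
        if txn_id in self.ended:
            raise TransactionEndedError(txn_id)
        self.data[key] = value

    def commit(self, txn_id):
        self.ended.add(txn_id)


class CachedObject:
    """Hypothetical object holding volatile (cached) data."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def set(self, key, value):
        self.cache[key] = value  # volatile only, not yet persistent

    def flush(self, txn_id):
        for key, value in self.cache.items():
            self.store.write(txn_id, key, value)
        self.cache.clear()
```

If commit is called on the store before the object flushes, the flush raises TransactionEndedError and the cached value never reaches the store. Flushing first and committing second preserves the data.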
Procedural transaction processing systems (those that are not object-oriented) have addressed this problem of coordinating changes to underlying data stores during transaction completion. The solution for procedural systems is significantly easier than that required for object oriented systems due to the static nature of the procedural system commit tree. The commit tree is used by the system during the two-phase commit process. Prepare and commit messages are sent from the transaction manager and transmitted to each element of the tree following the tree hierarchy. The commit tree in a procedural system is always built with the persistent data storage at the bottom of the tree. This static commit tree ensures that the commit requests are received by upper level resources (volatile storage) before they are received by the underlying persistent resource manager. This allows the cache to flush the contents of volatile storage into the persistent resource manager when the transaction is ended, because the cache is guaranteed to be committed before the resource manager.
Procedural transaction processing systems also tend to have the transaction manager, the cached data, and the underlying resource manager co-located on the same physical computer system. Co-location has allowed for the alternate cache synchronization solution of notifying all data storage "objects" of the impending data commitment or rollback so that each can flush cached data. This approach is much less satisfactory, however, where a distributed transaction processing system is implemented. Warning all objects in a distributed transaction processing system is not practical due to the communication costs associated with sending each distributed object a message and the resultant loss of performance.
Distributed object-oriented systems pose special problems not found in traditional procedural transaction processing systems. The dynamic structure of the commit tree in object based systems contributes to these special problems. The objects involved in a transaction, and hence needing to be in the commit tree, change dynamically based on the flow of messages for a particular transaction instance. While this dynamic structure of objects is desirable since it provides tremendous programming flexibility, it does not guarantee a commit structure that has the underlying persistent data store (resource manager) at the bottom of the tree. By failing to ensure this relationship, the situation could exist where the persistent data store receives a commit or rollback request before a volatile object that contains data meant to be flushed to this persistent data store. In this situation, the cache cannot be flushed into the persistent resource manager as the resource manager considers the transaction complete.
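The contrast between a static and a dynamic commit tree can be sketched as follows. CommitNode, delivery_order, and cache_precedes_store are hypothetical names, and the sketch assumes that messages following the tree hierarchy reach a parent before its children and reach children in the order in which they joined the tree. Actual systems may order delivery differently.

```python
class CommitNode:
    """Hypothetical element of a commit tree."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def delivery_order(node):
    """Names in the order a commit message reaches them (parent first)."""
    order = [node.name]
    for child in node.children:
        order.extend(delivery_order(child))
    return order


def cache_precedes_store(order, cache="object cache", store="resource manager"):
    """True if the cache hears about the commit before the store does."""
    return order.index(cache) < order.index(store)


# Static tree: the persistent store is always at the bottom.
static_tree = CommitNode(
    "transaction manager",
    [CommitNode("object cache", [CommitNode("resource manager")])],
)

# Dynamic tree: here the store happened to join the transaction first,
# so it is a sibling that precedes the object cache.
dynamic_tree = CommitNode(
    "transaction manager",
    [CommitNode("resource manager"), CommitNode("object cache")],
)
```

Under these assumptions the static tree delivers the commit to the object cache before the resource manager, while the dynamic tree delivers it to the resource manager first, which is the ordering that leaves the cache unable to flush.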
An alternate solution to the dual-level storage problem is to use a single-level storage model instead. The IBM OS/400 operating system and the Object Design Inc. system implement single level stores. In these systems the storage is managed so that writing to volatile storage is guaranteed to result in writing to persistent store regardless of transaction activity.
The technical problem therefore exists to implement a distributed transaction processing system that is able to efficiently ensure that all cached object data is flushed to persistent storage before transactions are committed.
The technical problem includes a need to implement a distributed transaction processing system that causes object data cache warning messages to be sent only to those objects that have volatile data that could be affected by the transaction, and that causes exactly one cache synchronization message to be sent to each distributed node that contains one or more objects requiring data synchronization. The cache synchronization message flows must complete before commit messages are sent. Responsibility for flushing data must reside with the objects maintaining the data rather than with the client application.
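The message-flow requirements stated above can be illustrated with a short sketch. It shows only the requirements (one synchronization message per node that holds objects with volatile data, all sent before any commit message) and is not a description of any particular mechanism. The functions plan_sync_messages and end_transaction, the tuple layout of a registration, and the send callback are all hypothetical.

```python
from collections import defaultdict


def plan_sync_messages(registrations):
    """Group objects with volatile data by the node they live on.

    registrations: iterable of (node, object_name, has_volatile_data).
    Nodes whose objects hold no volatile data receive no entry.
    """
    per_node = defaultdict(list)
    for node, object_name, has_volatile_data in registrations:
        if has_volatile_data:
            per_node[node].append(object_name)
    return dict(per_node)


def end_transaction(registrations, send):
    """Send one synchronization message per affected node, then commit."""
    registrations = list(registrations)

    # Exactly one cache synchronization message per node that needs it.
    for node, objects in plan_sync_messages(registrations).items():
        send(node, "synchronize", tuple(objects))

    # Commit messages are sent only after all synchronization messages.
    # Simplification: every participating node receives a commit message.
    for node in sorted({node for node, _, _ in registrations}):
        send(node, "commit", ())
```

For example, with two objects holding volatile data on one node, one object with no volatile data on a second node, and one object holding volatile data on a third node, this sketch sends two synchronization messages followed by three commit messages.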