Computer-implemented transaction processing systems are used for critical business tasks in a number of industries. A transaction defines a single unit of work that must either be fully completed or fully backed out, leaving no effect. For example, in the case of a bank automated teller machine (ATM) from which a customer seeks to withdraw money, the actions of issuing the money, reducing the balance of money on hand in the machine and reducing the customer's bank balance must all occur or none of them may occur. Failure of one of the subordinate actions would lead to inconsistency between the records and the actual occurrences.
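The all-or-nothing property of the ATM example can be illustrated with a minimal in-memory sketch. The function name `withdraw`, the exception `InsufficientFunds` and the dictionary keys are hypothetical and chosen only for illustration; a real system would act on physical cash and persistent records rather than a Python dictionary.

```python
class InsufficientFunds(Exception):
    """Raised when the machine or the account cannot cover the amount."""


def withdraw(state, amount):
    """Apply all three sub-actions of a withdrawal, or none of them.

    `state` is a dict with the hypothetical keys 'cash_in_machine',
    'cash_dispensed' and 'account_balance'.
    """
    snapshot = dict(state)  # copy of the state at transaction initiation
    try:
        if state["cash_in_machine"] < amount:
            raise InsufficientFunds("machine")
        state["cash_in_machine"] -= amount   # reduce the money on hand
        state["cash_dispensed"] += amount    # issue the money (simulated)
        if state["account_balance"] < amount:
            raise InsufficientFunds("account")
        state["account_balance"] -= amount   # reduce the customer's balance
    except InsufficientFunds:
        # One sub-action failed: discard every change made so far.
        state.clear()
        state.update(snapshot)
        return False
    return True
```

When any sub-action fails, the state is restored to the snapshot, so the records never show a partially completed withdrawal.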
Distributed transaction processing involves a transaction that affects resources at more than one physical or logical location. In the above example, an ATM transaction affects resources managed at the local ATM device as well as bank balances managed by a bank's main computer. A distributed transaction may not be physically distributed but may involve cooperating tasks that must be completed in synchrony for successful transaction completion.
The X/Open Company Limited (X/Open is a trademark of X/Open Company Ltd.) has promulgated a guide that describes one model for implementing distributed transaction processing. The X/Open Guide, Distributed Transaction Processing Reference Model, November 1993, ISBN 1-85912-819-9, discusses the components of a distributed transaction system and the interrelationships between them. The X/Open Distributed Transaction Processing Model (the DTP Model) describes three main components: an Application Program (AP), a Transaction Manager (TM), and one or more Resource Managers (RMs). The Application Program uses and modifies the resources controlled by one or more of the Resource Managers. The Transaction Manager is responsible for global transactions and coordinates the decision whether to commit or roll back the actions taken by the Resource Managers. (Commit causes the resources to be updated, while roll back causes all work to be discarded, returning the resources to the state they were in upon transaction initiation.) The Resource Managers manage specific resources. Resource Managers may include a database management system (DBMS), a file system, or a similar resource.
Object oriented programming systems are designed to increase the efficiency of program development by enabling object reuse and simplifying system maintenance through clear separation of function. Each object in an object oriented system encapsulates the data for that object and the procedures or methods for operating on that data. Encapsulation means that the data for an object can be manipulated only by that object using the defined methods.
Object oriented systems also implement object inheritance. Inheritance allows a more specific object to be derived from a general object. The more specific object can "inherit" all of the data and methods of the parent object. It can override selected data and methods and add others to implement its unique function.
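Encapsulation and inheritance as described above can be sketched briefly. The class names `Account` and `OverdraftAccount` are hypothetical and not taken from any system discussed here. Python marks internal data only by the leading-underscore convention, whereas the systems described enforce encapsulation more strictly.

```python
class Account:
    """General object: its balance is manipulated only through its methods."""

    def __init__(self, balance=0):
        self._balance = balance  # internal data (underscore marks it private)

    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    def balance(self):
        return self._balance


class OverdraftAccount(Account):
    """More specific object: inherits deposit() and balance(),
    overrides withdraw() and adds an overdraft limit."""

    def __init__(self, balance=0, limit=100):
        super().__init__(balance)
        self._limit = limit  # data added by the derived object

    def withdraw(self, amount):
        if amount > self._balance + self._limit:
            raise ValueError("overdraft limit exceeded")
        self._balance -= amount
```

The derived object reuses the parent's `deposit` and `balance` methods unchanged and replaces only the behaviour that differs.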
One such object oriented system is System Object Model (SOM) from IBM Corporation. In SOM, all applications using objects run in a single address space in which the objects are also located. A development within SOM is a framework of object classes called Distributed System Object Model (DSOM). In DSOM, applications (running in clients) in one address space may access objects in another address space (such as an address space belonging to a server). These address spaces may be in the same or different systems. In fact, the systems need not be running the same platform. For example, a client application running in an address space on an OS/2 system may access an object that is located in an address space on an AIX/6000 system, or vice versa. Both SOM and DSOM are described in "SOMobjects: A Practical Introduction to SOM and DSOM", published by IBM Corporation, Copyright 1994, Order no. GG24-4357-00. (SOM, OS/2, AIX and SOMobjects are trademarks of IBM Corporation).
The application of object oriented techniques to transaction processing systems raises many new issues but offers opportunities to increase system efficiency through the use of object oriented principles. The Object Management Group, Inc. (OMG) has established standards for interoperable object oriented systems. The overall architecture defined by OMG is the Object Management Architecture (OMA). A central component of OMA is the Object Request Broker that enables objects to send messages to other objects. The Common Object Request Broker Architecture (CORBA) defines the interactions between objects, and in particular, between distributed objects in different computer systems. The Object Request Broker (ORB) provides location transparency and hides the details of communication between objects. CORBA is specified in the OMG publication entitled, The Common Object Request Broker: Architecture and Specification, March 1992.
OMG has accepted a specification to standardise transaction processing in object oriented systems. This specification, entitled the Object Transaction Service (OTS) Specification, OMG document 94.8.4, sets forth the requirements for object services necessary to implement a transaction processing system. The OTS comprises a set of standard interfaces which object oriented applications may use to achieve recoverable behaviour. It also comprises a set of standard interfaces with the ORB to allow propagation of transaction context between ORB processes and to allow recovery of OTS state at server startup. The OTS specification uses many of the unique capabilities of object oriented systems. The OTS model, however, is designed to be interoperable with the X/Open DTP model and with existing procedural transaction processing systems.
The X/Open DTP model describes, and many commercial transaction processing systems use, what is termed a "two phase commit" to decide whether or not to commit the changes made by a transaction. In the first phase, the transaction manager determines whether each of the resource managers believes it is able to successfully commit the changes made by the transaction. If any resource manager indicates that it cannot, or fails to respond, the transaction manager causes the changes to be rolled back in each resource manager. If all of the responses are positive, then, in the second phase, the transaction manager orders all of the resource managers to commit the transaction.
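The two phases can be sketched as follows. This is a simplified, single-process illustration, not the X/Open interface: the class `ResourceManager`, its methods and the function `two_phase_commit` are hypothetical names, and a raised exception stands in for a resource manager that fails to respond.

```python
class ResourceManager:
    """Hypothetical resource manager that votes and then obeys the decision."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase one: report whether the changes can be committed."""
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(resource_managers):
    """Return True if the transaction committed, False if it rolled back."""
    # Phase one: collect a vote from every resource manager.
    all_yes = True
    for rm in resource_managers:
        try:
            if not rm.prepare():
                all_yes = False
                break
        except Exception:
            # Treated here as a failure to respond.
            all_yes = False
            break

    # Phase two: apply the same decision to every resource manager.
    for rm in resource_managers:
        if all_yes:
            rm.commit()
        else:
            rm.rollback()
    return all_yes
```

A single negative vote is enough to roll back every participant, which is what keeps the resources mutually consistent.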
A Restart Service is a service which provides facilities to restart entities after a failure. The entities restarted and the definition of a failure of those entities vary from one service to another. A Restart Service typically has no knowledge of the state of the entities.
A Recovery Service provides facilities for the specific state of an entity to be reconstructed at a specific time. A Recovery Service typically has no way to trigger its restart after a failure. Within OTS, the Recovery Service keeps a record of the changes made to the state of recoverable objects. That record of changes is used to undo updates made in the event of a transaction rollback or to redo updates in the event of a transaction being committed. The Recovery Service also stores the global identifier for the transaction and the identities of all of the participants in a transaction. The Recovery Service stores this information persistently, with the state data being written to persistent storage at times specified by the OTS.
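The record of changes kept by a Recovery Service can be illustrated with a minimal sketch. The class `RecoveryLog` and its methods are hypothetical and are not part of the OTS specification. The log here is held in memory for brevity, whereas the service described above writes it to persistent storage. A `before` value of `None` is assumed to mean that the key did not previously exist.

```python
class RecoveryLog:
    """Hypothetical log of before/after values for one transaction."""

    def __init__(self):
        self.global_id = None
        self.participants = []
        self.changes = []  # list of (key, before, after) tuples

    def begin(self, global_id, participants):
        """Store the global identifier and the identities of the participants."""
        self.global_id = global_id
        self.participants = list(participants)
        self.changes = []

    def record(self, key, before, after):
        """Record one change made to the state of a recoverable object."""
        self.changes.append((key, before, after))

    def undo(self, store):
        """Roll back: restore the old values, most recent change first."""
        for key, before, _after in reversed(self.changes):
            if before is None:
                store.pop(key, None)
            else:
                store[key] = before

    def redo(self, store):
        """Commit: reapply the new values in their original order."""
        for key, _before, after in self.changes:
            store[key] = after
```

Undo walks the log backwards so that a key changed several times ends at its earliest recorded value, while redo walks forwards so that it ends at its latest.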
After a failure, when a DSOM server process is restarted, OTS is called by means of a "hook" into the DSOM startup code. OTS uses the Recovery Service to read back information about all transactions which were in-doubt prior to the failure. This information is then used to reconstruct the internal state of objects to allow the in-doubt transactions to be resolved by either being committed or by being rolled back. Starting the DSOM server process causes an instance of OTS to be created in each DSOM server process.
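The restart sequence described above can be sketched in outline. This does not reproduce the DSOM startup code or any OTS interface: `start_server`, `make_recovery_hook` and the `outcome_of` callback are hypothetical names, and the in-doubt records are plain dictionaries standing in for the information read back from the Recovery Service.

```python
def start_server(startup_hooks):
    """Run each registered hook once during startup and return the results."""
    return [hook() for hook in startup_hooks]


def make_recovery_hook(in_doubt_records, outcome_of):
    """Build a startup hook that resolves every in-doubt transaction.

    `in_doubt_records` stands in for the data read back from the recovery
    log; `outcome_of` is an assumed callback returning "commit" or
    "rollback" for a given global transaction identifier.
    """
    def hook():
        resolved = {}
        for record in in_doubt_records:
            tx_id = record["global_id"]
            if outcome_of(tx_id) == "commit":
                resolved[tx_id] = "committed"
            else:
                resolved[tx_id] = "rolled_back"
        return resolved
    return hook
```

For example, with two in-doubt transactions of which only one was decided as a commit before the failure, the hook commits that one and rolls back the other.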
U.S. Pat. No. 5,151,987 discloses a method of recovering from unplanned failures in object oriented computing environments by storing recovery information in recovery objects. During recovery operations, methods present in that instance of an object use the recovery information to identify committable actions which were executed prior to the unplanned failure.
A problem with prior art recovery methods is that they do not define a general OMG system service to trigger execution of a pre-defined set of requests at startup or after a failure. The failures include hardware failures and server process failures. The pre-defined set of requests includes a defined startup/restart set of requests.
Another problem with prior art recovery methods is how to define a "specific" mechanism within the OMG/OTS that restarts failed processes so that recovery can occur in a timely manner after a failure. This recovery includes reconstruction of the execution state of objects associated with outstanding transactional work.