1. Field of the Invention
The invention relates to computer systems and computer software, and more particularly to highly available transaction management in computer systems.
2. Description of the Related Art
Some application programs, particularly business applications, may require that the results of sets of data-modifying operations be committed to permanent storage atomically, that is, either together or not at all, in order for the data to remain consistent and to maintain data integrity. Such a set of operations may be referred to as a transaction. An application may designate operations to be included in a transaction by including a statement to initiate a transaction, designating an identity for the transaction, and concluding the operations included in the transaction with a command to commit the database operations to permanent storage.
An example of an application in which a transaction may be beneficial is a banking application in which funds are transferred from one account to another. The application may accomplish the transfer by performing a withdrawal from one account and a deposit to another account. If the withdrawal operation completes but the deposit operation does not, the first account may reflect an improper balance. Conversely, if the deposit operation completes but the withdrawal fails, the second account may show an improper balance. In the case of such a set of interdependent operations, neither the withdrawal nor the deposit should complete unless both can complete. By including both the withdrawal and deposit operations in a transaction, the application may designate that the operations are required to complete atomically.
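The funds transfer described above may be sketched as a local transaction. The following is a minimal illustration only, assuming Python's standard sqlite3 module and a hypothetical accounts table; the table, the account identifiers, and the transfer function are illustrative names that do not come from the text above.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Withdraw from src and deposit to dst atomically (hypothetical helper)."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if the block raises an exception.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("unknown source account or insufficient funds")
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True
    except ValueError:
        # The withdrawal has already been rolled back at this point.
        return False


def balances(conn):
    """Return a mapping of account id to balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


# Assumed demo data: two accounts held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

In this sketch, a failed deposit causes the already-applied withdrawal to be rolled back, so neither account reflects a partial transfer.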
In some cases, a transaction may be limited in scope to operations that modify data in only one database on a single backend system. Such operations may be referred to as local transactions, and the database or backend resource manager may itself manage such transactions using a single-phase commit protocol. In other instances, a transaction may span multiple databases, backend systems, and/or resource managers. Sometimes a transactional message may need to be sent to another application after the database operation. Transactions involving multiple backend systems and/or multiple front-end participants may be referred to as distributed or global transactions. Global transactions may require transaction coordination by a transaction manager external to the backend systems involved in the transaction. The transaction manager may coordinate a global transaction using a two-phase commit protocol.
At some point during the execution of a global transaction, the application may issue a request to the transaction manager to commit the transaction. Since the transaction involves multiple data resources, the transaction manager may use a two-phase commit protocol to ensure transaction atomicity. Under a two-phase commit protocol, the transaction manager may query each participating data source as to whether it is prepared to commit the results of the transaction to permanent storage. The transaction manager may wait for responses from each participant, and when a full complement of affirmative responses has been received, may issue a commit request to each participant. The transaction manager may wait for “done” responses from each participant and may only mark the transaction as being completed upon receiving responses from all participants.
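The two-phase exchange described above may be sketched as follows. This is a simplified, in-memory illustration rather than an actual transaction manager: the Participant class, its state names, and the two_phase_commit function are hypothetical, and the rollback issued on a negative vote reflects the conventional behavior of the protocol rather than anything stated in the text above.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical stand-in for a data source taking part in the transaction."""
    name: str
    can_prepare: bool = True
    state: str = "active"

    def prepare(self) -> bool:
        # Phase 1: report whether the results can be committed.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self) -> str:
        # Phase 2: make the results permanent and acknowledge.
        self.state = "committed"
        return "done"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> str:
    """Coordinate one global transaction across all participants."""
    # Phase 1: query every participant and collect the full set of votes.
    votes = [p.prepare() for p in participants]
    if not all(votes):
        # Assumed handling: any negative vote aborts the whole transaction.
        for p in participants:
            p.rollback()
        return "aborted"

    # Phase 2: issue commit requests and wait for every "done" response.
    acks = [p.commit() for p in participants]
    # The transaction is marked completed only when all participants respond.
    return "completed" if all(a == "done" for a in acks) else "in-doubt"
```

For example, calling two_phase_commit with two participants that both vote affirmatively returns "completed", while a single participant constructed with can_prepare=False causes every participant to be rolled back.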
Since these communications may take time and failures may potentially occur in the midst of a transaction, the intermediate status of pending transactions may be logged. A log record may be generated for each “in-flight” transaction. These log records are referred to as transaction logs.
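One possible shape for such a log is sketched below. The record format, the status names, and the TransactionLog class are assumptions made for illustration, and an in-memory list stands in for the durable storage that an actual transaction log would require.

```python
import json


class TransactionLog:
    """Hypothetical append-only log of transaction status records."""

    def __init__(self):
        # Stand-in for durable storage; a real log would be written to disk.
        self._records = []

    def write(self, txid, status):
        """Append one status record for the given transaction."""
        self._records.append(json.dumps({"txid": txid, "status": status}))

    def in_flight(self):
        """Return the latest status of every transaction not yet completed."""
        latest = {}
        for line in self._records:
            record = json.loads(line)
            latest[record["txid"]] = record["status"]
        return {txid: s for txid, s in latest.items() if s != "completed"}
```

Replaying the records in order yields the last known status of each transaction, which is the information needed to determine which transactions were still in flight when a failure occurred.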
Given the amount of communication necessary to support a global transaction under a two-phase commit protocol for a large number of participants, faults and/or failures may occur in one or more of the participants that may affect the commitment of the transaction. A robust two-phase commit protocol may allow for recovery from most participant failures, but may not provide for recovery in the case of failure of the transaction manager. A transaction manager may be coordinating several transactions at any given time. Some of the transactions in flight at the moment of failure may be in an indeterminate state, depending upon the phase of the transactions when the transaction manager failure occurs. Inability to recover and revert the states of the transactions in all participants may result in application hangs due to locked resources and other problems.