Many networks composed of servers, computers and/or databases use an algorithm known as the two-phase commit protocol. This protocol is a distributed algorithm that requires all resources in a distributed system to agree to commit a transaction before the transaction is committed. As a result, either all resources commit the transaction or all resources abort it, even in the case of network failures or node failures. The two phases of the algorithm are the prepare phase, in which a transaction manager attempts to prepare all the transaction resources, and the commit phase, in which the transaction manager completes the transaction at all resources. For more detail on the operation of the two-phase commit protocol, see, for example, “Transaction Processing: Concepts and Techniques” by Jim Gray and Andreas Reuter (ISBN 1-55860-190-2).
Completing a transaction using the two-phase commit protocol is relatively expensive in terms of performance. When more than one resource is interested in the outcome of the transaction, information concerning the resources involved and the transaction's outcome must be stored persistently, typically by writing data to disk, to ensure that the outcome of the transaction is preserved across a system failure. Furthermore, information must also be persistently stored by each resource manager involved in the transaction. Due to the expense of this logging, it is desirable to avoid it where possible.
As mentioned above, the first part of the two-phase commit protocol is the prepare phase. During prepare processing, each resource in the transaction is asked to vote on the transaction's outcome. A resource may vote to commit the transaction, vote to roll the transaction back, or vote read-only, i.e. the work done via that resource was read-only and as such it has no further interest in the transaction's outcome. When a resource is prepared and votes to commit the transaction, the resource manager must store information regarding the transaction persistently so that, in the event of a failure, it can still complete the transaction in the same direction as every other resource that was involved.
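The three possible prepare votes can be sketched as follows. This is a minimal illustration only: the names Vote, Resource, wrote_data, healthy and decide are hypothetical and do not correspond to any particular transaction manager, and the in-memory list merely stands in for persistent storage.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"
    READ_ONLY = "read-only"


class Resource:
    """Hypothetical resource manager used only for illustration."""

    def __init__(self, name, wrote_data, healthy=True):
        self.name = name
        self.wrote_data = wrote_data
        self.healthy = healthy
        self.log = []  # assumption: stands in for a log forced to disk

    def prepare(self, txid):
        if not self.healthy:
            # The resource cannot promise to commit.
            return Vote.ROLLBACK
        if not self.wrote_data:
            # Read-only work: no further interest, so nothing is logged.
            return Vote.READ_ONLY
        # A commit vote is a promise that must survive a failure,
        # so the prepared state is recorded before voting.
        self.log.append(("prepared", txid))
        return Vote.COMMIT


def decide(votes):
    """Return the outcome implied by a list of prepare votes."""
    if Vote.ROLLBACK in votes:
        return "rollback"
    return "commit"
```

In this sketch only a resource that votes to commit writes to its log; a read-only voter returns without persisting anything, which is the property exploited in the scenarios that follow.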
A relatively common scenario is one where there are two resources involved in a transaction: resource A, which has been used to perform a write, e.g., inserting a row into a table in a database; and resource B, which has only been used to perform a read. The order in which these resources are processed affects whether or not any logging is performed:
For example, if Resource A goes first:
1. Transaction commit processing begins.
2. There are two resources interested in the transaction's outcome so two-phase commit processing is required.
3. Resource A is instructed to prepare.
4. A write has been performed. The resource manager persists information about the transaction and votes to commit the transaction.
5. There are still two resources interested in the transaction's outcome, so prepare processing continues.
6. Resource B is instructed to prepare.
7. A read has been performed. No logging is required and the resource votes read-only.
8. There is one resource interested in the transaction's outcome that has been prepared. The transaction manager persists information about the resource involved and then directs it to commit.
In the reverse situation, when Resource B goes first:
1. Transaction commit processing begins.
2. There are two resources interested in the transaction's outcome so two-phase commit processing is required.
3. Resource B is instructed to prepare.
4. A read has been performed. The resource manager releases its read locks and votes read-only. Note that it has not had to persist any information about the transaction.
5. Resource B has indicated that it is no longer interested in the transaction's outcome, leaving only a single resource.
6. Resource A is instructed to perform a one-phase commit optimization.
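The two orderings above can be compared with a small simulation. This is a sketch under stated assumptions: Stats, Resource and commit_transaction are hypothetical names, only the forced log writes attributable to two-phase commit processing are counted, and rollback votes are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    prepare_calls: int = 0
    commit_calls: int = 0
    log_writes: int = 0  # 2PC state persisted by coordinator or resource


@dataclass
class Resource:
    name: str
    wrote_data: bool

    def prepare(self, stats):
        stats.prepare_calls += 1
        if not self.wrote_data:
            return "read-only"   # nothing is persisted
        stats.log_writes += 1    # resource manager persists prepared state
        return "commit"

    def commit(self, stats):
        stats.commit_calls += 1


def commit_transaction(resources):
    """Process resources in the given order and count the work done."""
    stats = Stats()
    pending = list(resources)
    prepared = []
    while pending:
        resource = pending.pop(0)
        if not pending and not prepared:
            # Only one interested resource is left and nothing has been
            # prepared: one-phase commit, with no 2PC logging.
            resource.commit(stats)
            return stats
        if resource.prepare(stats) == "commit":
            prepared.append(resource)
    if prepared:
        stats.log_writes += 1    # coordinator persists the outcome
        for resource in prepared:
            resource.commit(stats)
    return stats


a = Resource("A", wrote_data=True)
b = Resource("B", wrote_data=False)
a_first = commit_transaction([a, b])
b_first = commit_transaction([b, a])
```

Under these assumptions, a_first records two prepare calls, one commit call and two log writes, whereas b_first records one prepare call, one commit call and no log writes.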
The second scenario (when resource B goes first) significantly outperforms the first scenario. No logging is required by the transaction coordinator or by either of the resource managers in this case, and fewer calls are made to the resource managers: one prepare call and one commit call, versus two prepare calls and one commit call in the first scenario. For performance reasons it is therefore desirable for the resources to be processed in the order described in the second scenario above; however, this is difficult to achieve.
Some transaction coordinators make no attempt to order the resources and simply commit them in the order in which they were enlisted. This approach has the disadvantage that it offers no guarantee that the read-only resource will be processed first. Another known solution is to allow an application developer to specify an ordering priority at application development or deployment time, which is then used to order the resources during commit processing. This has the disadvantage that the nature of the work to be performed with each resource manager must be known in advance and must not change depending on the application's logic.
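The priority-based approach can be illustrated with the following sketch. The function commit_order, the priority mapping and the convention that lower numbers are processed first are assumptions made for illustration only; they do not describe any particular product's configuration.

```python
def commit_order(enlisted, priority, default=100):
    """Order resource names for commit processing.

    enlisted: names in the order the resources were enlisted.
    priority: developer-supplied mapping of name -> priority, fixed at
              development or deployment time (lower is processed first).
    Resources without a priority keep their enlistment order after the
    prioritized ones, because sorted() is stable.
    """
    return sorted(enlisted, key=lambda name: priority.get(name, default))


enlisted = ["A", "B"]

# No priorities configured: resources are processed as enlisted.
unordered = commit_order(enlisted, {})

# The developer expects B to be read-only and gives it precedence.
static_priority = {"B": 1, "A": 2}
ordered = commit_order(enlisted, static_priority)
```

Here unordered is ["A", "B"] and ordered is ["B", "A"]. Because static_priority is fixed in advance, the same order is produced even for a transaction in which the application happens to write via B and only read via A, which is the weakness noted above.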
Another solution is to perform a three-phase commit in which the first phase is to ask each resource for its anticipated prepare vote. While this vote may change, it can be used to order the prepare processing such that any resources that have indicated that they anticipate responding read-only are processed first. This solution has the disadvantage that it introduces additional communication flows with the resource managers involved in the transaction.
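The additional phase can be sketched as follows. The names Resource, anticipated_vote and order_for_prepare are hypothetical, and the message counter is included only to make the extra communication flow visible.

```python
class Resource:
    """Hypothetical resource manager that can report an anticipated vote."""

    def __init__(self, name, wrote_data):
        self.name = name
        self.wrote_data = wrote_data
        self.messages = 0  # flows received from the coordinator

    def anticipated_vote(self):
        # This is a hint only; the real vote is given later, at prepare.
        self.messages += 1
        return "commit" if self.wrote_data else "read-only"


def order_for_prepare(resources):
    """Ask every resource for its anticipated vote, then place the
    anticipated read-only voters first, keeping enlistment order otherwise."""
    hints = [(resource, resource.anticipated_vote()) for resource in resources]
    # False sorts before True, so read-only hints come first; sorted() is stable.
    ordered = sorted(hints, key=lambda pair: pair[1] != "read-only")
    return [resource for resource, _ in ordered]


a = Resource("A", wrote_data=True)
b = Resource("B", wrote_data=False)
c = Resource("C", wrote_data=True)
prepare_order = [r.name for r in order_for_prepare([a, b, c])]
extra_flows = a.messages + b.messages + c.messages
```

In this sketch prepare_order is ["B", "A", "C"], and extra_flows is 3: one additional flow per resource before prepare processing has even begun.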
Another improved two-phase commit is described in United States Patent Application Publication No. US 2003/0046298, which describes a transaction processing system providing an improved methodology for deciding whether to invoke the two-phase commit protocol. More particularly, a transaction is handled without the use of the two-phase commit protocol until the system determines that the transaction does, in fact, involve changes to more than one database. The methodology improves overall system performance by examining each transaction to determine whether it actually requires use of the two-phase commit protocol, before incurring the overhead associated with that protocol. Because only a small percentage of real-world transactions result in updates to more than one database, the methodology improves the overall performance of transaction processing systems considerably. However, the optimization provided by a system as described in US 2003/0046298 has no value in relation to a system receiving a transaction that relates to a plurality of resources. It is therefore an object of the invention to improve upon the known art.