The information systems of a modern-day enterprise (such as a corporation or government institution) are often responsible for managing, and performing automated tasks upon, large amounts of data. Persistent data is data that “exists” for extended periods of time (i.e., “it persists”). Persistent data is typically stored in a database so that it can be accessed as needed over the course of its existence. Here, complex “database software” (e.g., DB2, Oracle, and SQL Server) is often used to actually read the data and perhaps perform various intelligent functions with it. Frequently, persistent data can change over the course of its existence (e.g., through a series of reads and writes to the data). Moreover, multiple items of different persistent data may change as part of a single large-scale “distributed transaction”.
A distributed transaction is a transaction that involves more than one database or server. Distributed transactions frequently involve multiple databases accessed through multiple servers that are interconnected by a network. Because they span multiple databases, distributed transactions typically implement a comprehensive function that serves the enterprise's needs. For example, in the case of an airline, a single distributed transaction might be used to manage an internet connection to a potential customer who may reserve a particular seat on a particular flight. Here, note that a number of different databases may be involved in a single distributed transaction that is executed for the customer's experience with the airline's on-line ticketing and reservation system.
For example, assume the distributed transaction is expected to: 1) provide the potential customer with flight scheduling, pricing and seating information; 2) record the customer's name, address, credit card, and email information if any flight is reserved by the customer; 3) update the seating information for each seat reserved by the customer; 4) update the customer's frequent flier mileage records if the customer is registered in the airline's frequent flier program; 5) update the airline's accounting records to reflect the new revenue introduced by each flight reservation made by the customer; and, 6) invoice the customer using the customer's credit card information.
Here, a number of different databases may be involved in the distributed transaction such as: 1) a first database that keeps track of the airline's flight scheduling information; 2) a second database that keeps track of information specific to a particular flight such as seating information; 3) a third database that keeps track of flight pricing information; 4) a fourth database that keeps track of each customer's name, address and email information; 5) a fifth database that keeps track of each frequent flier's mileage; 6) a sixth database that keeps track of the airline's accounting records; and 7) a seventh database that keeps track of the airline's invoicing records.
FIGS. 1a and 1b depict how a distributed transaction is typically carried out by an enterprise's information system infrastructure. A protocol, referred to as the “two-phase commit” protocol, is used to ensure that either a distributed transaction's database updates are successfully completed in their entirety or the distributed transaction is not effected at all. By ensuring that database updates for a distributed transaction are either completely carried out or not carried out at all, incorrect database records are avoided (e.g., a seat being reserved for a reservation that is not actually made, a seat not being reserved for a reservation that is actually made, etc.). FIG. 1a corresponds to a two-phase commit protocol in which all of the distributed transaction's database updates are recorded. FIG. 1b corresponds to a two-phase commit protocol in which none of the distributed transaction's database updates are recorded.
The example of FIG. 1a shows four servers 1011 through 1014, each coupled to its own corresponding database 1021 through 1024; where, each of the databases is to be updated with new information upon completion of the distributed transaction's various calculations. That is, first a distributed transaction performs its various tasks and calculations with the data that it uses; then, upon completion of these tasks and calculations, the distributed transaction's databases are updated with any updates needed to be made to their respective data as a consequence of the distributed transaction's full execution.
Each server 1011 through 1014 includes its own resource manager module 1031 through 1034 that is responsible for communicating with a particular database. The resource manager can often be viewed as driver software that is used to send specific functional commands to the database software in response to requests/commands made by higher level software functions. The commands sent to a database are typically scripted in some form of database language (e.g., Structured Query Language (SQL)). Examples of resource managers include a Java Database Connectivity (JDBC) driver that is presently part of the J2EE platform and an Open Database Connectivity (ODBC) driver provided by Microsoft Corporation.
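The driver-like role of a resource manager can be illustrated with a minimal sketch. It assumes an in-memory SQLite database reached through Python's standard sqlite3 module; the class name ResourceManager, the seats table, and the flight, seat, and customer values are hypothetical and chosen only for illustration. The sketch shows a request from higher level software being turned into an SQL command that is sent to the database.

```python
import sqlite3


class ResourceManager:
    """Hypothetical driver-like wrapper that turns requests into SQL."""

    def __init__(self, connection):
        self.connection = connection

    def reserve_seat(self, flight, seat, customer):
        # The higher-level request is translated into an SQL command.
        self.connection.execute(
            "UPDATE seats SET customer = ? WHERE flight = ? AND seat = ?",
            (customer, flight, seat),
        )

    def seat_holder(self, flight, seat):
        # Read back who currently holds the seat, or None if nobody does.
        row = self.connection.execute(
            "SELECT customer FROM seats WHERE flight = ? AND seat = ?",
            (flight, seat),
        ).fetchone()
        return row[0] if row else None


def demo_resource_manager():
    # Illustrative schema and data only; not taken from any real system.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE seats (flight TEXT, seat TEXT, customer TEXT)")
    db.execute("INSERT INTO seats VALUES ('XY100', '12A', NULL)")
    rm = ResourceManager(db)
    rm.reserve_seat("XY100", "12A", "A. Customer")
    return rm.seat_holder("XY100", "12A")
```

Calling demo_resource_manager() reserves the seat through the wrapper and reads the holder back through the same connection.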
A transaction manager module 104 is responsible for, typically among other responsibilities, implementing the two-phase commit protocol with those resource managers that communicate with a database that is to be updated after a distributed transaction's calculations have been executed. In the examples of FIGS. 1a and 1b, each of databases 1021 through 1024 is assumed to require some portion of its data to be changed as a consequence of the distributed transaction's completed execution. The transaction manager 104 therefore coordinates a sequence of messaging exchanges between itself and resource managers 1031 through 1034. Examples of transaction manager modules include the API and logic behind the Java Transaction API (JTA) that is part of the J2EE platform and the Microsoft Distributed Transaction Coordinator (MSDTC) from Microsoft Corporation. A high level exemplary review of the messaging used to implement a two-phase commit protocol immediately follows.
Once a distributed transaction's calculations are completed so that all database changes to be made as a consequence of the transaction's execution are known (e.g., entry of a specific reserved seat on a specific flight, etc.), the first phase of the two-phase commit protocol begins with the transaction manager 104 receiving a “commit” command 1 from another portion of the distributed transaction's software (e.g., “a client” or “container” that executes higher level functions of the distributed transaction). In response to the received “commit” command 1, the transaction manager 104 sends “prepare” commands 2 to each of the resource managers 1031 through 1034. Note that, because a network 105 resides between the server 1011 that contains the transaction manager 104 and servers 1012 through 1014, those of the “prepare” commands 2 that are sent to servers 1012 through 1014 pass through network 105.
In response to the received “prepare” commands 2, each resource manager forwards a “prepare” command 3 to its corresponding database in the appropriate language format (e.g., SQL). Each database 1021 through 1024 performs what is akin to a “soft write” of the new, updated information. That is, for example, each database runs through all internal routines just short of actually writing the new, updated information. If a database detects no problem (e.g., an incompatibility in the data) just short of the actual write of the updated information, it reports a “ready” response. In FIG. 1a, each database reports a “ready” response 4; and, in FIG. 1b, databases 1021 through 1023 report a “ready” response while database 1024 reports a “rollback” response 11.
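The first-phase exchange described above can be sketched as follows. This is a simplified in-memory model, not an actual database interface: the Database class, its reject flag (used to simulate a detected problem), and the prepare_phase helper are hypothetical names, and the string labels merely reuse the reference numerals of the figures.

```python
class Database:
    """Hypothetical in-memory stand-in for one of databases 1021 through 1024."""

    def __init__(self, name, reject=False):
        self.name = name
        self.reject = reject   # assumption: simulates a problem found while preparing
        self.data = {}         # information that has actually been written
        self.pending = None    # "soft write": prepared but not yet written

    def prepare(self, update):
        if self.reject:
            return "rollback"
        # Stage the update without touching the written data.
        self.pending = dict(update)
        return "ready"


def prepare_phase(databases, updates):
    # Send "prepare" to every database and collect each response by name.
    return {db.name: db.prepare(updates[db.name]) for db in databases}


names = ["1021", "1022", "1023", "1024"]
updates = {name: {"seat 12A": "reserved"} for name in names}

# Situation of FIG. 1a: no database detects a problem.
votes_1a = prepare_phase([Database(n) for n in names], updates)

# Situation of FIG. 1b: database 1024 detects a problem.
votes_1b = prepare_phase([Database(n, reject=(n == "1024")) for n in names], updates)
```

In this model a “ready” response leaves the update staged in pending while data is unchanged, which mirrors the idea of stopping just short of the actual write.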
A “rollback” response means that a database has recognized some problem in preparing itself to actually write its updated information. As a consequence, a “rollback” response essentially means that the new information cannot be written. Given that all new information of a distributed transaction must be written or no new information from a distributed transaction may be written, as shall become evident in more detail immediately below, the “ready” response of each database in FIG. 1a results in all of the new information being written to each database; while the single “rollback” response 11 in FIG. 1b results in no new information being written to any database. The situation of FIG. 1a therefore corresponds to a situation in which the distributed transaction “takes effect”; while the situation in FIG. 1b corresponds to the distributed transaction not being recognized as ever having been executed.
In FIGS. 1a and 1b, the responses of each of the databases 1021 through 1024 (e.g., “ready” responses 5 in FIG. 1a) are forwarded to the transaction manager by the resource managers 1031 through 1034. The reception of these responses by the transaction manager 104 marks the end of the first phase of the two-phase commit protocol.
The transaction manager's sending of a second set of messages in response to the received responses marks the beginning of the second phase. Because the transaction manager 104 receives all “ready” responses from the resource managers 1031 through 1034 in the situation of FIG. 1a, the transaction manager responds with the sending of a set of “commit” messages 6 to the resource managers 1031 through 1034. The resource managers 1031 through 1034 forward 7 the “commit” command to their respective databases 1021 through 1024 which, in turn, causes the prepared data to be actually written into each. The databases confirm that the data updates have been successfully written by sending a “committed” response 8 to their corresponding resource managers. The resource managers then forward 9 these messages to the transaction manager 104. The transaction manager 104 then responds to the original commit command 1 with a committed response 10. At this point all databases to be updated with new information are updated and the second phase of the two-phase commit protocol is complete.
In FIG. 1b, the reception of the rollback message from server 1014 by the transaction manager 104 causes the transaction manager 104 to send rollback messages 12 to each of the resource managers 1031 through 1034 and to inform the higher level software that the new data could not be committed 13. These rollback messages 12 are then effectively forwarded 14 to the databases 1021 through 1024, which, in turn, causes each of databases 1021 through 1023 to cancel its prepared write of new information. As such, no new information is written into any of the databases 1021 through 1024.
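The two outcomes of FIGS. 1a and 1b can be summarized in a short end-to-end sketch. It is an illustrative model under simplifying assumptions: each Participant (a hypothetical name) collapses one resource manager and its database into a single in-memory object, the network and the numbered messages are omitted, and the can_prepare flag simulates a database that detects a problem while preparing.

```python
class Participant:
    """Hypothetical stand-in for one resource manager plus its database."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # assumption: False simulates a detected problem
        self.committed = {}             # information actually written
        self.pending = None             # prepared but not yet written

    def prepare(self, update):
        if not self.can_prepare:
            return "rollback"
        self.pending = dict(update)
        return "ready"

    def commit(self):
        # Make the prepared write permanent.
        self.committed.update(self.pending)
        self.pending = None
        return "committed"

    def rollback(self):
        # Cancel any prepared write.
        self.pending = None


def two_phase_commit(participants, updates):
    # Phase 1: send "prepare" to every participant and collect the responses.
    votes = [p.prepare(updates[p.name]) for p in participants]

    # Phase 2: commit only if every response is "ready"; otherwise roll back all.
    if all(vote == "ready" for vote in votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled back"
```

With four participants that all prepare successfully, two_phase_commit writes every update, as in FIG. 1a. If the participant standing in for database 1024 is created with can_prepare=False, the function cancels the prepared writes of the other three and nothing is written anywhere, as in FIG. 1b.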