A distributed system is a multi-node system in which data is stored in various databases. A node can be any data processing system, such as a computer system. Although each database can only be accessed through one node, more than one database may be accessible through a single node in the distributed system. The nodes in a distributed system can be connected to one another through a network, such as a local area network (LAN) or a wide area network (WAN). In addition, the nodes in a distributed system may be in one location or spread out over multiple locations. Examples of distributed systems include database systems and mail server systems.
A transaction consists of a set of requests that results in a single logical action. Because a transaction can modify data on multiple databases in a distributed system, the distributed system must ensure that data consistency is maintained whether or not failures (e.g., power outages, hardware crashes, etc.) occur. Hence, each requested operation in a transaction must be “committed,” i.e., its changes to data must become persistent, before the transaction itself can be committed. A data change becomes persistent when a log record of the data change is “flushed,” i.e., written, to non-volatile storage (e.g., a disk drive). Log records allow a node to restore a database to its pre-failure state by replaying the operations that committed prior to the failure.
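The flush-and-replay behavior described above can be illustrated with a minimal sketch. It assumes a single node that keeps a key-value database and an append-only log file of JSON records; the names `flush_log_record`, `replay_log`, and `demo`, the record layout, and the sample keys are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def flush_log_record(log_path, record):
    """Append one log record and force it to non-volatile storage."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()             # move the record from Python's buffer to the OS
        os.fsync(log.fileno())  # ask the OS to write the record to the device


def replay_log(log_path):
    """Rebuild the database state by replaying logged changes in order."""
    state = {}
    if not os.path.exists(log_path):
        return state
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            state[record["key"]] = record["value"]
    return state


def demo():
    """Log two changes, then recover the state from the log alone."""
    log_path = os.path.join(tempfile.mkdtemp(), "node.log")
    flush_log_record(log_path, {"key": "balance:alice", "value": 90})
    flush_log_record(log_path, {"key": "balance:bob", "value": 110})
    # Simulated failure: in-memory state is lost and only the log survives.
    return replay_log(log_path)
```

In this sketch a change counts as persistent only after `os.fsync` returns, which is why the record is flushed before the operation would be reported as committed.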
Traditionally, distributed systems have utilized a two-phase commit (2PC) protocol to preserve consistency of data. In a 2PC system, a coordinator node for each transaction, i.e., the node where a client (e.g., an application) submitted the transaction, identifies, for each request in the transaction, a node in the distributed system responsible for handling the request. Each node assigned to handle a request in the transaction is referred to as a participant node.
Each participant node in a two-phase commit protocol votes whether to commit or abort the transaction and sends its vote to the coordinator node. The coordinator node then makes the final decision on whether to commit or abort the transaction based on the vote from each participant node. A transaction will only be committed by the coordinator node if all of the participant nodes vote to commit the transaction. Otherwise, the coordinator node will abort the transaction.
The two-phase commit protocol, however, is not message efficient. During phase one, the coordinator node sends a message to each participant node to prepare to commit the transaction. Each participant node then decides whether it can commit the requested operation(s) and sends a message back to the coordinator node with its vote on whether to commit or abort the transaction. In the second phase, the coordinator node decides whether to commit or abort the transaction based on all of the votes it received from the participant nodes and sends a message to each participant node to commit or abort the transaction. Every transaction therefore requires at least three messages per participant node.
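The message cost of the two phases described above can be made concrete with a small in-memory simulation. This is only a sketch: the `Participant` class, the `two_phase_commit` function, and the message counter are hypothetical names, and the count covers only the prepare, vote, and decision messages described in the text.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    name: str
    can_commit: bool = True
    outcome: str = "pending"

    def prepare(self):
        """Phase one: vote on whether the requested operation can commit."""
        return "commit" if self.can_commit else "abort"

    def finish(self, decision):
        """Phase two: apply the coordinator's final decision."""
        self.outcome = decision


def two_phase_commit(participants):
    """Run both phases and return (decision, number of messages sent)."""
    messages = 0
    votes = []
    # Phase one: the coordinator asks each participant to prepare.
    for p in participants:
        messages += 1  # coordinator -> participant: prepare
        votes.append(p.prepare())
        messages += 1  # participant -> coordinator: vote
    # The transaction commits only if every participant voted to commit.
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    # Phase two: the coordinator announces the decision to each participant.
    for p in participants:
        messages += 1  # coordinator -> participant: commit or abort
        p.finish(decision)
    return decision, messages
```

For example, `two_phase_commit([Participant("a"), Participant("b"), Participant("c")])` returns `("commit", 9)`, i.e., three messages per participant node, while a single participant created with `can_commit=False` causes the whole transaction to abort at the same message cost.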
Another commit protocol employed by distributed systems is the two-interval commit (2IC) protocol, discussed in U.S. Pat. No. 5,799,305, entitled “Method of Commitment in a Distributed Database Transaction,” which is hereby incorporated in its entirety for all purposes. The 2IC system uses interval messages that are sent in succession from an interval coordinator to determine whether to commit or abort a transaction. Although a 2IC system requires less messaging than a 2PC system, it is still more message-intensive than necessary.
Accordingly, there is a need for a distributed transaction commitment protocol that is more message efficient than current commitment protocols. The present invention addresses such a need.