The two-phase commit (2PC) protocol is a known technique for coordinating updates in a computer networking environment. FIG. 1A illustrates a system 100 for implementing the 2PC protocol, according to the prior art. A system that implements the 2PC protocol can include N nodes, each of which includes a representation of the same data. In the example in FIG. 1A, two nodes are represented as servers 102A, 102B. Each server 102A, 102B has access to a separate data store 106A, 106B, respectively, that includes a representation of the same data.
The data accessible by each server 102A, 102B is updated so that the data remains consistent, regardless of any node or network failures in the system 100. To achieve this result, the 2PC protocol implements a 2PC coordinator 104. As shown in FIG. 1A, the 2PC coordinator 104 is a separate entity from the servers 102A, 102B and is connected to the servers 102A, 102B with a network (not shown).
When implementing the 2PC protocol, the 2PC coordinator 104 receives an instruction to perform a transaction from one of the servers 102A, 102B. In response, the 2PC coordinator 104 initiates the “prepare” phase of the 2PC protocol. During the prepare phase, the 2PC coordinator 104 transmits a message to each server that includes the details of the transaction. When a server receives the “prepare” message, the server evaluates the proposed change and replies back to the 2PC coordinator 104 with a message that indicates whether the server agrees to perform the transaction. This reply message is referred to as a “vote.”
Once the 2PC coordinator 104 receives replies from all of the servers, the 2PC coordinator 104 decides whether to proceed with the transaction. If all servers voted “YES” (i.e., agreed to the transaction), then the 2PC coordinator 104 initiates the “commit” phase of the 2PC protocol. During the commit phase, the 2PC coordinator 104 transmits a message to each server requesting the server to modify the data according to the transaction. When a server receives the “commit” message, the server updates its copy of the data. According to the 2PC protocol, once the server receives the “commit” message, the server is required to perform the transaction (i.e., the server is not allowed to fail and the transaction is guaranteed).
If, however, one or more servers did not agree to the transaction and voted “NO” during the prepare phase, then the 2PC coordinator 104 initiates the “abort” phase of the 2PC protocol. During the abort phase, the 2PC coordinator 104 transmits an “abort” message to each server that voted “YES” to the transaction. Servers that voted “NO” to the transaction during the “prepare” phase do not receive the “abort” message. When a server receives the “abort” message, the server discards whatever temporary information the server stored during the “prepare” phase.
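By way of illustration only, the prepare, commit, and abort phases described above may be sketched as follows. The sketch is a simplified, single-process model and is not taken from the system 100: the names `Server`, `will_agree`, `pending`, and `run_two_phase_commit` are hypothetical, and network messages are modeled as direct method calls.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    """Hypothetical participant holding one copy of the shared data."""
    name: str
    data: dict = field(default_factory=dict)
    will_agree: bool = True          # stands in for the server's evaluation of the change
    pending: Optional[dict] = None   # temporary information stored during "prepare"

    def prepare(self, txn: dict) -> str:
        """Handle a "prepare" message and return the vote."""
        if not self.will_agree:
            return "NO"
        self.pending = dict(txn)
        return "YES"

    def commit(self) -> None:
        """Handle a "commit" message: apply the prepared change."""
        self.data.update(self.pending)
        self.pending = None

    def abort(self) -> None:
        """Handle an "abort" message: discard the temporary information."""
        self.pending = None


def run_two_phase_commit(servers: list, txn: dict) -> str:
    """Hypothetical coordinator logic for a single transaction."""
    # Prepare phase: send the transaction details and collect one vote per server.
    votes = {s.name: s.prepare(txn) for s in servers}

    # Commit phase: entered only if every server voted "YES".
    if all(v == "YES" for v in votes.values()):
        for s in servers:
            s.commit()
        return "COMMITTED"

    # Abort phase: only servers that voted "YES" receive the "abort" message.
    for s in servers:
        if votes[s.name] == "YES":
            s.abort()
    return "ABORTED"
```

For example, calling `run_two_phase_commit([Server("102A"), Server("102B")], {"x": 1})` in this model leaves both copies of the data equal to `{"x": 1}`, whereas constructing one server with `will_agree=False` leaves both copies unchanged.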
However, there are several problems associated with the 2PC protocol. For example, it is a known problem that if the 2PC coordinator 104 crashes or is otherwise unavailable after the “prepare” phase has started, then the transaction will never be committed or rolled back. It is customary for servers to lock the shared data in the “prepare” phase and release the locks on the data during the “commit” or “abort” phase. These locks are used to guarantee that the data is not changed after the “prepare” phase has started in such a way that the “commit” phase may fail. If the 2PC coordinator never completes the transaction, these locks on the data may never be released, essentially blocking any further updates to the shared data.
FIG. 1B is a timing diagram illustrating one problem with the two-phase commit protocol, according to the prior art. As shown, time t increases from left to right. At time t1, a server receives a “prepare” message from the 2PC coordinator. At time t2, the server locks the data 110 associated with the “prepare” message. At time t3, the server sends a “YES” vote to the 2PC coordinator indicating that the server is able to complete the transaction. At this point, the server is waiting to receive either a “commit” message or an “abort” message from the 2PC coordinator. However, assume that at time t4, another server crashes and never returns a vote to the 2PC coordinator. The server that voted “YES” continues to wait indefinitely for a message from the 2PC coordinator, and the data 110 remains locked indefinitely.
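The blocking behavior of FIG. 1B may be illustrated, under simplifying assumptions, by the following sketch of a single server. The class `Participant` and its methods are hypothetical names introduced for illustration, and a `threading.Lock` stands in for whatever locking mechanism a given server uses on the data 110.

```python
import threading


class Participant:
    """Hypothetical server that locks its data during the "prepare" phase."""

    def __init__(self):
        self.data = {"balance": 100}
        self.lock = threading.Lock()   # stands in for the lock on the data
        self.pending = None

    def on_prepare(self, txn):
        # Time t2: lock the data. Time t3: vote "YES".
        if not self.lock.acquire(blocking=False):
            return "NO"   # data already locked by another transaction
        self.pending = dict(txn)
        return "YES"

    def on_commit(self):
        # Apply the prepared change, then release the lock.
        self.data.update(self.pending)
        self.pending = None
        self.lock.release()

    def on_abort(self):
        # Discard the temporary information, then release the lock.
        self.pending = None
        self.lock.release()

    def try_update(self, key, value):
        """An unrelated update; it is refused while the data is locked."""
        if not self.lock.acquire(blocking=False):
            return False
        try:
            self.data[key] = value
            return True
        finally:
            self.lock.release()


server = Participant()
vote = server.on_prepare({"balance": 90})        # times t1 through t3
# No "commit" or "abort" message ever arrives, so the lock is never released.
update_accepted = server.try_update("balance", 50)
```

In this model, `update_accepted` is `False` for as long as neither `on_commit` nor `on_abort` is called, which corresponds to the data 110 remaining locked after time t4.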
To overcome these drawbacks, some prior techniques implement a timeout procedure to release locks on data. If one of the servers does not reply with a vote in response to the “prepare” message within the timeout period, the 2PC coordinator presumes the server has failed and aborts the transaction. However, this technique works only if the 2PC coordinator is still available. If a server needs to abort a transaction without the 2PC coordinator, the server could implement its own timeout procedure, but this could lead to severe data inconsistencies between servers, because one server may time out and discard a transaction that other servers go on to commit.
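A coordinator-side timeout of the kind described above may be sketched as follows. This is an illustrative assumption rather than a description of any particular prior system: the function `collect_votes` and its parameters are hypothetical, and votes are assumed to arrive on a queue as `(server_name, vote)` pairs.

```python
import queue
import time


def collect_votes(vote_queue, expected, timeout_s):
    """Return "COMMIT" only if every expected server votes "YES" in time.

    vote_queue: queue.Queue of (server_name, vote) pairs (assumed transport).
    expected:   set of server names that received the "prepare" message.
    timeout_s:  timeout period, in seconds, for the whole prepare phase.
    """
    deadline = time.monotonic() + timeout_s
    votes = {}
    while len(votes) < len(expected):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return "ABORT"            # timeout period has elapsed
        try:
            name, vote = vote_queue.get(timeout=remaining)
        except queue.Empty:
            return "ABORT"            # a silent server is presumed to have failed
        if name in expected:
            votes[name] = vote
    return "COMMIT" if all(v == "YES" for v in votes.values()) else "ABORT"
```

The sketch runs entirely within the coordinator, so it offers no relief when the coordinator itself has crashed, which is the limitation noted above.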
Accordingly, there remains a need for a technique for breaking locks held by two-phase commit transactions while preserving data consistency that overcomes the drawbacks discussed above.