With the rapid development of information technologies and the wide application of the Internet, the volume of data generated by people is growing explosively, which imposes higher requirements on the extensibility of data storage. Compared with a conventional storage array system, a distributed storage system has better extensibility and better compatibility with common hardware devices, and can therefore better meet future requirements for data storage.
In a distributed storage system, a large quantity of storage nodes are generally organized to form a distributed system, and data reliability is ensured by means of data replication and backup between different nodes, so that data has replicas on different storage nodes. How to ensure data consistency among multiple data replicas has long been a problem confronting distributed storage systems. While data consistency is ensured, system performance and availability are also becoming increasingly important considerations.
FIG. 1 shows the existing two-phase commit (2PC) protocol, which is a typical centralized strong-consistency replica control protocol and is used in many distributed database systems to ensure replica consistency.
In the two-phase commit protocol, a system generally includes two types of nodes: a coordinator and a participant. The coordinator is responsible for initiating voting on a data update and notifying the participants of the voting decision, and each participant participates in the voting on the data update and executes the voting decision.
The two-phase commit protocol includes two phases. Phase 1 is a commit-request phase, where the coordinator instructs each participant to vote on a data modification, and the participant notifies the coordinator of its own voting result, Yes or No. Phase 2 is a commit phase, where the coordinator makes a decision, Commit or Abort, according to the voting results in the first phase.
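The two phases can be illustrated with a minimal Python sketch. It is not taken from any particular database system: the class `Participant`, its `can_commit` flag, and the function `two_phase_commit` are hypothetical names introduced here, and the decision rule (Commit only when every participant votes Yes) is the conventional 2PC rule, assumed for illustration.

```python
from enum import Enum


class Vote(Enum):
    YES = "Yes"
    NO = "No"


class Decision(Enum):
    COMMIT = "Commit"
    ABORT = "Abort"


class Participant:
    """Hypothetical participant holding one replica of a single value."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: models a local Yes/No outcome
        self.value = None             # committed replica value
        self.pending = None           # staged value awaiting the decision

    def vote(self, new_value):
        # Phase 1: stage the update locally and report the voting result.
        if self.can_commit:
            self.pending = new_value
            return Vote.YES
        return Vote.NO

    def apply(self, decision):
        # Phase 2: execute the coordinator's decision.
        if decision is Decision.COMMIT:
            self.value = self.pending
        self.pending = None  # on Abort the staged value is discarded


def two_phase_commit(participants, new_value):
    """Coordinator logic: collect votes, then notify the decision."""
    votes = [p.vote(new_value) for p in participants]
    # Assumed rule: Commit only if every participant voted Yes.
    if all(v is Vote.YES for v in votes):
        decision = Decision.COMMIT
    else:
        decision = Decision.ABORT
    for p in participants:
        p.apply(decision)
    return decision
```

In this sketch, a single No vote causes every participant to discard its staged value, so all replicas stay consistent with one another.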
Successfully executing the two-phase commit protocol once requires at least two rounds of interaction between the coordinator and each participant, with four messages per participant, and this large number of interactions degrades performance. In addition, in the two-phase commit protocol, if a node becomes faulty or remains unresponsive, another input/output (“IO”) request is blocked and finally fails due to a timeout, and a data rollback needs to be performed. Thus, the two-phase commit protocol has relatively low fault tolerance and availability.
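The message cost and the effect of an unresponsive node can be made concrete with a small Python model. This is only a sketch under stated assumptions: the function `run_round` is hypothetical, the four messages per participant are assumed to be a vote request, a vote reply, a decision notice, and an acknowledgement, and a missing reply is treated as a timeout that forces an Abort.

```python
def run_round(responsive):
    """Model one 2PC execution.

    responsive: list of booleans, one per participant; False means the
    node never replies (hypothetical fault model).
    Returns (decision, messages_exchanged).
    """
    messages = 0
    votes = []

    # Round 1: vote request from the coordinator, vote reply from each node.
    for ok in responsive:
        messages += 1              # vote request
        if ok:
            messages += 1          # vote reply
            votes.append("Yes")
        else:
            votes.append(None)     # no reply: treated as a timeout

    # Assumed rule: any missing or negative vote leads to Abort.
    decision = "Commit" if all(v == "Yes" for v in votes) else "Abort"

    # Round 2: decision notice from the coordinator, acknowledgement back.
    for ok in responsive:
        messages += 1              # decision notice
        if ok:
            messages += 1          # acknowledgement (assumed fourth message)

    return decision, messages
```

Under this model, three responsive participants cost twelve messages for one commit, while a single silent participant turns the whole update into an Abort even though the other two voted Yes.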