With the development of information technologies, the Internet can provide a large amount of diversified information for users. For network service providers, distributed systems are often required to process large amounts of data, because conventional processing devices, such as a single server, processor, or database, are incapable of satisfying necessary computational requirements for data processing.
Generally, multiple processing devices included in a distributed system may globally perform multiple types of management operations on various data resources in the distributed system, such as scheduling, distributed processing, cooperative processing, and remote control. Each processing device in the distributed system may be regarded as a distributed node (or “node”). In this way, data can be distributed to each node in the system for processing to improve data processing efficiency and increase data throughput.
In the distributed system, it is necessary to uniformly store some special types of information in a node to further improve efficiency of scheduling data resources in different nodes. For example, metadata describing data resources can include a storage location of data, an update state, a search keyword, and other information consistent with the present disclosure.
To provide continuous and reliable read and write service with unreliable but often cost-effective components, data replication is widely used. To prevent corruption by multiple modifiers of the same data, a single leader is often elected from the corresponding replicas. This single leader acts as the representative of the corresponding replicas and performs all reads and writes on their behalf. To prevent long interruptions of the service, the leader holds its role only for a limited tenure. To provide smooth and continuous service, the leader renews its tenure by obtaining permission from a majority of the corresponding replicas before its current tenure expires. If a majority of the replicas are active, the leader will always extend its tenure successfully unless the leader encounters an error (such as a network error, a software error, or a hardware error) or receives an abdication command from a human or non-human administrator.
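The tenure renewal described above can be illustrated with a minimal sketch. The names `Replica`, `grant_renewal`, and `try_renew` are hypothetical and do not come from any particular system. The sketch also assumes that the replica list includes the leader's own replica and that an active replica always grants permission.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    """Hypothetical replica; `active` models whether it can be reached."""
    name: str
    active: bool = True

    def grant_renewal(self, leader: str, now: float) -> bool:
        # Assumption: an active replica always grants the renewal request.
        return self.active


def try_renew(leader: str, replicas: List[Replica], tenure_end: float,
              now: float, tenure_length: float) -> float:
    """Return the new tenure end if a majority grants, else the old one."""
    if now >= tenure_end:
        # The tenure has already expired, so it is too late to renew.
        return tenure_end
    grants = sum(1 for r in replicas if r.grant_renewal(leader, now))
    if grants > len(replicas) // 2:
        # Strict majority reached: the tenure restarts from the current time.
        return now + tenure_length
    return tenure_end
```

Under these assumptions, a leader with five replicas of which three are active extends its tenure, while a leader that can reach only two of the five keeps its old expiration time and loses leadership when that time passes.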
Automatically electing a leader from a group of members without any human intervention is not easy, because no member is absolutely reliable and no member can act as an election coordinator. Moreover, when a member cannot communicate with the leader, the leader may still be working correctly, so a member's failure to reach the leader does not by itself indicate that the leader has failed.
Previous solutions for electing a leader may operate as follows. When a member finds that the tenure of the current leader has expired and the member has received no declaration from a new leader, the member launches a new round of leader election by sending an election message to all other members. If the other members receive the election message and also find that the tenure of the current leader has expired, the other members vote for the initiator, and the initiator becomes the new leader. But if two or more members initiate an election simultaneously, the election will often fail. In this scenario, it cannot be determined in advance which member will become the new leader, and there is no way to set different election priorities among members.
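The failure mode of simultaneous initiators can be illustrated with a small simulation. This is a sketch under assumptions that the description above does not state: each member casts one vote per round, for the initiator whose election message it receives first, and an initiator needs a strict majority of all members to win. The function name `run_election` and the `first_heard` mapping are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def run_election(members: List[str],
                 first_heard: Dict[str, str]) -> Optional[str]:
    """Simulate one election round.

    `first_heard` maps each member to the initiator whose election
    message reached it first. An initiator maps to itself.
    Returns the winner, or None if no initiator reaches a majority.
    """
    votes: Counter = Counter()
    for member in members:
        candidate = first_heard.get(member)
        if candidate is not None:
            votes[candidate] += 1  # assumption: one vote per member per round
    majority = len(members) // 2 + 1
    for candidate, count in votes.items():
        if count >= majority:
            return candidate
    return None  # split vote: the round fails and must be repeated


members = ["A", "B", "C", "D"]

# A single initiator collects every vote and becomes the leader.
single = run_election(members, {m: "A" for m in members})

# Two simultaneous initiators split the votes two to two, so neither wins.
split = run_election(members, {"A": "A", "C": "A", "B": "B", "D": "B"})
```

Here `single` is `"A"` and `split` is `None`. Which initiator a given member hears first depends only on message timing, which is why the outcome is unpredictable and cannot reflect any priority among members.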