Replication is a popular technique adopted in distributed systems to improve system reliability (i.e., availability and durability). One technique for building replicated services is state machine replication, in which a deterministic service is replicated on multiple nodes. This replication in space ensures that the failure of a subset of the nodes on which the service is replicated does not render the service inaccessible.
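By way of illustration only, the following minimal Python sketch shows the idea behind state machine replication: because the service is deterministic, replicas that apply the same commands in the same order reach the same state, so the service's state remains available on the surviving replicas if one is lost. The `CounterReplica` class, the `replicate` helper, and the command format are hypothetical names introduced for this example and are not part of the described system.

```python
from dataclasses import dataclass, field


@dataclass
class CounterReplica:
    """Hypothetical deterministic service: a store of named integer counters."""
    state: dict = field(default_factory=dict)

    def apply(self, command):
        # A command is an (operation, key, amount) tuple. The result depends
        # only on the current state and the command, so it is deterministic.
        op, key, amount = command
        if op == "add":
            self.state[key] = self.state.get(key, 0) + amount
        elif op == "set":
            self.state[key] = amount
        else:
            raise ValueError(f"unknown operation: {op}")


def replicate(commands, replica_count):
    """Apply the same ordered command log to every replica."""
    replicas = [CounterReplica() for _ in range(replica_count)]
    for command in commands:
        for replica in replicas:
            replica.apply(command)
    return replicas


log = [("set", "x", 5), ("add", "x", 3), ("add", "y", 1)]
replicas = replicate(log, replica_count=3)

# Simulate the failure of one node: the remaining replicas hold the full state.
survivors = replicas[1:]
```

The sketch assumes that every replica receives every command in the same order; agreeing on that order in the presence of failures is the job of a consensus protocol.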
When a state machine is replicated, the distributed system must ensure the consistency of the replicas with respect to state updates. One approach is to use a consensus protocol to ensure that replicas are mutually consistent. Examples of consensus protocols include two-phase commit, Paxos, and the Chandra-Toueg algorithm.
Different consensus protocols have different scalability and availability properties. However, in all of the protocols, reaching a consensus becomes more difficult as the number of replicas increases. One reason is that, as the number of replicas increases, so does the likelihood that a failure will affect one or more replicas while a consensus is being formed. Such a failure can then prevent a consensus decision from being reached.
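The growth in failure likelihood can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each replica fails independently with the same probability p during the window in which a consensus is formed; under that assumption the probability that at least one of n replicas fails is 1 - (1 - p)^n. The function name and the independence assumption are introduced here and are not drawn from any particular protocol.

```python
def prob_any_failure(replica_count, per_replica_failure_prob):
    """Probability that at least one replica fails during a consensus round.

    Assumes independent failures with an identical per-replica probability,
    which is a simplification made for this example.
    """
    if replica_count < 0:
        raise ValueError("replica_count must be non-negative")
    if not 0.0 <= per_replica_failure_prob <= 1.0:
        raise ValueError("per_replica_failure_prob must be in [0, 1]")
    return 1.0 - (1.0 - per_replica_failure_prob) ** replica_count


# Compare group sizes under an assumed 1% per-replica failure probability.
for n in (3, 10, 100):
    print(n, round(prob_any_failure(n, 0.01), 4))
```

Whether a single failure actually blocks a decision depends on the protocol: two-phase commit can block on the failure of one participant, whereas Paxos continues to make progress while a majority of replicas remains reachable.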
While the system and components thereof are susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the system to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the appended claims.