Byzantine fault tolerance (BFT) refers to the ability of a computing system to endure arbitrary (i.e., Byzantine) failures that would otherwise prevent the system's components from reaching consensus on decisions critical to the system's operation. In the context of state machine replication (i.e., a scenario where a system provides a replicated service whose operations and state are mirrored across multiple nodes, known as replicas), BFT protocols are used to ensure that non-faulty replicas are able to agree on a common order of execution for client-initiated service operations. This, in turn, ensures that the non-faulty replicas will run in an identical and thus consistent manner.
One well-known BFT protocol that is used in the state machine replication context is Practical BFT (PBFT) (see Castro et al., “Practical Byzantine Fault Tolerance,” available at http://pmg.csail.mit.edu/papers/osdi99.pdf, incorporated herein by reference for all purposes). Generally speaking, PBFT and its variants operate according to a sequence of “views,” which can be understood as phases in the protocol's determination of a single consensus decision. In each view, one replica, referred to as a proposer, sends a proposal for a decision value (e.g., operation sequence number) to the other replicas and attempts to get 2f+1 replicas to agree upon the proposal, where f is the maximum number of replicas that may be faulty. If this succeeds, the proposal becomes a consensus decision (i.e., a decision that is deemed to be agreed upon by a consensus of the replicas). However, if this does not succeed (due to, e.g., a proposer failure), the replicas enter a “view-change” procedure that causes a new, subsequent view to be entered/initiated. In the subsequent view, a new proposer is selected and the new proposer transmits a new proposal comprising votes received from replicas in the prior view, and the process above is repeated until a consensus decision is reached.
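The view-by-view flow described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not the PBFT protocol itself: the function names are hypothetical, the total number of replicas n is assumed to satisfy n >= 3f+1, proposers are assumed to rotate round-robin by view number, a faulty proposer is assumed to always fail to gather a quorum, and every non-faulty replica is assumed to vote for a non-faulty proposer's proposal.

```python
def max_faulty(n: int) -> int:
    """Largest f with n >= 3f + 1 (assumed resilience bound, not stated in the text)."""
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3


def quorum_size(f: int) -> int:
    """Number of agreeing replicas required for a proposal to become a decision."""
    return 2 * f + 1


def run_views(n: int, faulty: set[int]) -> tuple[int, int]:
    """Return (view, proposer) for the first view that reaches a consensus decision.

    Toy model: a view led by a faulty proposer fails and triggers a view-change;
    a view led by a non-faulty proposer collects one vote per non-faulty replica.
    """
    f = max_faulty(n)
    if len(faulty) > f:
        raise ValueError("more faulty replicas than the system tolerates")
    for view in range(n):
        proposer = view % n  # assumed round-robin proposer selection
        if proposer in faulty:
            continue  # proposer failure: view-change to the next view
        votes = n - len(faulty)  # all non-faulty replicas agree
        if votes >= quorum_size(f):
            return view, proposer
    raise RuntimeError("no view reached a decision")


if __name__ == "__main__":
    # n = 4 tolerates f = 1; replica 0 is faulty, so view 0 fails and view 1 decides.
    print(run_views(4, {0}))
    # n = 10 tolerates f = 3; three consecutive faulty proposers force three view-changes.
    print(run_views(10, {0, 1, 2}))
```

In this model, a run of k faulty proposers at the start of the rotation costs exactly k view-changes before a decision is reached.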
Unfortunately, the transmission of the new proposal incurs a relatively high communication bit complexity of n^3, where n corresponds to the total number of replicas. In addition, the view-change procedure can recur O(n) times due to a cascade of up to f proposer failures. Thus, in conventional PBFT, the total number of bits that may need to be transmitted as part of one or more view-changes before a single consensus decision is reached is O(n^4), which can pose significant scalability issues for even moderate system sizes (e.g., n=100).
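The growth rate described above can be made concrete with a rough calculation. The sketch below is illustrative only: the function names are hypothetical, constant factors (such as message and signature sizes) are ignored so the results are abstract cost units rather than exact bit counts, and f is assumed to be the largest value satisfying n >= 3f+1.

```python
def view_change_cost(n: int) -> int:
    """Cost of one new-proposal transmission, taken as n^3 abstract units."""
    return n ** 3


def worst_case_cost(n: int) -> int:
    """Cost of a cascade of f proposer failures, each incurring one view-change."""
    f = (n - 1) // 3  # assumed resilience bound n >= 3f + 1
    return f * view_change_cost(n)


if __name__ == "__main__":
    for n in (4, 10, 100, 200):
        print(f"n={n:>3}  one view-change={view_change_cost(n):>12,}  "
              f"worst case={worst_case_cost(n):>14,}")
```

Under these assumptions, n=100 gives f=33 and a worst case of 33,000,000 units, and doubling n to 200 multiplies the worst case by 16, which is consistent with O(n^4) growth.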