Building fault-tolerant distributed systems may require saving system state consistently across geographically diverse locations. For example, by providing database replicas of the system state in different geographic locations, such as the East Coast and West Coast of the United States, a failure of one system in a particular geographic region may not affect the entire system.
These distributed systems may require a consistency mechanism to maintain the integrity of the data generated by the system. In one example, these systems may operate using the Paxos algorithm. Paxos is a consensus algorithm which may be executed by a set of servers to enable them to agree on a single value in the presence of failures. It may guarantee synchronous replication of submitted values among a majority of the servers. Paxos may require consensus at every update or write operation. During such consensus, a master server needs to communicate with all other peer servers and obtain acknowledgements from a majority of them. Accordingly, the master server initiates roundtrip communications with all peers at every requested operation and waits for the responses before proceeding. However, by doing so, consensus may have to come from more than one geographical region before a transaction can be committed. This can significantly increase the cost to the network, increase the latency of the system, and reduce the throughput of the system (the rate at which success or failure is returned for requested operations).
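The effect of waiting for a majority can be illustrated with a small sketch. It models only a single round of acknowledgements, not the full Paxos protocol, and assumes that the master counts as one vote for its own proposal and that requests to all peers are sent in parallel. The function name and the example round trip times are hypothetical.

```python
def majority_commit_latency_ms(peer_rtts_ms):
    """Time until the master holds a majority of acknowledgements.

    peer_rtts_ms: round trip time from the master to each peer, in ms.
    Assumption: the master's own vote counts toward the majority.
    """
    cluster_size = len(peer_rtts_ms) + 1      # peers plus the master
    majority = cluster_size // 2 + 1          # smallest strict majority
    acks_needed = majority - 1                # master's own vote is free
    if acks_needed == 0:
        return 0.0
    # Requests go out in parallel, so the commit point is the arrival
    # time of the acks_needed-th fastest response.
    return float(sorted(peer_rtts_ms)[acks_needed - 1])


# Hypothetical 5-server deployment: master on the East Coast with one
# nearby peer (2 ms) and three West Coast peers (100 ms each).
latency = majority_commit_latency_ms([2, 100, 100, 100])
```

In this hypothetical layout, a majority of three servers cannot be formed within one region, so every commit pays the cross-region round trip even though one peer responds within a few milliseconds.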
For example, based on typical network statistics, the round trip time (RTT) between different geographical regions, such as the East Coast and West Coast of the United States, may frequently go as high as 100 milliseconds. If each transaction must wait for one such round trip before the next can be committed, this may limit the throughput of such systems to approximately 10 transactions per second (1 second divided by 100 milliseconds), which may be inadequate for any system which may need to process (both read and write) tens of thousands or even millions of events per second.
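The arithmetic behind this limit can be sketched as follows, assuming transactions are committed strictly one after another and each commit waits for a fixed number of full round trips. The function name, the `round_trips_per_commit` parameter, and the 2 millisecond same-region figure are illustrative assumptions.

```python
def serial_throughput_tps(rtt_ms, round_trips_per_commit=1):
    """Upper bound on transactions per second for strictly serial commits."""
    if rtt_ms <= 0 or round_trips_per_commit < 1:
        raise ValueError("rtt_ms must be positive and round trips at least 1")
    # One second holds 1000 ms; each commit occupies the link for
    # rtt_ms * round_trips_per_commit milliseconds.
    return 1000.0 / (rtt_ms * round_trips_per_commit)


cross_region = serial_throughput_tps(100)   # 100 ms RTT from the text
same_region = serial_throughput_tps(2)      # hypothetical 2 ms local RTT
```

Under these assumptions the cross-region case yields 10 transactions per second, several orders of magnitude below a workload of tens of thousands of events per second.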
One solution in a distributed system is to generate batched transactions. However, in most client-server models, decisions for batching and the responsibility of maintaining consistency within a batch lie with the client rather than the server. In this example, the server may simply support a batch of transactions, but the client has to specify all the details of a batch.
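A minimal sketch of such client-driven batching is shown below. All class and method names are hypothetical, and the consistency rule used here (at most one write per key within a batch, last write wins) is only one assumed example of what a client might have to enforce.

```python
from dataclasses import dataclass, field


@dataclass
class ClientBatcher:
    """Hypothetical client-side batcher: the client decides batch contents."""
    max_size: int
    pending: list = field(default_factory=list)

    def add(self, key, value):
        """Queue a write; return a full batch to send, or None if still filling."""
        # The client, not the server, keeps the batch self-consistent.
        # Assumed rule: at most one write per key (last write wins).
        self.pending = [(k, v) for k, v in self.pending if k != key]
        self.pending.append((key, value))
        if len(self.pending) >= self.max_size:
            return self.flush()
        return None

    def flush(self):
        """Hand back everything queued so far and start a new batch."""
        batch, self.pending = self.pending, []
        return batch


class Server:
    """Hypothetical server: applies whatever batch it is given in one round."""

    def __init__(self):
        self.state = {}
        self.consensus_rounds = 0

    def commit_batch(self, batch):
        self.consensus_rounds += 1      # one consensus round per batch
        for key, value in batch:
            self.state[key] = value


server = Server()
batcher = ClientBatcher(max_size=3)
for i in range(7):
    batch = batcher.add(f"k{i}", i)
    if batch is not None:
        server.commit_batch(batch)
leftover = batcher.flush()              # client must remember to send the tail
if leftover:
    server.commit_batch(leftover)
```

In this sketch seven writes are committed in three consensus rounds instead of seven, but the batch size, the flush timing, and the per-key consistency rule all sit in client code, which is the burden the paragraph above describes.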