Modern computer systems typically include a CPU to process data, a networking interface to communicate with other computer systems, and one or more durable storage units. A system may stop processing, for example, due to a power failure, a software error, or a hardware fault. Such failures are often called process failures. The durable storage units keep the data intact while the fault is repaired.
A set of these computer systems can be networked to form a cluster. Although the network is generally reliable, occasional faults may disrupt communication between certain nodes or sets of nodes. Such a disruption in communication is often called a network partition.
Each of these nodes runs a transactional storage system that both reads and writes data (i.e., a database management system). Some of this data is accessed concurrently by applications running on different nodes. To guarantee data consistency, database replication techniques are used to regulate access to that data. However, such conventional replication techniques involve a number of tradeoffs and problems.
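One common family of replication techniques used to regulate concurrent access is quorum-based replication, in which a write must reach W of N replicas and a read must consult R replicas, with R + W > N so that every read quorum overlaps every write quorum. The sketch below is illustrative only; the replica layout and all names (`quorum_write`, `quorum_read`, `stores`) are assumptions for this example, not details from this document.

```python
# Hedged sketch of majority-quorum replication: R + W > N guarantees that
# any read quorum overlaps any write quorum, so a reader always sees the
# latest versioned value. Names and layout are illustrative assumptions.

N = 3   # total replicas
W = 2   # write quorum size
R = 2   # read quorum size; R + W > N, so quorums must overlap

# Each replica stores key -> (version, value).
stores = [dict() for _ in range(N)]

def quorum_write(key, value, version):
    # A real system would use any W reachable replicas; for illustration,
    # writes go to the first W replicas.
    for store in stores[:W]:
        store[key] = (version, value)

def quorum_read(key):
    # For illustration, reads consult the last R replicas, which are
    # guaranteed to overlap the write quorum at one replica.
    candidates = [s[key] for s in stores[-R:] if key in s]
    return max(candidates)[1] if candidates else None

quorum_write("x", "old", 1)
quorum_write("x", "new", 2)
value = quorum_read("x")   # the overlapping replica supplies version 2
```

Because the read quorum intersects the write quorum at one replica, the reader recovers the latest value even though one replica never saw the write.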
For example, traditional replication systems exhibit a tradeoff between data consistency and fault tolerance: systems that provide high data consistency tend to exhibit low fault tolerance, and systems that provide high fault tolerance tend to exhibit low data consistency. In addition, theoretical transactional fault-tolerant replication systems typically require significant changes to existing database management systems.
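This tradeoff can be illustrated with a minimal sketch, assuming a synchronous scheme that waits for every replica (high consistency, but one failed replica blocks all writes) versus an asynchronous scheme that writes best-effort (it tolerates failures, but replicas may diverge). All names here (`Replica`, `sync_write`, `async_write`) are hypothetical and not taken from any particular system.

```python
# Minimal sketch of the consistency / fault-tolerance tradeoff.
# All class and function names are illustrative assumptions.

class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True      # simulated process-failure flag
        self.data = {}

    def write(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value

def sync_write(replicas, key, value):
    """Synchronous replication: every replica must acknowledge the write.
    High consistency, but a single failed replica blocks the whole write."""
    for r in replicas:
        r.write(key, value)     # raises if any replica is down

def async_write(replicas, key, value):
    """Asynchronous replication: best-effort fan-out. The write succeeds
    even if some replicas miss it, so surviving replicas may disagree --
    lower consistency, higher fault tolerance."""
    acked = 0
    for r in replicas:
        try:
            r.write(key, value)
            acked += 1
        except ConnectionError:
            pass                # tolerate the failure, lose consistency
    return acked

replicas = [Replica("A"), Replica("B"), Replica("C")]
replicas[2].up = False          # simulate a process failure on C

acked = async_write(replicas, "x", 1)   # proceeds despite the failure
try:
    sync_write(replicas, "x", 2)        # blocked by the failed replica
    consistent = True
except ConnectionError:
    consistent = False
```

After the failure, the asynchronous path acknowledges the write with only two replicas (leaving C stale), while the synchronous path refuses to complete at all: each scheme sacrifices one property to keep the other.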
What is needed, therefore, are database replication techniques that provide both high data consistency and high fault tolerance, that are flexible enough to be applied to existing and new database systems and applications, and that are configurable to achieve various data consistency levels with different performance and fault tolerance characteristics.