Many Internet and cloud computing applications have scaling requirements for transactional workloads that exceed the capabilities of conventional enterprise databases. Various methods of structuring a database may be used to meet these scaling requirements. One conventional method is sharding on a cluster of commodity servers, in which each node in the cluster is responsible for a shard (or part) of the data and runs its own independent instance of the database software. Other partitioned database architectures have emerged that automate sharding and load balancing across nodes, reducing the manual effort that sharding otherwise requires. These architectures typically use key-based hash partitioning or range partitioning to assign data to nodes in the cluster.
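By way of illustration only, the two key-based assignment schemes mentioned above may be sketched as follows. The function names `hash_partition` and `range_partition`, the use of SHA-256, and the split points are assumptions made for this example and are not taken from any particular database system.

```python
import bisect
import hashlib
from typing import Sequence


def hash_partition(key: str, num_nodes: int) -> int:
    """Return a node index in [0, num_nodes) derived from a hash of the key.

    A stable hash (SHA-256 here, an illustrative choice) is used so that the
    same key maps to the same node across processes and restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def range_partition(key: str, split_points: Sequence[str]) -> int:
    """Return a node index based on where the key falls among sorted split points.

    With N split points there are N + 1 ranges. A key equal to a split point
    is assigned to the range that begins at that split point.
    """
    return bisect.bisect_right(split_points, key)


if __name__ == "__main__":
    splits = ["g", "n", "t"]  # hypothetical boundaries for a four-node cluster
    for k in ["apple", "monkey", "zebra"]:
        print(k, hash_partition(k, 4), range_partition(k, splits))
```

Hash partitioning tends to spread keys evenly across nodes but scatters adjacent keys, whereas range partitioning keeps adjacent keys on the same node, which may favor range scans at the risk of uneven load.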
In addition to aggressive scaling requirements, many Internet and cloud computing applications also need to be continuously available. However, on a large cluster of commodity servers with hundreds or thousands of nodes, failures are inevitable. Consequently, replication protocols may be used to provide high availability and fault tolerance in the database. Such protocols tend to involve tradeoffs between consistency and availability.