There is an increasing need for large-scale storage systems that maintain large amounts of data. Such systems must be highly reliable and available. Traditionally, fault tolerance mechanisms have been the solution for building reliable storage systems. A system is guaranteed to operate correctly as long as the assumptions required by those mechanisms are satisfied. These assumptions often include a bound on the number of failures in the system. The system fails when those assumptions are invalidated, for example when it experiences excessive and correlated failures.
Reliable distributed systems are typically designed to be fault tolerant. Fault tolerance mechanisms ensure system correctness, but only with respect to a system model that specifies the type and extent of failures. Most of the time, the system is in a normal state, in which either no component is faulty or the failures of a few components are tolerated. However, systems may suffer excessive failures that go beyond what the system model allows. In these cases, the system enters an abnormal state in which fault tolerance mechanisms are unable to mask failures, causing the supposedly reliable system to fail.
Thus, a typical reliable distributed system is available in the normal state, but becomes completely unavailable once it transitions to an abnormal state. For example, a system that adopts the replicated state machine approach tolerates the failure of a minority of machines, but cannot tolerate a majority of machines becoming unavailable.
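The minority/majority boundary in the replicated state machine example can be made concrete with a minimal sketch. It assumes a simple majority-quorum rule (the system stays available only while strictly more than half of the replicas survive); the function names are hypothetical and chosen for illustration only.

```python
def max_tolerated_failures(n: int) -> int:
    """Largest number of failed replicas that still leaves a strict majority of n."""
    return (n - 1) // 2


def is_available(n: int, failed: int) -> bool:
    """True if the surviving replicas form a strict majority of the n replicas.

    Assumption: availability requires a majority quorum, as in the
    replicated state machine example in the text.
    """
    return (n - failed) > n // 2


if __name__ == "__main__":
    n = 5
    for failed in range(n + 1):
        state = "normal" if is_available(n, failed) else "abnormal"
        print(f"{failed} of {n} replicas failed: {state}")
```

Under this rule, one additional failure beyond the tolerated bound moves the system from fully available to completely unavailable; there is no intermediate state.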
It might seem that, with sufficient replication, the probability of a system entering an abnormal state can be virtually eliminated. However, this reasoning assumes that failures are independent. That assumption is often invalidated in practice (e.g., by subtle software bugs, a defective batch of disks, or someone accidentally pressing the power-off button). Furthermore, tolerating an excessive number of failures through conventional replication requires fault tolerance mechanisms to incur prohibitively high resource costs (e.g., in terms of I/O, CPU, and storage) and significantly reduces system performance, making it an impractical choice.
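The effect of the independence assumption can be illustrated with a small calculation. The sketch below is not a model taken from this paper. It assumes a majority-quorum system of n replicas, a per-replica failure probability p, and a deliberately simple, hypothetical common-mode model in which, with probability q, a single event (such as a shared software bug) takes down all replicas at once. All names and parameters are illustrative.

```python
from math import comb


def p_unavailable_independent(n: int, p: float) -> float:
    """Probability that too many of n replicas fail to keep a strict majority,
    assuming each replica fails independently with probability p."""
    first_fatal = (n - 1) // 2 + 1  # smallest number of failures that breaks the quorum
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(first_fatal, n + 1)
    )


def p_unavailable_common_mode(n: int, p: float, q: float) -> float:
    """Same quantity under a hypothetical common-mode model: with probability q
    all replicas fail together; otherwise failures are independent."""
    return q + (1 - q) * p_unavailable_independent(n, p)


if __name__ == "__main__":
    p, q = 0.01, 0.001  # illustrative values, not measurements
    for n in (3, 5, 7, 9):
        indep = p_unavailable_independent(n, p)
        corr = p_unavailable_common_mode(n, p, q)
        print(f"n={n}: independent={indep:.3e}, with common mode={corr:.3e}")
```

In this toy model, adding replicas drives the independent term toward zero, but the overall probability of entering an abnormal state never drops below q, which is why replication alone cannot rule out correlated failures.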