Computing systems host software that runs modern economies and societies, affecting the real-time, day-to-day safety and welfare of millions of people. The continuous availability of many computer systems is essential to a society's functioning, health, and security. For some systems, such as air traffic control and financial systems, even sporadic unavailability lasting only minutes is dangerous and disruptive. Therefore, much thought and effort have been applied to enhancing the availability of such systems and to decreasing downtime in the event of a failure.
The decreasing cost and increasing performance of computer system hardware over time have channeled efforts to increase system availability toward a model in which system components are replicated and geographically dispersed, both to withstand natural disasters and to prevent any single failure from causing a total system failure. In this model, software running on replicated controllers synchronizes and coordinates data movement and data replication among the replicated components to avoid total-failure scenarios and to facilitate expeditious system recovery.