A cluster is typically used to provide a very high degree of availability for computing services. A cluster typically comprises several nodes among which “quorum” must exist. Quorum is a concept employed to enforce one, and only one, official cluster membership of nodes. Restricting quorum to a single collection of cluster nodes prevents a cluster from partitioning into multiple collections of nodes, each operating without knowledge of the others. The danger is that such disjoint collections may access cluster data and services without synchronization, leading to data corruption.
A cluster is said to “have quorum” when a sufficient number of cluster nodes share the same view of the current state of the cluster, as validated by their ability to communicate with one another. From the perspective of an application or an end user, quorum must be maintained in order for the application to function properly. If the cluster loses quorum, it will typically seek to re-establish quorum and, if unable to do so, shut down, terminating the applications under its control.
One common method of establishing quorum is to ensure that a simple majority (i.e., more than 50%) of cluster member nodes are able to communicate with each other. Since two disjoint groups of nodes cannot each contain more than half of the members, there can be only one simple majority in a cluster, and quorum ownership by one and only one group of cluster member nodes is guaranteed. Other methods may also be used to establish and maintain quorum.
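The simple-majority rule can be illustrated with a minimal sketch. The function names `has_quorum` and `quorum_holders` are hypothetical and chosen only for illustration, and the sketch assumes that every node carries one vote and that the partitions passed in are disjoint and together cover the full membership. It is not drawn from any particular cluster product.

```python
from typing import List, Set


def has_quorum(reachable: int, total: int) -> bool:
    """Return True if `reachable` nodes are a strict majority of `total`.

    Assumption: one vote per node. Integer arithmetic avoids rounding
    problems with odd member counts.
    """
    if total <= 0 or not 0 <= reachable <= total:
        raise ValueError("expected 0 <= reachable <= total and total > 0")
    return 2 * reachable > total


def quorum_holders(partitions: List[Set[str]]) -> List[Set[str]]:
    """Return the partitions that hold quorum.

    Assumption: the partitions are disjoint and together contain every
    cluster member, so the total is the sum of their sizes.
    """
    total = sum(len(p) for p in partitions)
    return [p for p in partitions if has_quorum(len(p), total)]


if __name__ == "__main__":
    # Five nodes split 3/2: only the three-node side keeps quorum.
    print(quorum_holders([{"a", "b", "c"}, {"d", "e"}]))
    # Four nodes split 2/2: neither side has a majority.
    print(quorum_holders([{"a", "b"}, {"c", "d"}]))
```

Under these assumptions at most one partition can ever be returned, which reflects the guarantee described above. An even split leaves no partition with quorum, which is one reason methods other than a simple majority may be used.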
It is increasingly common to geographically distribute the nodes of a cluster over long distances in an effort to minimize the loss of cluster services as a result of catastrophic failures, such as large-scale or long-term power failures, natural disasters such as earthquakes or floods, and the like. For example, a company may establish a cluster providing critical computing services and physically locate one portion of the cluster nodes on the United States east coast, another portion on the west coast, and yet another portion in the central states. Such a geographically distributed cluster tends to minimize loss of availability of cluster services even in the event of significant disasters. However, deploying the nodes of such a geographically distributed cluster, as well as the communication pathways between the nodes, can be expensive.