As background art in this technical field, EP1344133A1, U.S. Pat. No. 5,812,748 and U.S. Pat. No. 7,478,263B1 are known, for example.
EP1344133A1 discloses a method for increasing the availability of a first server included in a computer cluster when a second server fails. Each server in the computer cluster has an associated mass storage device and can process requests from any network device in the computer cluster. Data is mirrored between the mass storage devices of the servers so that each server's mass storage device has a complete copy of all computer cluster data. Data mirroring takes place across a dedicated link. When the first server detects a loss of communication from the second server, the first server determines whether the loss of communication is a result of a malfunction of the dedicated link. If the dedicated link has failed, the first server discontinues operation, so that it does not write data to its associated mass storage device that cannot be mirrored due to the loss of communication. If the dedicated link is operational, the first server continues operation.
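The decision described for EP1344133A1 can be illustrated with a minimal sketch. The names `Action` and `on_loss_of_communication` are hypothetical and do not come from the cited document; how the state of the dedicated link is actually tested is not modelled and is passed in as a boolean.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue operation"
    DISCONTINUE = "discontinue operation"


def on_loss_of_communication(dedicated_link_ok: bool) -> Action:
    """Decide what the first server does after losing contact with the second.

    Hypothetical helper: the link check itself is assumed to happen elsewhere.
    """
    if dedicated_link_ok:
        # The link is operational, so the first server keeps serving requests.
        return Action.CONTINUE
    # The link has failed: new writes could not be mirrored, so the server stops.
    return Action.DISCONTINUE
```

For example, `on_loss_of_communication(False)` yields `Action.DISCONTINUE`, which corresponds to the case in which the first server stops rather than write unmirrored data.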
U.S. Pat. No. 5,812,748 discloses a method for improving recovery performance from hardware and software errors in a fault-tolerant computer system. A backup computer system runs a special mass storage access program that communicates with a mass storage emulator program on the network file server, making the disks on the backup computer system appear as if they were disks on the file server computer. Data is mirrored by writing both to the mass storage of the file server and, through the mass storage emulator and the mass storage access program, to the disks on the backup computer, so that a copy of the data on the file server computer is made.
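The mirrored write path described for U.S. Pat. No. 5,812,748 can be sketched as follows. This is an in-memory illustration only: the class and function names are hypothetical, a dictionary stands in for a disk, and a direct method call stands in for the communication between the emulator and the access program.

```python
class Disk:
    """Stand-in for a mass storage device: block number -> data."""

    def __init__(self) -> None:
        self.blocks: dict[int, bytes] = {}

    def write(self, block_no: int, data: bytes) -> None:
        self.blocks[block_no] = data


class MassStorageAccess:
    """Runs on the backup computer and applies received writes to its disk."""

    def __init__(self, disk: Disk) -> None:
        self.disk = disk

    def handle_write(self, block_no: int, data: bytes) -> None:
        self.disk.write(block_no, data)


class MassStorageEmulator:
    """Runs on the file server; looks like a local disk but forwards writes."""

    def __init__(self, access: MassStorageAccess) -> None:
        self.access = access

    def write(self, block_no: int, data: bytes) -> None:
        # A real system would send this over a link; here it is a direct call.
        self.access.handle_write(block_no, data)


def mirrored_write(local: Disk, emulated: MassStorageEmulator,
                   block_no: int, data: bytes) -> None:
    """Write the same block to the file server disk and to the emulated disk."""
    local.write(block_no, data)
    emulated.write(block_no, data)


server_disk, backup_disk = Disk(), Disk()
emulator = MassStorageEmulator(MassStorageAccess(backup_disk))
mirrored_write(server_disk, emulator, 7, b"payload")
```

After the call, both `server_disk.blocks` and `backup_disk.blocks` hold the same block, which is the copy the description refers to.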
U.S. Pat. No. 7,478,263B1 discloses a system and method for permitting bi-directional failover in two-node clusters utilizing quorum-based data replication. In response to detecting an error in its partner, the surviving node establishes itself as the primary of the cluster and sets a first persistent state in its local unit. A temporary epsilon value for quorum voting purposes is then assigned to the surviving node, which causes it to be in quorum. A second persistent state is stored in the local unit and the surviving node comes online as a result of being in quorum.
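The failover sequence described for U.S. Pat. No. 7,478,263B1 can be sketched under stated assumptions. The state names, the epsilon value of 0.5, one vote per node, and the rule that quorum requires strictly more than half of the total votes are all assumptions made for illustration and are not taken from the cited document.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    votes: float = 1.0        # assumed: one vote per node
    epsilon: float = 0.0      # temporary tie-breaking weight
    persistent_states: list[str] = field(default_factory=list)
    online: bool = False

    def in_quorum(self, total_votes: float) -> bool:
        # Assumed rule: strictly more than half of all votes is required.
        return self.votes + self.epsilon > total_votes / 2


def take_over(survivor: Node, total_votes: float = 2.0) -> bool:
    """Run the surviving node through the steps after a partner error."""
    # Step 1: claim the primary role and record a first persistent state.
    survivor.persistent_states.append("PRIMARY_CLAIMED")
    # Step 2: assign a temporary epsilon so one node alone can reach quorum.
    survivor.epsilon = 0.5
    # Step 3: once in quorum, record a second state and come online.
    if survivor.in_quorum(total_votes):
        survivor.persistent_states.append("TAKEOVER_COMPLETE")
        survivor.online = True
    return survivor.online
```

Under these assumptions, a single node with one of two votes is not in quorum on its own, because one vote is not more than half of two; the temporary epsilon is what tips the count in favor of the surviving node.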