High Availability (HA) systems, whether stateful or non-stateful, that are configured with redundant peer processors use either the “Active-Standby” or the “Active-Active” model. In the “Active-Standby” model, one processor provides service (as the Active) while the second, redundant processor (the Standby) waits to assume control should the Active fail or be requested to switch over. In the “Active-Active” model, both processors, located either in the same physical system or in physically separate systems, provide service simultaneously (i.e., both are Active) while each acts as the Standby for the Active work on the peer unit.
In the Active-Active model, the “Default Active” unit is defined as the peer processor that acts as the Active unit for applications using the Active-Standby model; that is, it preserves the Active-Standby model for the set of features and functions that continue to use that model in an Active-Active system. The “Default Standby” unit is defined as the unit that continues to play the Standby role for Active-Standby features and functions in an Active-Active system. Not every HA-enabled application in an Active-Active system needs to implement the Active-Active model.
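The role assignment described above can be sketched as a small lookup. This is a minimal illustration only; the names `Role`, `Model`, and `roles_for` are hypothetical and do not come from any particular HA framework.

```python
from __future__ import annotations

from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class Model(Enum):
    ACTIVE_STANDBY = "active-standby"
    ACTIVE_ACTIVE = "active-active"


def roles_for(app_model: Model, unit_is_default_active: bool) -> set[Role]:
    """Return the roles one unit plays for an application in an Active-Active system."""
    if app_model is Model.ACTIVE_ACTIVE:
        # Each unit serves its own work and backs up the peer's work.
        return {Role.ACTIVE, Role.STANDBY}
    # Active-Standby applications follow the Default Active / Default Standby split.
    return {Role.ACTIVE} if unit_is_default_active else {Role.STANDBY}
```

For an Active-Standby application, the Default Active unit resolves to the Active role only and the Default Standby unit to the Standby role only, while an Active-Active application yields both roles on either unit.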
Peer processors in such systems must be connected via a communication channel (referred to as the “interconnect”). This interconnect can be “soft” or “hard”; that is, it can be either a software communication channel or a hardwired connection. Because detecting a failure of the peer is critical, the interconnect is used to send regular “heartbeat”/“keepalive” signals in each direction so that failures can be detected quickly and a switchover to the remaining operational unit can be effected quickly.
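Heartbeat-based failure detection of this kind is commonly implemented as a timeout on the most recently received heartbeat. The following is a minimal sketch under that assumption; the class `HeartbeatMonitor`, its method names, and the default timeout value are illustrative and are not taken from any specific system.

```python
import time


class HeartbeatMonitor:
    """Tracks heartbeats from the peer and reports a failure after a timeout."""

    def __init__(self, timeout_s: float = 3.0, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumed value; tuned per deployment
        self._clock = clock             # injectable so the logic can be tested
        self._last_seen = clock()

    def on_heartbeat(self) -> None:
        """Record that a heartbeat arrived over the interconnect."""
        self._last_seen = self._clock()

    def peer_failed(self) -> bool:
        """True once no heartbeat has been seen for longer than the timeout."""
        return self._clock() - self._last_seen > self.timeout_s
```

Each unit would run one such monitor for its peer, since heartbeats flow in both directions. A monotonic clock is used so that wall-clock adjustments do not produce false failure reports.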
In a stateful system, the interconnect is also used to send state data from an Active instance to a Standby instance. This keeps the Standby instance synchronized with the state of the Active instance so that the Standby instance can take over without service interruption should the Active instance fail.
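State synchronization over the interconnect can be illustrated with a simple incremental update stream. In this sketch the interconnect is modeled as an in-memory queue, and the sequence numbers used to detect lost updates are an added assumption rather than something the description above specifies; all class names are hypothetical.

```python
from collections import deque


class Interconnect:
    """Stand-in for the interconnect: an in-memory FIFO of messages."""

    def __init__(self):
        self._queue = deque()

    def send(self, msg):
        self._queue.append(msg)

    def receive_all(self):
        while self._queue:
            yield self._queue.popleft()


class ActiveInstance:
    def __init__(self, link):
        self.state = {}
        self._link = link
        self._seq = 0

    def update(self, key, value):
        """Apply a state change locally, then send it to the Standby."""
        self.state[key] = value
        self._seq += 1
        self._link.send((self._seq, key, value))


class StandbyInstance:
    def __init__(self, link):
        self.state = {}
        self._link = link
        self.last_seq = 0

    def sync(self):
        """Apply pending updates in order; a gap means a full resync is needed."""
        for seq, key, value in self._link.receive_all():
            if seq != self.last_seq + 1:
                raise RuntimeError("gap in state updates; full resync needed")
            self.state[key] = value
            self.last_seq = seq
```

After the Standby instance has drained the queue, its state dictionary matches that of the Active instance, which is the condition that allows it to take over without service interruption.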
When a unit detects a failure of its peer, it begins a “switchover”: the unit that detected the failure assumes the Active role for all of the Active work previously being performed by the failed peer.
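The role change performed by the surviving unit can be sketched as follows. This is an illustrative model only; the class `Unit`, its attributes, and the work-item names in the usage note are hypothetical.

```python
class Unit:
    """One peer processor, tracking which work it serves and which it backs up."""

    def __init__(self, name, active_work):
        self.name = name
        self.active_work = set(active_work)  # work this unit serves as Active
        self.standby_work = set()            # peer's Active work this unit backs up

    def switchover(self):
        """Assume the Active role for all work the failed peer was performing."""
        taken = self.standby_work
        self.active_work |= taken
        self.standby_work = set()            # no peer remains to be backed up
        return taken
```

For example, a unit that is Active for "routing" and Standby for the peer's "billing" work would, after calling `switchover()`, be Active for both and Standby for nothing until a peer is restored.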