Many storage networks may implement data replication and/or other redundant data access techniques to protect against data loss and to provide non-disruptive client access. For example, a first storage cluster may comprise a first storage controller configured to provide clients with primary access to data stored within a first storage device and/or other storage devices. A second storage cluster may comprise a second storage controller configured to provide clients with access to data stored within a second storage device (e.g., failover access to replicated data within the second storage device) and/or other storage devices (e.g., primary access to data stored within a third storage device). The first storage controller and the second storage controller may be configured according to a disaster recovery relationship, such that the second storage controller may provide failover access to data that was replicated from the first storage device to the second storage device. For example, a switchover operation may be performed where the second storage controller assumes ownership of the second storage device and/or other storage devices previously owned by the first storage controller, so that the second storage controller may provide clients with failover access to data within such storage devices.
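As an illustration only, the ownership change performed by a switchover operation may be sketched as follows. The class `StorageController`, the function `switchover`, and the device names are hypothetical and are not taken from any particular product. The sketch also assumes, as a simplification, that every storage device previously owned by the failed storage controller is transferred to the surviving storage controller.

```python
from dataclasses import dataclass, field


@dataclass
class StorageController:
    """Hypothetical model of a storage controller and the devices it owns."""
    name: str
    owned_devices: set = field(default_factory=set)
    failed: bool = False


def switchover(failed_ctrl, surviving_ctrl):
    """Transfer ownership of the failed controller's devices to the survivor.

    Returns the set of devices whose ownership changed.
    """
    if not failed_ctrl.failed:
        raise ValueError("switchover requested for a healthy controller")
    moved = set(failed_ctrl.owned_devices)
    surviving_ctrl.owned_devices |= moved   # survivor now serves these devices
    failed_ctrl.owned_devices.clear()       # failed controller owns nothing
    return moved


# Assumed layout: the first controller owns the first device and its replica
# (the second device); the second controller has primary access to a third.
first = StorageController("first", {"first_device", "second_device"})
second = StorageController("second", {"third_device"})

first.failed = True
moved = switchover(first, second)
```

After the call, the surviving storage controller in this sketch owns all three devices and may serve failover access to the replicated data, while the guard against switching over a healthy storage controller reflects why reliable failure detection may matter.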
In an example, the second storage cluster may be located at a site remote from the first storage cluster (e.g., the storage clusters may be located in different buildings or cities, or may be thousands of kilometers from one another). Thus, if a disaster occurs at the site of one storage cluster, then the surviving storage cluster may remain unaffected by the disaster (e.g., a power outage of a building hosting the first storage cluster may not affect a second building hosting the second storage cluster in a different city).
If the first storage cluster comprises merely the first storage controller and the second storage cluster comprises merely the second storage controller (e.g., single storage controller cluster configurations, which may be cost effective), then there may not be local high availability storage controllers paired with the first storage controller or the second storage controller that could otherwise provide relatively fast local takeover for a failed storage controller, and thus non-disruptive client access to data of the failed storage controller (e.g., if the first storage cluster comprised a third storage controller having a high availability pairing with the first storage controller, then the third storage controller could quickly perform a local takeover for the first storage controller in the event the first storage controller fails). Instead, a cross-cluster switchover operation may need to be performed if a storage controller fails. Cross-cluster remote detection of a storage controller failure (e.g., the second storage controller, within the second storage cluster, detecting a failure of the first storage controller within the first storage cluster) may utilize timeouts, manual switchover, and/or other relatively slow or imprecise techniques that may not provide adequate non-disruptive client access to data (e.g., a client may lose access to data for more than 2 minutes while waiting on a manual switchover from a failed storage controller to a surviving storage controller). Thus, it may be advantageous to quickly and reliably detect storage controller failure across clusters so that switchover operations may be implemented automatically.
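To illustrate why timeout-based detection may be relatively slow, the following minimal sketch models a remote storage controller that presumes its peer has failed only after no heartbeat has arrived for a full timeout period. The names `HeartbeatMonitor` and `detection_delay`, the 30 second timeout, and the 5 second polling interval are assumptions chosen for illustration and do not describe any particular implementation.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class HeartbeatMonitor:
    """Hypothetical timeout-based failure detector for a remote peer."""
    timeout_s: int
    last_heartbeat_s: int = 0

    def record_heartbeat(self, now_s):
        self.last_heartbeat_s = now_s

    def peer_presumed_failed(self, now_s):
        # The peer is presumed failed only once the silence exceeds the timeout.
        return (now_s - self.last_heartbeat_s) > self.timeout_s


def detection_delay(monitor, failure_time_s, poll_interval_s):
    """Seconds between the peer's failure and the first poll that reports it.

    Assumes the last heartbeat arrived at the moment of failure and that the
    monitor polls at fixed multiples of poll_interval_s starting from zero.
    """
    monitor.record_heartbeat(failure_time_s)
    for k in count():
        now_s = k * poll_interval_s
        if now_s >= failure_time_s and monitor.peer_presumed_failed(now_s):
            return now_s - failure_time_s


monitor = HeartbeatMonitor(timeout_s=30)
delay = detection_delay(monitor, failure_time_s=10, poll_interval_s=5)  # 35
```

In this sketch the detection delay can never be shorter than the timeout itself, and shortening the timeout to reduce that delay may increase the risk of mistaking a slow or partitioned link for a failed storage controller, which is the imprecision noted above.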