A “storage cluster” environment has one or more hosts connected to two primary (or protection) storage systems that are clustered as shown in FIG. 1. For simplicity, referring to FIG. 1, assume that host 101 (with multi-pathing) is connected to two storage systems 102-103. Typically, clustered storage systems 102-103 may be connected to host 101 in an active/active configuration. In such an active/active configuration, host 101 sends input and output (IO) requests to both storage systems, and data flows between storage systems 102-103. If one system fails, the host works with the surviving system without disruption.
Alternatively, the storage systems can be configured in an active/passive configuration. In such an active/passive configuration, host 101 sends the IO requests to active storage system 102, which updates passive system 103. When active system 102 fails, host 101 may crash, and a user has to switch passive system 103 to be active and reboot host 101, which causes an interruption.
Furthermore, the storage systems can be configured in an active/hot-standby (HS) configuration. In such an active/HS configuration, host 101 sends IO requests to active system 102, which updates HS system 103. Host 101 is also connected to HS system 103, whose storage devices are ready and discovered by host 101 but do not service commands from host 101. When active system 102 fails, HS system 103 must verify that active system 102 is down and then quickly, with no customer impact, become active. If HS system 103 becomes active while the original active system 102 is still active, errors will result. There has been a lack of an efficient way to verify whether the original active storage system is actually down without a third-party witness.
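For illustration only, the failure behavior of the three configurations described above can be summarized in a short sketch. The names used here (Mode, failover_outcome, active_confirmed_down) are hypothetical and do not come from the described systems; the sketch only restates the outcomes given in the text and does not show how an HS system would actually verify that the active system is down.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the three cluster configurations."""
    ACTIVE_ACTIVE = "active/active"
    ACTIVE_PASSIVE = "active/passive"
    ACTIVE_HS = "active/HS"


def failover_outcome(mode: Mode, active_confirmed_down: bool = False) -> str:
    """Describe what follows an apparent failure of active system 102.

    active_confirmed_down is an assumed input: whether HS system 103 has
    verified that system 102 is really down. It matters only in active/HS.
    """
    if mode is Mode.ACTIVE_ACTIVE:
        # The host already sends IO to both systems, so it keeps working.
        return "host continues on the surviving system without disruption"
    if mode is Mode.ACTIVE_PASSIVE:
        # The passive system is switched manually and the host is rebooted.
        return "user switches passive system to active and reboots the host"
    if active_confirmed_down:
        # Safe to promote: only one system will be active.
        return "HS system becomes active with no customer impact"
    # Promoting now could leave two active systems, which causes errors.
    return "HS system must not become active yet"


if __name__ == "__main__":
    for mode in Mode:
        print(mode.value, "->", failover_outcome(mode))
```

The active/HS branch makes the difficulty explicit: the outcome depends entirely on a reliable answer to whether the original active system is down, which is the verification problem noted above.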