SAN systems are primarily used to enhance the accessibility and availability of data preserved in storage devices, such as disk arrays, tape libraries and optical jukeboxes, by making those storage devices appear to be locally attached to the servers, or to the operating system(s) within the servers, in an enterprise setting. A SAN system therefore typically has its own network of storage devices that is generally not accessible through the local area network (LAN) by other devices. Because of the lowered cost and complexity of SAN systems, they have been widely employed, from the enterprise level down to small businesses, since the early 2000s.
A basic SAN system includes three major components: a SAN switch, a plurality of storage devices and at least one server. High-speed cables using Fibre Channel (FC) technology connect the various components together. In most real-world situations, a SAN system includes many different switches, storage devices and servers, and it may further include routers, bridges and gateways to extend the function of the SAN system. Therefore, the topology of a SAN system depends on its size and purpose, and the complexity of SAN topologies has grown over time.
The disclosed SAN system includes a number of servers coupled with a number of storage devices via a number of SAN switches, wherein an availability device connects to the SAN switches, such that the availability device can communicate through the SAN switches to manage the various routes between the servers and the storage devices. Through this management, accessibility and availability between the servers and the storage devices are maintained. An availability device includes a number of special-purpose devices, called “availability engines”, which are clustered together to manage the storage devices mounted on the SAN system.
An event can occur in which the FC connections between two clustered engines are broken, while both engines can still detect their respective local FC nodes, for example, the servers and storage devices located at their own sites. This event is called an isolation of FC connectivity, or “FC isolation”.
A high availability (HA) engine is designed to take over the engine cluster in the event of a disastrous situation in which the remote site goes down. An FC isolation scenario causes a situation where a local engine cannot detect the remote engine at the other site; the local engine therefore considers the remote engine to be down. As a result of the FC isolation, both engines continue operating with their local FC nodes, and data synchronization between the local site and the remote site diverges. This condition is called “split-brain”.
When a split-brain condition occurs, each engine considers the local mirror at its own site as active, and the mirror members at the other site as missing. With each site continuing to operate under the assumption that the other site is down, the mirror members begin to diverge due to ongoing local-host writes, through the local FC switch, to local storage. Thus, the mirror members' contents become inconsistent with each other. This is dangerous and undesirable. In addition, the effects of these problems usually do not become obvious immediately: host IO at each site continues to run as if conditions were normal, without any indication that the data is “split-brained”.
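The divergence described above can be illustrated with a minimal sketch. The following Python model is purely hypothetical and is not the disclosed method: the class `MirrorMember`, the method `apply_write` and the helper `is_split_brained` are invented names used to show how two mirror members, each continuing to accept local-host writes after FC isolation, end up with inconsistent contents.

```python
# Illustrative sketch only (hypothetical names, not the disclosed system):
# two mirror members replicate writes while the FC link is up, then
# diverge silently once FC isolation stops replication.

class MirrorMember:
    """A simplified mirror member holding block data for one site."""
    def __init__(self, site):
        self.site = site
        self.blocks = {}            # block address -> data
        self.peer_reachable = True  # state of the FC link to the peer

    def apply_write(self, block, data, peer):
        """Apply a local-host write; replicate only while the FC link is up."""
        self.blocks[block] = data
        if self.peer_reachable:
            peer.blocks[block] = data  # normal synchronous mirroring

def is_split_brained(a, b):
    """Mirror members are split-brained when their contents disagree."""
    return a.blocks != b.blocks

site_a = MirrorMember("A")
site_b = MirrorMember("B")

# Before isolation: writes replicate, so the mirrors stay consistent.
site_a.apply_write(0x10, "v1", site_b)
assert not is_split_brained(site_a, site_b)

# FC isolation: each engine keeps serving its local hosts, but writes
# no longer reach the other site, so the contents quietly diverge.
site_a.peer_reachable = False
site_b.peer_reachable = False
site_a.apply_write(0x10, "v2-from-A", site_b)
site_b.apply_write(0x10, "v2-from-B", site_a)
assert is_split_brained(site_a, site_b)
```

Note that in this sketch, as in the description above, neither site receives any error: every local write succeeds, which is why the inconsistency is not immediately obvious to the hosts.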
Therefore, there is a need for a method of site isolation protection, and for an electronic device and a system using the same method, to solve split-brain problems between different sites.