Multi-node storage systems are a class of data storage systems which employ a plurality of computers to store and manage data in a distributed manner. Specifically, a multi-node storage system is formed from a plurality of disk nodes and a control node which are interconnected by a network. Under the control of the control node, the system provides virtual disks, or logical disks, for access to stored data that is physically distributed across multiple disk nodes.
More specifically, a logical disk in a multi-node storage system is divided into a plurality of segments. Disk nodes, on the other hand, have their respective local storage devices, the space of which is divided into a plurality of slices. Here the slice size is equal to the segment size. The control node assigns one slice to each segment of a logical disk and informs client computers, or access nodes, of the resulting associations between the slices and segments. An access node issues a write request for a specific segment by sending write data to the disk node that manages the slice corresponding to that segment. Upon receipt, the disk node stores the received data in the relevant slice of its storage device.
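The segment-to-slice assignment described above can be illustrated with a minimal sketch. All names here (DiskNode, ControlNode, AccessNode, SLICE_SIZE) are hypothetical and chosen only for illustration. The sketch assumes a simple first-free allocation policy, and the network exchange between nodes is reduced to direct method calls.

```python
from dataclasses import dataclass, field

SLICE_SIZE = 4  # assumed slice size in bytes; equal to the segment size


@dataclass
class DiskNode:
    """A disk node whose local storage is divided into fixed-size slices."""
    name: str
    num_slices: int
    storage: dict = field(default_factory=dict)  # slice index -> bytes

    def write(self, slice_index, data):
        if not 0 <= slice_index < self.num_slices:
            raise IndexError("no such slice on this disk node")
        if len(data) > SLICE_SIZE:
            raise ValueError("data does not fit in one slice")
        self.storage[slice_index] = data


class ControlNode:
    """Assigns one slice to each segment and keeps the association table."""

    def __init__(self, disk_nodes):
        self.disk_nodes = {n.name: n for n in disk_nodes}
        # Free slices in first-free order (an assumed allocation policy).
        self.free = [(n.name, i) for n in disk_nodes for i in range(n.num_slices)]
        self.table = {}  # (logical disk, segment) -> (node name, slice index)

    def assign(self, logical_disk, segment):
        key = (logical_disk, segment)
        if key not in self.table:
            self.table[key] = self.free.pop(0)
        return self.table[key]


class AccessNode:
    """Client that learns associations from the control node, then writes
    to the disk node that manages the corresponding slice."""

    def __init__(self, control):
        self.control = control

    def write(self, logical_disk, segment, data):
        node_name, slice_index = self.control.assign(logical_disk, segment)
        self.control.disk_nodes[node_name].write(slice_index, data)
        return node_name, slice_index
```

In this sketch, writing twice to the same segment lands in the same slice, because the association is fixed once the control node has made it.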
The above-described multi-node storage system is scalable. That is, it is possible to expand the managed data capacity by placing additional disk nodes on the network.
The multi-node storage system may also be configured to allocate a plurality of slices to one segment. In the case of two slices per segment, one slice is designated as a primary slice, and the other slice as a secondary slice. The primary slice is a slice to which the access nodes direct their read and write requests. The secondary slice is where the primary slice is mirrored (i.e., the same write data is written in both slices). Such mirrored slice pairs ensure the redundancy of data in the system.
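The primary and secondary roles can be sketched as follows. The Slice and MirroredSegment classes are hypothetical names introduced for illustration, and the sketch assumes the mirror copy is written synchronously as part of the same write call, which the text above does not specify.

```python
class Slice:
    """A single slice on some disk node; holds the bytes last written."""

    def __init__(self, node_name):
        self.node_name = node_name
        self.data = b""


class MirroredSegment:
    """A segment backed by two slices: a primary and a secondary."""

    def __init__(self, primary, secondary):
        if primary.node_name == secondary.node_name:
            # Assumption: mirroring only adds redundancy across nodes.
            raise ValueError("primary and secondary should be on different nodes")
        self.primary = primary
        self.secondary = secondary

    def write(self, data):
        # Access nodes direct the write to the primary slice...
        self.primary.data = data
        # ...and the same write data is mirrored to the secondary slice.
        self.secondary.data = data

    def read(self):
        # Reads are served from the primary slice only.
        return self.primary.data
```

After a write, both slices hold identical data, so either copy can serve as the source if the other is lost.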
A disk node may encounter a failure in its storage devices or some other problem. Upon detection of such an anomaly, a process is invoked to restore the redundancy of stored data. For example, a redundancy restoration process first detaches the failed node from the system. Since some other disk node holds a copy of the data stored in the failed node, the process then duplicates that copy onto a different node, thus regaining the data redundancy.
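A redundancy restoration process of this kind might look roughly like the sketch below. The function restore_redundancy and its placement table (segment number mapped to the list of node names holding a copy) are assumptions made for illustration. The choice of the first eligible healthy node as the copy target is likewise an arbitrary policy, and the order of names within each list carries no meaning here.

```python
def restore_redundancy(placement, failed, nodes):
    """Detach `failed` and re-mirror every segment that lost a copy.

    placement: dict mapping segment -> list of node names holding a copy
    failed:    name of the disk node being detached
    nodes:     all disk node names known to the system
    Returns a new placement dict; the input is left unmodified.
    """
    candidates = [n for n in nodes if n != failed]
    restored = {}
    for segment, holders in placement.items():
        survivors = [n for n in holders if n != failed]
        if not survivors:
            raise RuntimeError(f"segment {segment}: no surviving copy")
        if len(survivors) < len(holders):
            # Pick a node that does not already hold this segment.
            target = next((n for n in candidates if n not in survivors), None)
            if target is None:
                raise RuntimeError(f"segment {segment}: no node for new copy")
            survivors.append(target)  # stands in for copying the slice data
        restored[segment] = survivors
    return restored
```

For example, with three nodes and segments mirrored pairwise, detaching one node leaves every affected segment with a fresh second copy on one of the two remaining nodes.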
Once disconnected from the system, a disk node is no longer accessible to other nodes. For this reason, unconditionally disconnecting a faulty disk node leads to permanent loss of data if another storage device fails during the redundancy restoration process. Suppose, for example, that data has originally been stored in first and second storage devices. When the first storage device fails, the system initiates a redundancy restoration process. If the second storage device fails during the redundancy restoration process, the data will be lost as a result of the consequent disconnection of the disk node managing the second storage device.
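The double-failure scenario above can be traced with a small sketch. The node names dn1 and dn2 and both functions are hypothetical. The sketch models only which copies remain reachable after each unconditional disconnection, not the actual copy traffic.

```python
def reachable_copies(holders, detached):
    """Nodes that hold a copy of the data and are still attached."""
    return set(holders) - set(detached)


def simulate_double_failure():
    """Trace reachable copies as two failures are handled unconditionally."""
    holders = {"dn1", "dn2"}  # data originally stored on both nodes
    detached = set()
    timeline = []

    # First storage device fails: its node is detached, restoration begins.
    detached.add("dn1")
    timeline.append(sorted(reachable_copies(holders, detached)))

    # Second storage device fails before restoration has finished:
    # detaching its node as well leaves no reachable copy.
    detached.add("dn2")
    timeline.append(sorted(reachable_copies(holders, detached)))
    return timeline
```

The second entry of the returned timeline is empty, which corresponds to the permanent data loss described above.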