When a fault occurs in a disk apparatus of a redundant storage system such as a redundant array of independent disks (RAID) of level 3 or 5, the storage system disconnects the disk apparatus where the fault has occurred. After the disconnection, the storage system records and updates data using only the remaining disk apparatuses.
If another fault subsequently occurs at another disk apparatus, the storage system also disconnects that disk apparatus. In the following description, the state in which a fault at one disk apparatus has caused the storage system to lose its redundancy and another disk apparatus is thereafter further disconnected will be referred to as a “multi-dead state”.
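The progression from a redundant state to a multi-dead state can be illustrated with a minimal sketch. The names ArrayState, RedundantArray, and report_fault are hypothetical and introduced only for illustration; the sketch assumes a single-parity array that tolerates the loss of one disk apparatus and is not taken from any particular storage system.

```python
from enum import Enum


class ArrayState(Enum):
    REDUNDANT = "redundant"    # all disk apparatuses connected
    DEGRADED = "degraded"      # one disconnected, redundancy lost
    MULTI_DEAD = "multi-dead"  # a further disconnection after redundancy was lost


class RedundantArray:
    """Hypothetical model of a single-parity array (assumption: RAID-5 style)."""

    def __init__(self, disk_count):
        self.disk_count = disk_count
        self.disconnected = []  # disk indices, in order of disconnection

    def report_fault(self, disk):
        """Disconnect the disk apparatus where the fault occurred."""
        if not 0 <= disk < self.disk_count:
            raise ValueError("unknown disk index")
        if disk not in self.disconnected:
            self.disconnected.append(disk)

    @property
    def state(self):
        if not self.disconnected:
            return ArrayState.REDUNDANT
        if len(self.disconnected) == 1:
            return ArrayState.DEGRADED
        return ArrayState.MULTI_DEAD
```

Keeping the disconnection order in the sketch reflects that the disk apparatus disconnected first has missed every update made since its disconnection.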
A fault that causes the disconnection of a disk apparatus can be, for example, thermal off-tracking, contamination, noise, or poor electrical contact. Such a fault can often be cleared by resetting the corresponding hardware or by power cycling the disk apparatus (turning the power off and then on again).
By resetting the hardware or by turning the power off and on for a disk apparatus that has been disconnected due to a fault, the disk apparatus can be restored as a disk apparatus that operates normally. Therefore, when a storage system in a multi-dead state is to be restored, for example, the hardware of the storage system is reset, whereby the storage system is returned to the state maintained before the storage system entered the multi-dead state.
Documents disclosing techniques to restore a storage system having a disk apparatus that has failed include, for example, Japanese Laid-Open Patent Publication Nos. H11-95933, 2005-78430, and 2010-26812.
However, the conventional techniques have a problem: even if the storage system is returned, by resetting the hardware or the like, to the state maintained before the storage system entered the multi-dead state, data corruption may occur due to inconsistency of data among the disk apparatuses.
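One way such an inconsistency can arise is sketched below under stated assumptions: a single stripe protected by XOR parity, in which a write addressed to the first disconnected disk apparatus is recorded only through the parity on the remaining disk apparatuses. The function names and block values are hypothetical, and the scenario is an illustration rather than a description of any of the cited techniques.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the XOR of all integer blocks (single-parity, RAID-5 style)."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


def simulate_multi_dead_restore():
    # One stripe: disks 0..2 hold data, disk 3 holds parity (assumed layout).
    data = [0x11, 0x22, 0x33]
    stripe = data + [xor_parity(data)]
    consistent_before = xor_parity(stripe) == 0

    # Disk 0 faults and is disconnected; its contents stay frozen at 0x11.
    # A later write addressed to disk 0's block is recorded on the remaining
    # disks only, by recomputing parity as if disk 0 held the new value.
    new_block0 = 0x7F
    stripe[3] = xor_parity([new_block0, stripe[1], stripe[2]])

    # While degraded, block 0 is reconstructed from the surviving disks.
    degraded_read = xor_parity(stripe[1:])

    # Disk 1 then faults as well (multi-dead state). A hardware reset brings
    # every disk back, but disk 0 still holds the value from before its fault.
    read_after_restore = stripe[0]
    consistent_after = xor_parity(stripe) == 0

    return {
        "consistent_before": consistent_before,
        "degraded_read": degraded_read,
        "read_after_restore": read_after_restore,
        "consistent_after": consistent_after,
    }
```

In this sketch the stripe is consistent before the first fault, the degraded read returns the updated value, and after the reset the restored disk 0 returns its old value and no longer agrees with the parity.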