1. Field of the Invention
The present invention relates to a RAID control apparatus, a RAID control program and a RAID control method that recover data when a trouble occurs in a RAID (Redundant Arrays of Inexpensive Disks) apparatus.
2. Description of the Related Art
FIG. 4 is a block diagram showing the configuration of a conventional RAID apparatus. This RAID apparatus comprises RAID control apparatuses 111a and 111b and disk drives (hereinafter, disks) 21a, 21b, 21c, 21d and 21e. The RAID control apparatuses 111a and 111b control the disks 21a, 21b, 21c, 21d and 21e. The disks 21a, 21b, 21c, 21d and 21e constitute a RAID group of RAID 5.
In the RAID apparatus, two or more of the disks that constitute the RAID group may develop errors (that is, the RAID apparatus may assume a multi dead/multi unmount state). In this case, the redundancy of the RAID configuration is lost, and the RAID apparatus can no longer recover data.
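The loss of redundancy can be illustrated with a minimal sketch of one RAID 5 stripe, assuming four data blocks and one XOR parity block spread over five disks. The function names and block values are hypothetical and serve only as an illustration; they are not taken from the apparatus of FIG. 4.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across five disks: four data blocks plus one parity block.
# The block values are arbitrary illustration data.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0", b"\xaa\x55"]
stripe = data + [xor_blocks(data)]


def recover_stripe(stripe, failed):
    """Rebuild the blocks of the failed disk positions.

    Returns the full stripe if zero or one position failed, and None if
    two or more failed, since a single parity block cannot restore them.
    """
    if not failed:
        return list(stripe)
    if len(failed) > 1:
        return None  # redundancy is lost
    (lost,) = failed
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    rebuilt = list(stripe)
    rebuilt[lost] = xor_blocks(survivors)  # XOR of survivors gives the lost block
    return rebuilt
```

With one failed position the stripe is restored from the surviving blocks; with two failed positions the sketch returns None, which corresponds to the multi dead/multi unmount state.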
A conventional multi-dead/multi-unmount recovery method will be described. FIG. 5 is a table that shows the order in which the disks are recovered in this multi-dead/multi-unmount recovery method. The table shows the serial numbers of the disks, the order in which the disks came to have errors, the order in which the disks are recovered, and the methods for recovering data in the respective disks. In the multi-dead/multi-unmount recovery method, the disks (of the RAID group) having errors are incorporated into the system in the reverse of the order in which they came to have errors. Thus, the disk that came to have errors last is incorporated into the system first, and the disk that came to have errors first is incorporated into the system last. The disk that came to have errors first is replaced last and is then recovered from the errors in a rebuild process.
This multi-dead/multi-unmount recovery method can restore the RAID apparatus to the state that the apparatus had immediately before the process of writing data to any disk stopped.
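The recovery order described above can be sketched as follows. This is a minimal illustration that assumes the failure order is known as a list of disk identifiers. The function name and the method labels "reincorporate" and "replace and rebuild" are hypothetical and are not taken from the table of FIG. 5.

```python
def plan_recovery(failure_order):
    """Return (disk, method) pairs in the order the disks are recovered.

    failure_order lists disk identifiers from the first disk that came
    to have errors to the last one.
    """
    if not failure_order:
        return []
    first_failed = failure_order[0]
    # Disks that failed later are incorporated first, in reverse order.
    plan = [(disk, "reincorporate") for disk in reversed(failure_order[1:])]
    # The disk that failed first is handled last: it is replaced and rebuilt.
    plan.append((first_failed, "replace and rebuild"))
    return plan
```

For example, if disks 21b, 21d and 21a came to have errors in that order, the sketch yields 21a, then 21d, then 21b, with only 21b replaced and rebuilt.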
Jpn. Pat. Appln. Laid-Open Publication No. 8-249130 discloses a prior-art technique that is relevant to the present invention. The publication discloses a trouble-detecting system. In the trouble-detecting system, the first controller, when it makes access to the memory it manages, also makes a request for access to the memory managed by the second controller. If the first controller receives no access permission, a trouble is considered to have occurred in the second controller.
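The detection rule of that publication, as summarized above, can be pictured with a minimal sketch. The class, method and attribute names are hypothetical, and a controller in trouble is modeled simply as one that grants no access permission; the publication itself may work differently in detail.

```python
class Controller:
    """Hypothetical controller that manages its own memory."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def grant_access(self):
        """Return True if access permission is given to the requester."""
        # Assumption: a controller in trouble never grants permission.
        return self.healthy


def peer_has_trouble(first, second):
    """Model one access by `first` to the memory it manages.

    Along with that access, `first` requests access to the memory managed
    by `second`. If no permission is received, a trouble is considered to
    have occurred in `second`.
    """
    permitted = second.grant_access()
    return not permitted
```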
The multi-dead/multi-unmount recovery method described above works well only if the multi dead/multi unmount state does not result from troubles in the RAID control apparatus itself, for example, if the state results from a trouble on the FC (Fiber Channel) loop.
If the multi dead/multi unmount state results from a trouble in the RAID control apparatus, the RAID configuration can still be recovered by the conventional multi-dead/multi-unmount recovery method. Nevertheless, the RAID configuration will probably have a similar trouble right after it is so recovered. Not only will much time be required to recover the system, but the data will also most likely be changed or lost while the system is being recovered.
A multi dead/multi unmount state may result from troubles in the RAID control apparatus if a trouble develops in the device that controls disk mounting or in a signal line provided in the RAID control apparatus. In that case, the system cannot locate any of the disks. The LEDs on both the disk drives and the RAID control apparatus may then be turned on, informing the user of the error. However, the user cannot determine whether the errors have resulted from a trouble in the RAID control apparatus.