1. Field of the Invention
The present invention relates to a RAID system that stores data redundantly on a plurality of disk devices and performs Rebuild/Copy back processing in which, when some of the disk devices fail, the redundant configuration is reconstructed using data from the other disk devices, and to a Rebuild/Copy back processing method thereof, and more particularly to a RAID system and a Rebuild/Copy back processing method thereof that rebuild/copy back the data while receiving host I/O.
2. Description of the Related Art
With the recent computerization of the various kinds of data handled by computers, data storage devices (external storage devices) that can store large volumes of data efficiently and with high reliability, independently of the host computer that processes the data, are becoming increasingly important.
As such data storage devices, a disk array device having many disk devices (e.g. magnetic disk devices, optical disk devices) and a disk controller for controlling these many disk devices is used. This disk array device implements the redundancy of data by using a RAID configuration so as to improve reliability.
In this disk array device, if a disk device constituting the RAID group fails and loses redundancy, redundancy recovery is required. FIG. 8 is a diagram depicting the rebuild function of RAID 5 for recovering this redundancy. For active maintenance, a spare disk device HS (Hot Spare Disk) is installed in addition to the four disk devices #0, #1, #2 and #3 constituting the RAID 5.
This disk device group 160 is connected to a pair of disk controllers 110 and 120. Each disk controller 110/120 has a disk adapter 140 for controlling the interface with the disk device group 160, a control unit 120, and a channel adapter 100 for controlling the interface with the host (not illustrated).
If disk device #0, one of the four disk devices constituting the RAID 5, fails, the data of the remaining disk devices #1, #2 and #3 is read into the cache memory (not illustrated) of the control unit 120 via the disk adapter 140, and an XOR operation is performed on that data to create the redundant data.
The created redundant data is then written to the spare disk device HS through the disk adapter 140 to recover redundancy. This is called the “rebuild function”.
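The XOR-based reconstruction described above can be illustrated with a minimal sketch. The function name `xor_blocks` and the four-byte sample blocks are assumptions chosen for illustration and do not appear in the related art; the sketch also simplifies RAID 5 by treating one stripe in which the parity block happens to reside on disk device #3.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) groups the i-th byte of every block into one column
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Illustrative stripe: data blocks on disk devices #0, #1, #2 and parity on #3
d0 = bytes([0x11, 0x22, 0x33, 0x44])
d1 = bytes([0xA0, 0xB0, 0xC0, 0xD0])
d2 = bytes([0x0F, 0x0E, 0x0D, 0x0C])
parity = xor_blocks([d0, d1, d2])

# Disk device #0 fails: its block is recreated from the three surviving blocks
rebuilt = xor_blocks([d1, d2, parity])
```

Because XOR is its own inverse, combining the surviving blocks of a stripe yields the missing block, which is then the data written to the spare disk device HS.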
FIG. 9 is a diagram depicting the copy back function. If the failed disk device #0 is replaced with a new disk device New after the rebuild in FIG. 8 has completed, Copy back processing is performed. In other words, the redundant data written to the spare disk device HS is copied to the disk device New.
It is desirable to execute such rebuild/copy back processing while processing an I/O request from the host, and a method to balance the number of these requests has been proposed (e.g. Japanese Patent Application Laid-Open No. 2004-295860).
To perform Rebuild/Copy back processing while processing I/O requests from the host, the processing for an entire disk device cannot be executed all at once. The processing size executed at one time is therefore fixed: data of this fixed size is read from the normal disk device and written to the write destination disk device, and this operation is repeated as many times as needed to cover the entire data volume of the disk device.
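The fixed-size repetition described above can be sketched as follows. This is a simplified illustration, not the method of any particular controller: the names `fixed_size_units` and `rebuild_in_units`, the use of Python lists to stand in for disk devices, and the unit size of four blocks are all assumptions.

```python
def fixed_size_units(total_blocks, unit_blocks):
    """Yield (start, count) pairs covering total_blocks in fixed-size units."""
    if total_blocks < 0 or unit_blocks <= 0:
        raise ValueError("total_blocks must be >= 0 and unit_blocks > 0")
    start = 0
    while start < total_blocks:
        # The last unit may be shorter than the fixed processing size
        count = min(unit_blocks, total_blocks - start)
        yield start, count
        start += count

def rebuild_in_units(source, destination, unit_blocks):
    """Copy source to destination one unit at a time; return the number of passes."""
    passes = 0
    for start, count in fixed_size_units(len(source), unit_blocks):
        # Read one unit from the normal disk and write it to the destination
        destination[start:start + count] = source[start:start + count]
        passes += 1
        # In a real controller, host I/O requests would be serviced between units
    return passes

source = list(range(10))          # stand-in for a 10-block normal disk device
destination = [None] * 10         # stand-in for the write destination disk device
passes = rebuild_in_units(source, destination, unit_blocks=4)
```

With these illustrative values, the ten blocks are covered in three passes of four, four and two blocks. The number of passes depends only on the fixed unit size and the disk capacity, not on whether host I/O is present.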
In the case of rebuild, for example, data is read from the normal disk devices, redundant data is created unless the configuration is RAID 1 (mirroring), and the result is written to the spare disk device HS or a new disk device New. In the case of copy back, data is read from the spare disk device HS and written to the new disk device New.
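The difference between these cases, namely which data is written to the destination for one processing unit, can be summarized in a short sketch. The function `data_to_write`, its string arguments, and the two-byte sample blocks are hypothetical and serve only to illustrate the distinction drawn above.

```python
def data_to_write(operation, raid_level, source_blocks):
    """Return the block written to the destination disk for one unit (illustrative)."""
    if operation == "copy_back":
        # Copy back: the single source is the spare disk device HS
        (block,) = source_blocks
        return block
    if operation == "rebuild":
        if raid_level == 1:
            # Mirroring: the surviving mirror already holds the data, no XOR needed
            (block,) = source_blocks
            return block
        # Parity-based levels such as RAID 5: XOR the surviving blocks of the stripe
        result = bytearray(len(source_blocks[0]))
        for block in source_blocks:
            for i, byte in enumerate(block):
                result[i] ^= byte
        return bytes(result)
    raise ValueError(f"unknown operation: {operation}")

mirror_copy = data_to_write("rebuild", 1, [b"\x01\x02"])
parity_rebuild = data_to_write("rebuild", 5, [b"\x01\x02", b"\x03\x04", b"\x05\x06"])
copied_back = data_to_write("copy_back", 5, [b"\xAA\xBB"])
```

Only the parity-based rebuild involves computation on the data that was read; the RAID 1 rebuild and the copy back are plain copies from a single source.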
Conventionally, the processing size for each execution of these operations is fixed and does not change with the load status of the RAID group. Consequently, in a system that is tuned to balance disk access by normal I/O (access from the host) against disk access by Rebuild/Copy back when normal I/O is present, that is, when a load is being applied, the performance of the disk devices cannot be fully utilized when Rebuild/Copy back processing is executed while normal I/O is absent.
Therefore, when normal I/O is absent, Rebuild/Copy back takes longer to complete than the performance of the disk devices would lead one to expect. Because the storage capacities of disk devices have been increasing recently, shortening the time until Rebuild/Copy back completes is becoming a critical issue.