1. Field of the Invention
The present invention relates to a RAID system in which data is stored with redundancy in a plurality of disk devices and in which, when a part of the disk devices fails, the redundant configuration is reconstructed by Rebuild and Copy back using data from the other disk devices, and to a RAID controller and a Rebuild/Copy back processing method thereof, and more particularly to a RAID system for decreasing the Rebuild and Copy back processing time, and to a RAID controller and a Rebuild/Copy back processing method thereof.
2. Description of the Related Art
As data of all kinds is increasingly computerized and handled on computers, data storage devices (external storage devices) that can store large volumes of data efficiently and with high reliability, independently of the host computer that executes data processing, are becoming increasingly important.
As such a data storage device, a disk array device having many disk devices (e.g. magnetic disk devices, optical disk devices) and a disk controller for controlling these disk devices is used. This disk array device implements data redundancy by using a RAID (Redundant Array of Inexpensive Disks) configuration so as to improve reliability.
In such a disk array device, if a disk device constituting a RAID group fails and redundancy is lost, the redundancy must be recovered. FIG. 14 is a diagram depicting the Rebuild function of RAID 5 for such redundancy recovery. For active maintenance, a spare disk device HS (Hot Spare Disk) is installed in addition to the four disk devices #0, #1, #2 and #3 constituting RAID 5.
This disk device group 160 is connected to a pair of disk controllers 110. Each disk controller 110 has a disk adapter 140 for controlling the interface with the disk device group 160, a control unit 120, and a channel adapter 100 for controlling the interface with the host (not illustrated).
If the disk device #0 of the four disk devices constituting RAID 5 fails, the data of the remaining disk devices #1, #2 and #3 is read to the cache memory or data buffer (not illustrated) of the control unit 120 via the disk adapter 140, and an XOR operation is performed to create the redundant data.
The created redundant data is then written to the spare disk device HS via the disk adapter 140 to recover redundancy. This is called the “Rebuild function”. In the case of RAID 1, the data which was read is written directly to the spare disk device HS.
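The XOR step of the Rebuild function described above can be illustrated with a minimal sketch. The function names (`xor_blocks`, `rebuild_raid5`, `rebuild_raid1`) and the sample byte values are hypothetical and chosen only for illustration; they assume a single stripe whose surviving blocks are already in memory, and they do not represent the controller's actual implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_raid5(surviving_blocks):
    """RAID 5: the lost block is the XOR of all surviving blocks in the stripe."""
    return xor_blocks(surviving_blocks)


def rebuild_raid1(mirror_block):
    """RAID 1: the data read from the mirror is written to the spare as is."""
    return bytes(mirror_block)


# Hypothetical stripe: three data blocks plus one parity block.
d0 = b"\x11\x22\x33\x44"
d1 = b"\xa0\xb0\xc0\xd0"
d2 = b"\x0f\x1e\x2d\x3c"
parity = xor_blocks([d0, d1, d2])

# Suppose the device holding d0 fails: XOR the three survivors to recover it.
recovered = rebuild_raid5([d1, d2, parity])
```

Because XOR is its own inverse, the same routine recovers either a data block or the parity block, whichever was stored on the failed device.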
FIG. 15, on the other hand, is a diagram depicting the Copy back function. When the failed disk device #0 is replaced with a new disk device New after the Rebuild in FIG. 14 has completed, Copy back processing is performed. In other words, the redundant data written in the spare disk device HS is written back to the new disk device New.
In order to execute this Rebuild/Copy back processing while processing I/O requests from the host, a method for balancing the number of these requests has been proposed (e.g. Japanese Patent Application Laid-Open No. 2004-295860).
As FIG. 16 shows, in the case of Rebuild/Copy back processing, write processing to the disk drive #2 at the write destination is executed after read processing from the original disk drive #1 is completed, even if a normal I/O (host I/O, internal I/O) is absent.
In other words, when the OVSM module, which manages Rebuild/Copy back processing, issues one Copy back request to the RAID control module, which executes RAID access, the RAID control module completes reading from the disk drive #1 and then executes writing to the disk drive #2. The RAID control module then receives the write completion from the disk drive #2 and reports the completion of the requested Copy back to the OVSM module, and the OVSM module executes the next Copy back request processing.
Therefore, to perform Rebuild/Copy back processing while accepting normal I/O processing, the processing for an entire disk device cannot be performed all at once. Instead, the operation of reading data from a normal disk device and writing it to the write destination disk device is executed in parts, a number of times, until the data volume of the disk device is recovered.
In the case of Rebuild, for example, data is read from a normal disk device, redundant data is created unless the configuration is RAID 1 (mirroring), and the data is written to the spare disk device HS or a new disk device New. In the case of Copy back, the data is read from the spare disk device HS and written to the new disk device New.
Conventionally, in these operations the read and the write are treated as integrated and handled as one unit, so the next read and write processing cannot be started until the current read and write processing ends.
Therefore, even if no normal I/O is present, Rebuild/Copy back takes a long time to complete compared with the normal performance of a disk device. Recently the time until completion of Rebuild/Copy back has been growing longer owing to the increase in database capacity and in the storage capacity of disk devices.
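The serialized flow described above, in which each read and write pair must finish before the next request is issued, can be sketched as a simple timing model. The function name `serialized_copy_back`, the chunk size and the per-operation times are assumptions made for illustration only; they are not values taken from the related art.

```python
def serialized_copy_back(total_blocks, chunk_blocks, read_ms, write_ms):
    """Model Copy back in which read and write are handled as one unit.

    Returns (log, elapsed_ms). Each log entry is
    (operation, start_block, block_count, completion_time_ms).
    """
    if chunk_blocks <= 0:
        raise ValueError("chunk_blocks must be positive")
    log = []
    elapsed = 0.0
    offset = 0
    while offset < total_blocks:
        count = min(chunk_blocks, total_blocks - offset)
        # Read from the source drive; the destination drive sits idle.
        elapsed += read_ms
        log.append(("read", offset, count, elapsed))
        # Write to the destination drive; the source drive sits idle.
        elapsed += write_ms
        log.append(("write", offset, count, elapsed))
        # Only after the write completes is the next request issued.
        offset += count
    return log, elapsed


# Hypothetical figures: 10 blocks copied 4 blocks at a time, 5 ms per operation.
log, total_ms = serialized_copy_back(
    total_blocks=10, chunk_blocks=4, read_ms=5.0, write_ms=5.0
)
```

In this model the elapsed time is the sum of every read time and every write time, so each drive is idle for roughly half of the period even when no normal I/O competes for it.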
In some cases the Rebuild/Copy back operation is requested of a CE (Customer Engineer) of the system manufacturer, and the CE must be present at the equipment installation location until the processing completes. The CE's processing wait time therefore increases, and a reduction in this processing time is demanded.