RAID (Redundant Array of Independent Disks) is a standardized technology for the storage of data with emphasis on performance, fault tolerance, and the ability to recover data after a failure of a disk drive. Many RAID products are commercially available. The RAID Advisory Board of St. Peter, Minn. has defined and standardized several different RAID levels. RAID level 1 (“RAID 1”), for example, is a mirrored disk wherein a complete copy of the data on one disk is simultaneously maintained and stored on a second disk. In the event of a failure of one disk, a complete copy of the data remains available on the other disk. The data on the surviving disk may be used to recreate the data on the failed disk when the failed disk is replaced or repaired. RAID level 5 (“RAID 5”) uses several disks to store data. The data is stored in stripes, meaning that for a large block of data, some may be written to the first drive, some to the second drive, and so forth. Several disks may write in parallel, thus increasing the data throughput by up to a multiple equal to the number of available disks. RAID 5 uses parity as a method to store redundancy information. Parity is computed by applying the exclusive OR (XOR) function to the data in each block of the stripe. Other RAID levels exist with different performance and cost tradeoffs.
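By way of illustration only, the parity computation described above may be sketched as follows. The function names xor_blocks and make_stripe and the sample byte values are hypothetical and are not taken from any particular RAID product. The sketch also places the parity unit at the end of the stripe for simplicity, whereas RAID 5 implementations typically distribute parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a sequence of equally sized blocks."""
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_units):
    """Return the data units of one stripe followed by their parity unit.

    Simplification: parity is always stored last in this sketch.
    """
    return list(data_units) + [xor_blocks(data_units)]


# Three hypothetical data stripe units, one per disk.
stripe = make_stripe([b"\x0f\x10", b"\xf0\x01", b"\x33\x22"])
parity = stripe[-1]
```

Because the parity unit is the XOR of the data units, the XOR of every unit in the stripe, parity included, is zero. That property is what allows any single missing unit to be recomputed from the others.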
In a RAID device, a logical drive is made up of multiple stripes, and a stripe is made up of multiple stripe units, wherein each stripe unit is located on a unique physical storage device such as a disk or the like. When a single physical storage device goes defunct and stripe units of data cannot be read from that device, the data may be reconstructed using the stripe units of the remaining physical devices. A stripe is reconstructed by reading all stripe units in the stripe except the failed stripe unit and performing an exclusive OR (XOR) operation on the data read. In the case of a disk rebuild operation, the reconstructed data may be written to a new replacement device designated by the end user. When a logical drive rebuild is performed, the failed stripe unit of each stripe is reconstructed in turn until all stripes within that logical drive have been rebuilt.
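By way of illustration only, the stripe-by-stripe rebuild described above may be sketched as follows. The names xor_units and rebuild_drive, the in-memory representation of a logical drive as a list of stripes, and the sample values are all hypothetical. An actual device would read the surviving stripe units from the physical devices and write each result to the replacement device.

```python
from functools import reduce


def xor_units(units):
    """Return the bytewise XOR of equally sized stripe units."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


def rebuild_drive(stripes, failed):
    """Reconstruct the stripe units of the failed device, one stripe at a time.

    stripes: list of stripes; each stripe is a list of stripe units (bytes)
             indexed by physical device, parity unit included.
    failed:  index of the defunct device; its entries are never read.
    """
    replacement = []
    for stripe in stripes:
        # Read every stripe unit in the stripe except the failed one.
        survivors = [unit for dev, unit in enumerate(stripe) if dev != failed]
        # The XOR of the surviving units yields the missing unit.
        replacement.append(xor_units(survivors))
    return replacement


# Two hypothetical stripes across three devices; the last unit is parity.
logical_drive = [
    [b"\x0f", b"\xf0", b"\xff"],
    [b"\x01", b"\x02", b"\x03"],
]
rebuilt = rebuild_drive(logical_drive, failed=1)
```

In this sketch the stripes are processed strictly one after another with one stripe unit per read, which corresponds to the simplest form of rebuild and not to the larger or concurrent rebuild IOs mentioned elsewhere.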
It is important that a drive group be restored to full redundancy as soon as possible after a drive failure, because a second drive failure may cause the drive group to become dead, with complete loss of data. Conventionally, rebuild time is reduced by using larger IO (input/output) sizes or by managing multiple concurrent rebuild IOs. However, each of these solutions reduces the amount of memory available for other IO processing, and managing multiple concurrent rebuild IOs tends to be very complex.
Thus, it would be desirable to provide a method for reducing rebuild time on a Redundant Array of Independent Disks (RAID) device.