RAID (Redundant Array of Inexpensive Disks) storage systems have emerged as an alternative to large, expensive disk drives for use within present and future computer system architectures. A RAID storage system includes an array of small, inexpensive hard disk drives, such as the 5 1/4 or 3 1/2 inch disk drives currently used in personal computers and workstations. Although disk array products have been available for several years, significant improvements in the reliability and performance of small disk drives and a decline in the cost of such drives have resulted in the recent enhanced interest in RAID systems.
Current disk array design alternatives are described in an article titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by David A. Patterson, Garth Gibson and Randy H. Katz; University of California Report No. UCB/CSD 87/391, December 1987. The article, incorporated herein by reference, discusses disk arrays and the improvements in performance, reliability, power consumption and scalability that disk arrays provide in comparison to single large magnetic disks. Five disk array arrangements, referred to as RAID levels, are described. The simplest array, a RAID level 1 system, comprises one or more disks for storing data and an equal number of additional "mirror" disks for storing copies of the information written to the data disks. The remaining RAID levels, identified as RAID level 2, 3, 4 and 5 systems, segment the data into portions for storage across several data disks. One or more additional disks are utilized to store error check or parity information. The present invention is primarily directed to improvements in the operation of RAID level 3, 4 and 5 systems.
A RAID level 3 disk array comprises N+1 disks wherein N disks are used to store data, and the additional disk is utilized to store parity information. During RAID level 3 write functions, each block of data is divided into N portions and striped across the N data disks. The corresponding parity information, calculated by performing a bit-wise exclusive-OR of corresponding portions of the data striped across the N data drives, is written to the dedicated parity disk. Write operations therefore involve all N+1 drives within the array. When data is read, only the N data disks must be accessed. The parity disk is used to reconstruct information in the event of a disk failure.
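The striping and parity calculation described above can be sketched as follows. This is a minimal illustration only, assuming byte-level striping and a block length that is a multiple of N; the function names `stripe_with_parity` and `reconstruct` are hypothetical and are not taken from the Patterson article or from any array controller.

```python
from functools import reduce
from typing import List, Optional, Tuple


def _xor_columns(portions: List[bytes]) -> bytes:
    """Bit-wise exclusive-OR of corresponding bytes across all portions."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*portions))


def stripe_with_parity(block: bytes, n: int) -> Tuple[List[bytes], bytes]:
    """Divide a block into n portions (one per data disk) and compute parity.

    Assumption: byte-level striping, so byte i of the block goes to data disk i % n.
    """
    if len(block) % n != 0:
        raise ValueError("block length must be a multiple of n")
    portions = [block[i::n] for i in range(n)]
    return portions, _xor_columns(portions)


def reconstruct(portions: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing portion (marked None) from the survivors and parity."""
    survivors = [p for p in portions if p is not None]
    return _xor_columns(survivors + [parity])
```

Because exclusive-OR is its own inverse, XOR-ing the surviving N-1 data portions with the parity portion yields the contents of the failed disk.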
A RAID level 3 system including five drives is shown in FIG. 1. The disk drives are labeled DRIVE A through DRIVE E. Data is striped across disks DRIVE A through DRIVE D, each data disk receiving a portion of the data being saved. Data striping may occur at either the byte or word level. Parity information, generated through a bit-wise exclusive-OR of the data stored on drives DRIVE A through DRIVE D, is saved on drive DRIVE E. Also shown is a sixth, spare disk drive, labeled DRIVE F, which is included in the array as a replacement for any of disks DRIVE A through DRIVE D should one fail. An array controller 100 coordinates the transfer of data between the host system 147 and the array disk drives. The controller also calculates and checks parity information. Blocks 145A through 145E illustrate the manner in which data bytes and parity information are stored on the five array drives. Data bytes are identified with hexadecimal numerals 00 through FF. Parity bytes are identified as PARITY 0 through PARITY 3.
A RAID level 4 disk array is also comprised of N+1 disks wherein N disks are used to store data, and the additional disk is utilized to store parity information. However, data to be saved is divided into larger portions, consisting of one or more blocks of data, for storage among the disks. Writes typically require access to two disks, i.e., one of the N data disks and the parity disk. Read operations typically need only access a single one of the N data disks, unless the data to be read exceeds the block length stored on each disk. As with RAID level 3 systems, the parity disk is used to reconstruct information in the event of a disk failure.
RAID level 5 disk arrays are similar to RAID level 4 systems except that parity information, in addition to the data, is distributed across the N+1 disks in each group. Each one of the N+1 disks within the array includes some blocks for storing data and some blocks for storing parity information. Where parity information is stored is controlled by an algorithm implemented by the user. As in RAID level 4 systems, RAID level 5 writes typically require access to two disks; however, no longer does every write to the array require access to the same dedicated parity disk, as in RAID level 4 systems. This feature provides the opportunity to perform concurrent write operations.
A RAID level 5 system including five data and parity disk drives, DRIVE A through DRIVE E, and a spare disk drive, DRIVE F, is illustrated in FIG. 2. Array controller 100 coordinates the transfer of data between the host system 147 and the array disk drives. The controller also calculates and checks parity information. Blocks 145A through 145E illustrate the manner in which data and parity are stored on the five array drives. Data blocks are identified as BLOCK 0 through BLOCK 15. Parity blocks are identified as PARITY 0 through PARITY 3.
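One possible parity placement algorithm for a RAID level 5 array is sketched below. The rotation rule is an assumption chosen for illustration and is not necessarily the placement shown in FIG. 2; the names `parity_drive` and `layout` are hypothetical.

```python
from typing import List


def parity_drive(stripe: int, n_drives: int) -> int:
    """Index of the drive holding parity for a stripe.

    Assumed rule: parity starts on the last drive and moves one drive
    to the left with each successive stripe.
    """
    return (n_drives - 1 - stripe) % n_drives


def layout(n_drives: int, n_stripes: int) -> List[List[str]]:
    """Return one row per stripe, listing what each drive stores in that stripe."""
    rows = []
    block = 0
    for stripe in range(n_stripes):
        p = parity_drive(stripe, n_drives)
        row = []
        for drive in range(n_drives):
            if drive == p:
                row.append(f"PARITY {stripe}")
            else:
                row.append(f"BLOCK {block}")
                block += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Five drives (DRIVE A through DRIVE E), four stripes.
    for row in layout(5, 4):
        print(" | ".join(f"{cell:9}" for cell in row))
```

Under this assumed rule, five drives and four stripes hold BLOCK 0 through BLOCK 15 and PARITY 0 through PARITY 3. A write to BLOCK 0 touches the first and fifth drives, while a write to BLOCK 6 touches the third and fourth drives, so the two writes could proceed concurrently.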
Although RAID storage systems present many advantages in comparison with single disk storage systems, such as increased storage capacity, higher data transfer rates and reduced reliability overhead costs, RAID systems are not without possible performance detriments. In particular, input/output rates may be reduced and the speed of write operations may decrease significantly, especially for RAID level 4 or 5 systems.
For each of the RAID designs discussed in Patterson, an array write encompasses two or more individual disk write operations. Each individual write operation involves a seek and rotation to the appropriate disk track and sector to be accessed. The seek time for the array write as a whole is therefore the maximum of the seek times of the individual disks.
A typical RAID level 4 or 5 device employs a read-modify-write (RMW) process for writing new data and parity information to the array drives. A read-modify-write process includes the steps of (a) reading old data and old parity information from the disk drives containing the old data and parity information, (b) generating new parity information from the old data, new data received by the disk array, and the old parity information, and (c) writing the new data and new parity information to the disk drives. Thus, an array write operation will require a minimum of two disk reads and two disk writes. More than two disk reads and writes are required for data write operations involving more than one data block. A RAID level 4 or 5 system thus carries a significant write penalty when compared with a single disk storage device or with RAID level 1, 2 or 3 systems.
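Steps (a) through (c) can be sketched as follows for a single-block write. This is a simplified model under stated assumptions: the `Disk` class is a hypothetical in-memory stand-in that only counts accesses, and new parity is generated as old parity XOR old data XOR new data, which follows from the exclusive-OR parity definition.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bit-wise exclusive-OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class Disk:
    """Hypothetical in-memory disk that counts read and write accesses."""

    def __init__(self, n_blocks: int, block_size: int) -> None:
        self.blocks = [bytes(block_size) for _ in range(n_blocks)]
        self.reads = 0
        self.writes = 0

    def read(self, index: int) -> bytes:
        self.reads += 1
        return self.blocks[index]

    def write(self, index: int, data: bytes) -> None:
        self.writes += 1
        self.blocks[index] = data


def read_modify_write(data_disk: Disk, parity_disk: Disk,
                      index: int, new_data: bytes) -> None:
    # (a) read old data and old parity: two disk reads
    old_data = data_disk.read(index)
    old_parity = parity_disk.read(index)
    # (b) remove the old data's contribution from parity, then add the new data's
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    # (c) write new data and new parity: two disk writes
    data_disk.write(index, new_data)
    parity_disk.write(index, new_parity)
```

In this model each single-block update costs two reads and two writes, compared with the one write needed on a single disk, which is the write penalty noted above.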