RAID (Redundant Array of Inexpensive Disks) storage systems have emerged as an alternative to large, expensive disk drives for use within present and future computer system architectures. A RAID storage system includes an array of small, inexpensive hard disk drives, such as the 5 1/4 or 3 1/2 inch disk drives currently used in personal computers and workstations. Although disk array products have been available for several years, significant improvements in the reliability and performance of small disk drives and a decline in the cost of such drives have resulted in the recent enhanced interest in RAID systems.
Current disk array design alternatives are described in an article titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by David A. Patterson, Garth Gibson and Randy H. Katz; University of California Report No. UCB/CSD 87/391, December 1987. The article, incorporated herein by reference, discusses disk arrays and the improvements in performance, reliability, power consumption and scalability that disk arrays provide in comparison to single large magnetic disks. Five disk array arrangements, referred to as RAID levels, are described. The simplest array, a RAID level 1 system, comprises one or more disks for storing data and an equal number of additional "mirror" disks for storing copies of the information written to the data disks. The remaining RAID levels, identified as RAID level 2, 3, 4 and 5 systems, segment the data into portions for storage across several data disks. One or more additional disks are utilized to store error check or parity information. The present invention is directed primarily to improvements in the operation of RAID level 1, 4 and 5 systems. RAID level 3 systems may also benefit from application of the present invention.
A RAID level 4 disk array comprises N+1 disks, wherein N disks are used to store data and the additional disk is utilized to store parity information. Data to be saved is divided into portions consisting of one or more blocks of data for storage among the disks. The corresponding parity information, which can be calculated by performing a bit-wise exclusive-OR of corresponding portions of the data stored across the N data drives, is written to the dedicated parity disk. The parity disk is used to reconstruct information in the event of a disk failure. Writes typically require access to two disks, i.e., one of the N data disks and the parity disk, as will be discussed in greater detail below. Read operations typically need to access only one of the N data disks, unless the data to be read exceeds the block length stored on each disk.
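The parity calculation and reconstruction described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of any particular array: the helper name `xor_blocks`, the two-byte block size and the sample data are assumptions made for the example. The last step shows the standard small-write parity update (old parity XOR old data XOR new data), which is one reason a write typically touches only the affected data disk and the parity disk.

```python
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """Return the bit-wise exclusive-OR of equal-length byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("blocks must have equal length")
    # XOR the bytes found at the same position in every block.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Hypothetical contents of corresponding blocks on N = 4 data disks.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa", b"\xff\x00"]

# Parity block as it would be written to the dedicated parity disk.
parity = xor_blocks(*data)

# Reconstruction: if data disk 2 fails, XOR the parity with the survivors.
lost = 2
survivors = [block for i, block in enumerate(data) if i != lost]
rebuilt = xor_blocks(parity, *survivors)

# Small write to data disk 1: only the old data, new data and old parity
# are needed to compute the new parity.
new_block = b"\x12\x34"
new_parity = xor_blocks(parity, data[1], new_block)
updated = [data[0], new_block, data[2], data[3]]
```

Here `rebuilt` equals the block that was lost, and `new_parity` equals the parity recomputed from scratch over `updated`.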
RAID level 5 disk arrays are similar to RAID level 4 systems except that parity information, in addition to the data, is distributed across the N+1 disks in each group. Each one of the N+1 disks within the array includes some blocks for storing data and some blocks for storing parity information. The location of the parity information is controlled by an algorithm implemented by the user. As in RAID level 4 systems, RAID level 5 writes typically require access to two disks; however, a write to the array no longer always requires access to the same dedicated parity disk, as it does in RAID level 4 systems. This feature provides the opportunity to perform concurrent write operations.
A RAID level 5 system including five disk drives, DRIVE 1 through DRIVE 5, is illustrated in FIG. 1. Five data groups, identified as Data Group 0 through Data Group 4, are shown, each data group providing for the storage of four data blocks and one parity block. Data blocks, numbered 0 through 19, are identified by reference numeral 103. Parity blocks are identified by reference numeral 107.
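The distribution of parity in a five-drive RAID level 5 array can be sketched in Python as follows. The rotation rule in `parity_drive` is an assumption chosen for illustration; the placement algorithm is left to the implementer, and the arrangement shown in FIG. 1 may differ. The function names, the 0-based drive indices (index 0 stands for DRIVE 1) and the assignment of four consecutive data blocks to each data group are likewise hypothetical.

```python
from typing import Set, Tuple

NUM_DRIVES = 5  # N + 1 drives, as in the five-drive example

def parity_drive(group: int, num_drives: int = NUM_DRIVES) -> int:
    # Assumed rotation: parity starts on the last drive and moves
    # one drive to the left with each successive data group.
    return (num_drives - 1 - group) % num_drives

def locate(block: int, num_drives: int = NUM_DRIVES) -> Tuple[int, int, int]:
    """Map a logical data block to (group, data drive, parity drive)."""
    group, offset = divmod(block, num_drives - 1)
    p = parity_drive(group, num_drives)
    # Data blocks fill the non-parity drives of the group in ascending order.
    data = offset if offset < p else offset + 1
    return group, data, p

def drives_touched(block: int) -> Set[int]:
    """Drives accessed by a single-block write: one data drive plus parity."""
    _, data, p = locate(block)
    return {data, p}

# Writes to blocks 0 and 5 fall in different groups and, under the assumed
# rotation, use disjoint drive pairs, so they could proceed concurrently.
first = drives_touched(0)
second = drives_touched(5)
can_overlap = first.isdisjoint(second)
```

Under this assumed layout, `can_overlap` is true for blocks 0 and 5. Whether two given writes can actually run concurrently depends on the rotation in use and on which blocks are written.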
Although the discussion presented above, including the references incorporated by reference, refers to arrays of disk drives, arrays may also be constructed of storage units other than disk drives, such as tape drives. A RAID level 4 system including five tape drives, DRIVE 1 through DRIVE 5, is illustrated in FIG. 2. As in FIG. 1, five data groups, identified as Data Group 0 through Data Group 4, are shown, each data group providing for the storage of four data blocks and one parity block. Data blocks, numbered 0 through 19, are identified by reference numeral 103. Parity blocks are identified by reference numeral 107. In contrast to the system shown in FIG. 1, parity blocks 107 all reside on tape drive DRIVE 5.
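The RAID level 4 arrangement, with every parity block on DRIVE 5, can be sketched in Python as follows. This is an illustration only: the function names, the 0-based drive indices (index 4 stands for DRIVE 5) and the assumption that data blocks are assigned in order across DRIVE 1 through DRIVE 4 within each data group are not taken from FIG. 2.

```python
from typing import Set, Tuple

DATA_DRIVES = 4   # N data drives (DRIVE 1 through DRIVE 4)
PARITY_DRIVE = 4  # 0-based index of the dedicated parity drive (DRIVE 5)

def locate_raid4(block: int) -> Tuple[int, int, int]:
    """Map a logical data block to (group, data drive, parity drive)."""
    # Assumed ordering: blocks fill the data drives left to right, group by group.
    group, data = divmod(block, DATA_DRIVES)
    return group, data, PARITY_DRIVE

def write_touches(block: int) -> Set[int]:
    """Drives accessed by a single-block write: one data drive plus parity."""
    _, data, parity = locate_raid4(block)
    return {data, parity}

# Writes to blocks in different data groups still share the parity drive,
# so they contend for it.
shared = write_touches(0) & write_touches(5)
```

Because `shared` always contains the parity drive index, every pair of single-block writes in this sketch meets at DRIVE 5 regardless of which data groups are involved.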
In order to coordinate the operation of the multitude of disk or tape drives within an array to perform read and write functions, parity generation and checking, and data restoration and reconstruction, complex storage management techniques are required. Array operation can be managed through software routines executed by the host computer system, i.e., a software array architecture, or by a dedicated hardware controller constructed to control array operations.
A hardware array improves storage reliability and availability, improves system performance without modifying user applications and provides storage capacities larger than any single device. A software array architecture can deliver this functionality at a lower cost of implementation, and offer more storage management and configuration flexibility.
A software architecture allows existing storage to be used in the most efficient manner. All RAID levels can be configured at the same time, as well as multiple instances of any one RAID level. RAID levels can be configured on a subdisk (partition) basis, which allows the configuration of multiple RAID levels on a single physical disk. Some RAID levels can also be nested to provide the highest level of performance and reliability possible (e.g., striped mirror). The increasing speed and power of host computer systems provide performance that is competitive with many hardware array products.
Additional improvements to disk array systems are desired to better utilize the speed and power of current and next generation computer systems, particularly multiple processor computer systems.