RAID storage systems have emerged as an alternative to large, expensive disk drives for use within present and future computer system architectures. A RAID storage system includes an array of small, inexpensive hard disk drives, such as the 5 1/4 or 3 1/2 inch disk drives currently used in personal computers and workstations. Although disk array products have been available for several years, significant improvements in the reliability and performance of small disk drives and a decline in the cost of such drives have resulted in the recent enhanced interest in RAID systems.
Numerous disk array design alternatives are possible, incorporating a few to many disk drives. Several array alternatives, each possessing different attributes, benefits and shortcomings, are presented in an article titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by David A. Patterson, Garth Gibson and Randy H. Katz; University of California Report No. UCB/CSD 87/391, December 1987. The article, incorporated herein by reference, discusses disk arrays and the improvements in performance, reliability, power consumption and scalability that disk arrays provide in comparison to single large magnetic disks.
RAID level 1, discussed in the article, comprises N disks for storing data and N additional "mirror" disks for storing copies of the information written to the data disks. RAID level 1 write functions require that data be written to two disks, the second "mirror" disk receiving the same information provided to the first disk. When data is read, it can be read from either disk. A RAID level 1 system including four drives is depicted in FIG. 1. The drives are labeled DATA 1, MIRROR 1, DATA 2 and MIRROR 2. The blocks shown below the disk drives illustrate the manner in which data is stored on the disks.
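The mirroring behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not an implementation taken from the article: the class name MirroredPair, the dictionary-based "disks" and the prefer parameter are all assumptions introduced for the example.

```python
class MirroredPair:
    """Hypothetical model of one DATA/MIRROR pair (names are illustrative)."""

    def __init__(self):
        # Index 0 stands for the data disk, index 1 for its mirror.
        # Each disk is modeled as a dict mapping block number -> bytes.
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block_no, payload):
        # A RAID level 1 write stores the same information on both disks.
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[block_no] = payload

    def read(self, block_no, prefer=0):
        # A read may be served by either disk; fall back to the other
        # disk if the preferred one has failed.
        for i in (prefer, 1 - prefer):
            if not self.failed[i]:
                return self.disks[i][block_no]
        raise IOError("both disks in the pair have failed")
```

Because every block exists on both disks, marking one disk as failed leaves reads unaffected, which is consistent with the absence of a recovery penalty noted for this RAID level.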
RAID level 1 provides a high level of redundancy, high transaction performance, a minor write penalty and no recovery penalty. Although data availability is very high on RAID level 1 systems, the added expense and loss of available data storage capacity which result from supporting duplicate drives can be reduced with RAID level 3, 4 and 5 systems.
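The capacity trade-off can be made concrete with a short calculation. The sketch below assumes equal-sized drives and counts only the drives described in this section (N data disks plus N mirrors for level 1, N data disks plus one parity disk per group for levels 3, 4 and 5); the function name usable_fraction is hypothetical.

```python
from fractions import Fraction

def usable_fraction(level, n):
    """Fraction of raw capacity available for data, assuming equal-sized drives.

    level: RAID level (1, 3, 4 or 5); n: number of data disks N.
    """
    if level == 1:
        # N data disks plus N mirror disks.
        return Fraction(n, 2 * n)
    if level in (3, 4, 5):
        # N disks' worth of data plus one disk's worth of parity per group.
        return Fraction(n, n + 1)
    raise ValueError("level not covered by this sketch")
```

For example, with N = 4 the mirrored array exposes one half of its raw capacity, whereas a parity-based group of five drives exposes four fifths.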
RAID level 3 systems comprise one or more groups of N+1 disks. Within each group, N disks are used to store data, and the additional disk is utilized to store parity information. During RAID level 3 write functions, each block of data is divided into N portions for storage among the N data disks. The corresponding parity information is written to a dedicated parity disk. When data is read, all N data disks must be accessed. The parity disk is used to reconstruct information in the event of a disk failure. A RAID level 3 system including five drives is shown in FIG. 2. The disk drives are labeled DATA 1 through DATA 5. Data is striped across disks DATA 1 through DATA 4, each data disk receiving a portion of the data being saved. Parity information, generated through a bit-wise exclusive-OR of the data stored on drives DATA 1 through DATA 4, is saved on drive DATA 5.
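The parity generation and reconstruction described above can be illustrated as follows. This is a minimal sketch under stated assumptions: the block is split into N contiguous, equal-length portions (an actual RAID level 3 array interleaves at the byte or word level), and the function names xor_bytes, stripe_with_parity and reconstruct are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bit-wise exclusive-OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def stripe_with_parity(block, n):
    """Divide a block into n portions and compute the parity portion."""
    if len(block) % n != 0:
        raise ValueError("block length must be a multiple of n in this sketch")
    size = len(block) // n
    portions = [block[i * size:(i + 1) * size] for i in range(n)]
    # Parity is the exclusive-OR of all n data portions.
    parity = reduce(xor_bytes, portions)
    return portions, parity

def reconstruct(portions, parity, lost):
    """Rebuild the portion at index `lost` from the survivors and the parity."""
    survivors = [p for i, p in enumerate(portions) if i != lost]
    # XOR-ing the parity with every surviving portion cancels their
    # contributions and leaves the missing portion.
    return reduce(xor_bytes, survivors, parity)
```

Reconstruction works because the exclusive-OR of any value with itself is zero, so combining the parity with the N-1 surviving portions yields exactly the lost portion.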
RAID level 3 provides data striping at either the byte or word level, very high data transfer rates and no penalty for write or data recovery operations. RAID level 3 systems provide the best overall performance when used for large file transfers such as decision support imaging, modeling and simulation, intensive graphics and image processing, scientific computing and CAD/CAM applications.
A RAID level 4 disk array also comprises N+1 disks, wherein N disks are used to store data and the additional disk is utilized to store parity information. However, data to be saved is divided into larger portions, consisting of one or more blocks of data, for storage among the disks. Writes typically require access to two disks, i.e., one of the N data disks and the parity disk. Read operations typically need only access a single one of the N data disks, unless the data to be read exceeds the block length stored on each disk. As with RAID level 3 systems, the parity disk is used to reconstruct information in the event of a disk failure. A RAID level 4 system including five drives is shown in FIG. 3. The disk drives are labeled DATA 1 through DATA 5. Data blocks are written across disks DATA 1 through DATA 4. Parity information, generated through a bit-wise exclusive-OR of the data stored on drives DATA 1 through DATA 4, is saved on drive DATA 5.
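One way a write can be confined to two disks is the commonly used read-modify-write parity update, in which the new parity is the old parity exclusive-ORed with the old data and the new data. The text above does not spell out this procedure, so the sketch below should be read as an assumption about how such a write might be carried out; the names small_write and full_parity and the list-based disk model are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """Bit-wise exclusive-OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write(data_disks, parity_disk, disk_no, block_no, new_data):
    """Update one data block and its parity, touching only two disks."""
    old_data = data_disks[disk_no][block_no]      # read from one data disk
    old_parity = parity_disk[block_no]            # read from the parity disk
    # Remove the old data's contribution to the parity, then add the new one.
    parity_disk[block_no] = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    data_disks[disk_no][block_no] = new_data

def full_parity(data_disks, block_no):
    """Parity recomputed from every data disk, used here only as a check."""
    return reduce(xor_bytes, (disk[block_no] for disk in data_disks))
```

After a call to small_write, the stored parity equals the value that full_parity would compute from all N data disks, even though the other N-1 data disks were never accessed.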
RAID level 5 disk arrays are similar to RAID level 4 systems except that parity information, in addition to the data, is distributed across the N+1 disks in each group. A RAID level 5 system is illustrated in FIG. 4. Each one of the N+1 disks within the array includes some blocks for storing data and some blocks for storing parity information. Where parity information is stored is controlled by an algorithm implemented by the user. As in RAID level 4 systems, RAID level 5 writes typically require access to two disks; however, no longer does every write to the array require access to the same dedicated parity disk, as in RAID level 4 systems. This feature provides the opportunity to perform concurrent write operations.
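Since the parity placement algorithm is left to the user, the following sketch shows only one possible choice: a simple rotation in which the parity block moves back by one disk on each successive stripe. The rotation rule, the zero-based disk numbering and the function names parity_disk and locate are all assumptions made for illustration, and FIG. 4 may depict a different layout.

```python
def parity_disk(stripe, n_disks):
    """Disk index holding parity for a stripe (hypothetical rotation rule)."""
    return (n_disks - 1 - stripe) % n_disks

def locate(logical_block, n_disks):
    """Map a logical data block to (stripe, data disk, parity disk)."""
    data_per_stripe = n_disks - 1          # N data blocks per stripe
    stripe, offset = divmod(logical_block, data_per_stripe)
    p = parity_disk(stripe, n_disks)
    # Data blocks fill the non-parity disks of the stripe in ascending order,
    # skipping over the disk that holds the parity block.
    d = offset if offset < p else offset + 1
    return stripe, d, p
```

With five disks under this rule, logical block 0 maps to data disk 0 with parity on disk 4, while logical block 5 maps to data disk 1 with parity on disk 3. The two writes involve disjoint pairs of disks and could therefore proceed concurrently.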
RAID level 5 provides data striping by system block size, parity distribution across all drives and improved transaction performance, but carries a significant write penalty. RAID level 5 systems are best utilized for super-computer or transaction processing applications requiring high I/O rates and small block sizes. RAID level 5 systems are ideal for the on-line processing needs of airline and automobile reservation centers, automatic teller and point-of-sale operations, and data base applications.
An additional disk array arrangement, referred to herein as RAID level 0, is depicted in FIG. 5. The array includes N data disks for storing data. Data is striped across the N data disks. The array controller accesses each drive independently, allowing up to N concurrent read or write operations at N different physical locations. This provides transparent load balancing and thus a performance improvement over a single disk drive. There is no parity generation or storage provided with RAID level 0, so there are no data recovery or reconstruction features as are provided with RAID levels 1, 3 and 5.
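Block-level striping of this kind can be expressed as a simple address mapping. The sketch below assumes a stripe unit of one block and round-robin placement across the disks; the function name locate_block and the zero-based numbering are hypothetical.

```python
def locate_block(logical_block, n_disks):
    """Map a logical block to (disk index, block index on that disk)."""
    # Consecutive logical blocks are placed on consecutive disks, wrapping
    # around to the first disk after the last one.
    return logical_block % n_disks, logical_block // n_disks
```

Any N consecutive logical blocks fall on N different disks under this mapping, which is what allows up to N operations to proceed at the same time. No parity is computed, so the mapping offers no means of recovering a block after a drive failure.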
RAID level 0 provides data striping by system block size, high capacity, high transaction performance and no write penalty, but does not provide data recovery or extended data availability. This RAID level is best utilized for applications which require additional performance, but not the data availability provided by the other RAID levels.
In order to coordinate the operation of the multitude of disk drives within an array to perform read and write functions, parity generation and checking, and data restoration and reconstruction, complex storage management techniques are required. Array operation can be managed through software routines executed by the host computer system, i.e., a software array architecture, or by a dedicated hardware controller constructed to control array operations. Although each of the array configurations illustrated in FIGS. 1 through 4 includes a hardware controller, the host system could function as the array controller.
A hardware array controller improves storage reliability and availability, improves system performance and provides storage capacities larger than any single device. The hardware array controller provides this functionality utilizing minimal host processor time and without modifying user applications. A software array architecture can deliver this functionality at a lower cost of implementation, and offer more storage management and configuration flexibility than a typical hardware array controller. The increasing speed and performance of host computer systems provide software array architecture performance that is competitive with many hardware array controller products.
The hardware and software array alternatives discussed above provide improvements in performance, reliability, power consumption, scalability and capacity in comparison to single large magnetic disks. However, coincident with these improvements there exists a need to store and manage ever increasing amounts of data. Desired array management operations include the ability to reconfigure the array to change RAID levels or the size of data storage blocks, swap disk drives for preventive maintenance, perform manual sparing when a drive fails, increase the logical storage capacity, move storage to eliminate hot spots, add disks to increase throughput and tune cluster sizes to optimize load balancing. Furthermore, it is desired that the array data remain available to applications during array management operations. Thus, array management operations must be executed on-line, concurrent with normal disk array operations and transparent to most users of the system.