1. Field of the Invention.
This invention relates in general to computer disk drives, and in particular, to a disk drive matrix having enhanced fault tolerance.
2. Description of Related Art.
One technique for achieving media fault tolerance involves a design termed "Redundant Arrays of Inexpensive Disks" (RAID). The RAID design offers an attractive alternative to single large expensive disk drives, promising improvements of an order of magnitude in performance, reliability, power consumption and scalability. Five levels of RAID design, termed RAID-1 through RAID-5, are known in the art and are described in the following publication, which is incorporated by reference herein: D. A. Patterson, G. Gibson, and R. H. Katz, "A Case for Redundant Arrays of Inexpensive Disks (RAID)", ACM SIGMOD Conference, Chicago, Ill., Jun. 1-3, 1988.
The RAID-1 design consists of mirrored disk drives, which are a traditional approach for improving reliability. However, while such a design provides greater read throughput and fault tolerance than a single disk drive, it doubles the cost of data storage.
In the RAID-2 design, a plurality of bits of a logical unit of data are striped simultaneously and in parallel across a plurality of disk drives, so that an individual read or write operation spans the entire plurality of disk drives. Parallel check disks are added to detect and correct errors. In the RAID-2 design, a single parity disk can detect a single error, but additional check disks are required to determine which disk drive failed.
The RAID-3 design is similar to the RAID-2 design, but eliminates the additional check disks of RAID-2 as being redundant, since most disk drive controllers can detect when a disk drive fails, either through special signals provided in the disk drive interface or by using error correcting codes (ECCs) stored with each sector of data. Therefore, data on a failed disk drive can be reconstructed by calculating the parity of the surviving disk drives and then comparing it bit-by-bit to the parity calculated for the original full group.
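The reconstruction described above may be illustrated by the following minimal sketch. It assumes even (XOR) parity computed bytewise over equal-length blocks, one block per disk drive; the function names `xor_blocks` and `reconstruct_failed_drive` and the sample data are hypothetical and are given for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR (even parity) of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_failed_drive(surviving_blocks, stored_parity):
    """Rebuild the block of the one failed drive.

    The parity of the surviving drives is calculated first and is then
    compared bit-by-bit with the parity stored for the full group.
    Each bit position where the two differ is a 1 in the lost block.
    """
    surviving_parity = xor_blocks(surviving_blocks)
    return xor_blocks([surviving_parity, stored_parity])


# Hypothetical group of three data drives and one parity drive.
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data)

# Suppose the second data drive fails; rebuild it from the others.
rebuilt = reconstruct_failed_drive([data[0], data[2]], parity)
```

In this sketch the controller is assumed to have already identified which disk drive failed, which is why no additional check disks are needed to locate the error.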
The RAID-4 design is similar to the RAID-3 design, except that a plurality of sectors of a logical unit of data are striped across a plurality of disk drives. Thus, individual reads may be performed independently, so as to maximize retrieval rates. While the RAID-4 design enhances read operations, write operations are still limited since every write operation must read and write a parity disk.
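The write limitation noted above may be illustrated by the following minimal sketch of a single-sector write. It assumes XOR parity and a read-modify-write update, in which the new parity is the old parity XOR the old data XOR the new data; this update rule, the function names `xor_bytes` and `small_write`, and the in-memory lists standing in for disk drives are assumptions made for illustration.

```python
def xor_bytes(a, b):
    """Return the bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_disks, parity_disk, disk_index, sector, new_data):
    """Write one sector and update the dedicated parity disk.

    Returns the number of disk accesses performed, to show that every
    write involves both a read and a write of the parity disk.
    """
    old_data = data_disks[disk_index][sector]      # access 1: read old data
    old_parity = parity_disk[sector]               # access 2: read old parity
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    data_disks[disk_index][sector] = new_data      # access 3: write new data
    parity_disk[sector] = new_parity               # access 4: write new parity
    return 4


# Hypothetical array: three data disks of one sector each, one parity disk.
data_disks = [[b"\x01\x02"], [b"\x10\x20"], [b"\x0a\x0b"]]
parity_disk = [xor_bytes(xor_bytes(data_disks[0][0], data_disks[1][0]),
                         data_disks[2][0])]

accesses = small_write(data_disks, parity_disk, 1, 0, b"\xff\x00")
```

Because the parity accesses for every write are directed to the same parity disk, concurrent writes to different data disks are serialized at that disk in this sketch.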
The RAID-5 design is similar to the RAID-4 design, except that placement of parity blocks is rotated among the disk drives in the array. This rotation lessens the contention for the parity disk drive during write operations experienced by the RAID-4 design. The RAID-5 design provides the random access read throughput of the RAID-1 design with the reduced redundancy cost of the RAID-3 design. However, in the presence of a failure, the random read throughput of the RAID-5 design drops by 50% because the corresponding data and parity sectors on all surviving disk drives must be read to reconstruct the data sectors on the failed disk drive.
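The rotation of parity blocks described above may be illustrated by the following minimal sketch. The particular rotation shown, in which the parity block moves one disk drive to the left on each successive stripe, is only one possible layout and is an assumption; the function names `parity_disk_for_stripe` and `stripe_layout` are hypothetical.

```python
def parity_disk_for_stripe(stripe, num_disks):
    """Return the index of the disk holding parity for the given stripe."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe, num_disks):
    """Return per-disk labels for one stripe: "P" for parity, "Dn" for data."""
    parity_index = parity_disk_for_stripe(stripe, num_disks)
    layout = []
    block = 0
    for disk in range(num_disks):
        if disk == parity_index:
            layout.append("P")
        else:
            layout.append(f"D{block}")
            block += 1
    return layout


# Print the first four stripes of a hypothetical four-drive array.
for stripe in range(4):
    print(stripe, stripe_layout(stripe, 4))
```

Over any run of stripes equal in number to the disk drives, each drive holds the parity block exactly once in this layout, so parity updates are spread across the array rather than concentrated on one drive.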