1. Field of the Invention
The present invention is generally related to disk array architectures, and, specifically, to disk array architectures that provide disk fault tolerance.
2. Related Art
It is known to store data in an array of disks managed by an array controller to control the storage and retrieval of data from the array. One example of such a system is a Redundant Array of Independent Disks (RAID) comprising a collection of multiple disks organized into a disk array managed by a common array controller. The array controller presents the array to the user as one or more virtual disks. Disk arrays are the framework to which RAID functionality is added in functional levels to produce cost-effective, high-performance disk systems having varying degrees of reliability based on the type of RAID architecture implemented. RAID architecture can be conceptualized in two dimensions as individual disks arranged in adjacent columns. Typically, each disk is partitioned with several identically sized data partitions known as strips, or minor stripes. Distributed across the array of disks in rows, the identically sized partitioned strips form a data stripe across the entire array of disks. Therefore, the array contains stripes of data distributed as rows in the array, wherein each disk is partitioned into strips of identically partitioned data and only one strip of data is associated with each stripe in the array.
As is known, RAID architectures have been standardized into several categories. RAID level 0 is a performance-oriented striped data mapping technique incorporating uniformly sized blocks of storage assigned in a regular sequence to all of the disks in the array. RAID level 1, also called mirroring, provides simplicity and a high level of data availability, but at a relatively high cost due to the redundancy of the disks. RAID level 3 adds redundant information in the form of parity data to a parallel accessed striped array, permitting regeneration and rebuilding of lost data in the event of a single-disk failure. RAID level 4 uses parity concentrated on a single disk to allow error correction in the event of a single-disk failure, but the member disks in a RAID 4 array are independently accessible. In a RAID 5 implementation, parity data is distributed across some or all of the member disks in the array. Thus, the RAID 5 architecture achieves performance by striping data blocks among N disks, and achieves fault tolerance by using 1/N of its storage for parity blocks, each calculated by taking the exclusive-or (XOR) of all data blocks in the parity block's row. A RAID 6 architecture is similar to RAID 5, but RAID 6 can overcome the failure of any two disks by using an additional parity block for each row (for a storage loss of 2/N). The first parity block (P) is calculated with XOR of the data blocks. The second parity block (Q) employs Reed-Solomon codes. One drawback of the known RAID 6 implementation is that it requires a complex and computationally time-consuming array controller to implement the Reed-Solomon codes necessary to recover from a two-disk failure. The complexity of Reed-Solomon codes may preclude the use of such codes in software, and may necessitate the use of expensive special-purpose hardware. Thus, implementation of Reed-Solomon codes in a disk array increases the cost, complexity, and processing time of the array.
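By way of illustration only, the XOR parity calculation described above for a single row, and the regeneration of one lost block from the surviving blocks and the parity block, may be sketched as follows. The helper name xor_blocks and the sample byte values are assumptions introduced for this example and do not appear in any RAID specification.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One row of data blocks, one block per data disk (illustrative values only).
row = [b"\x0f\xa0", b"\x33\x55", b"\xc1\x02"]

# The parity block for the row is the XOR of all data blocks in that row.
parity = xor_blocks(row)

# Simulate the loss of disk 1: XOR the surviving data blocks with the parity
# block to regenerate the missing block.
survivors = [row[0], row[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, the same routine serves for both parity generation and single-block recovery, which is why single-disk fault tolerance is computationally inexpensive relative to Reed-Solomon coding.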
In addition, other schemes have been proposed to implement two-disk fault protection, such as the scheme described in U.S. Pat. No. 6,353,895. While these schemes provide fault tolerance in the case of two simultaneous disk failures, the techniques are not readily scalable to accommodate more than two simultaneous drive failures, such as a three-drive failure. Importantly, as the number of drives in an array increases, the statistical probability of more than two disks failing simultaneously increases and, consequently, fault tolerance for more than two drives is required. However, it is believed that three-drive fault recovery techniques have not been used in disk array architectures or RAID systems.
Thus, it would be desirable to provide a system and method for implementing a three-disk fault recovery architecture that is not subject to the complex and computationally time-consuming array control functions encountered in known disk fault tolerance implementations. In addition, it would also be desirable to provide a method that does not limit the size or configuration of the array. Further, it would be desirable to limit the number of additional disks required to implement three-disk fault tolerance.
Generally, the present invention fulfills the foregoing needs by providing in one aspect thereof, a method for providing three disk fault tolerance in an array of disks indexed and organized into a plurality of indexed stripes, each stripe including strips indexed by both disk and stripe, and each of the strips being located on a single disk. The method includes arranging strips containing data into horizontal and diagonal parity sets, each parity set including at least one data strip as a member and no single data strip being repeated in any one parity set. The method also includes grouping the diagonal parity sets into two groups, Group 1 and Group 2, such that each data strip is a member of a unique diagonal parity set in Group 1 and also a member of another unique diagonal parity set in Group 2. The method further includes calculating a horizontal parity for each horizontal parity set and calculating a diagonal parity for each diagonal parity set. The method also includes storing the calculated horizontal parity of each horizontal parity set in a strip of a horizontal parity disk. The method further includes storing at least some of the calculated diagonal parities of each diagonal parity set in corresponding strips of a diagonal parity disk, and storing the remainder of the calculated diagonal parities in corresponding strips on a diagonal parity stripe so that each diagonal parity is stored in a strip of the diagonal parity stripe with a disk index different from all the members of its contributing diagonal parity set.
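By way of a non-limiting sketch, one possible way of arranging data strips into horizontal parity sets and two groups of diagonal parity sets is shown below. The indexing rules used here (disk index plus stripe index for Group 1, and disk index minus stripe index for Group 2) are assumptions chosen for illustration and are not stated in this summary; the function name build_parity_sets is likewise hypothetical. The sketch shows set membership only and does not model where the calculated parities are stored.

```python
from collections import defaultdict


def build_parity_sets(num_disks, num_stripes):
    """Assign each data strip (disk, stripe) to one horizontal parity set and
    to one diagonal parity set in each of two groups (assumed indexing)."""
    horizontal = defaultdict(list)
    group1 = defaultdict(list)
    group2 = defaultdict(list)
    for stripe in range(num_stripes):
        for disk in range(num_disks):
            strip = (disk, stripe)
            # Horizontal set: all data strips in the same stripe (row).
            horizontal[stripe].append(strip)
            # Group 1 diagonal (assumed rule): constant disk + stripe.
            group1[disk + stripe].append(strip)
            # Group 2 diagonal (assumed rule): constant disk - stripe,
            # offset so that the set index is non-negative.
            group2[disk - stripe + num_stripes - 1].append(strip)
    return dict(horizontal), dict(group1), dict(group2)


# Example: 4 data disks and 3 stripes (illustrative dimensions only).
horizontal, group1, group2 = build_parity_sets(4, 3)
```

Under these assumed rules, each data strip belongs to exactly one diagonal parity set in each group, no strip is repeated within a set, and the members of any one diagonal parity set lie on different disks.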
The present invention further provides, in another aspect thereof, a system for providing disk fault tolerance in an array of independent disks. The system includes an array of disks consecutively indexed and organized into indexed stripes. Each stripe further includes strips indexed by both disk and stripe, and each of the strips in any one of the stripes being located on a single disk. The system further includes an array controller coupled to the disk array and configured to arrange the strips containing data into horizontal and diagonal parity sets, with each set including at least one data strip as a member. The array controller is also configured to group the diagonal parity sets into two groups of diagonal parity sets, Group 1 and Group 2. The array controller is further configured to calculate the corresponding horizontal and diagonal parities for each of the parity sets, and to store each of the calculated parities in a corresponding strip.