1. Field of the Invention
The invention relates to error correction coding and in particular to a method and system of block coding particularly useful in a RAID 6 disk array storage system where the method and system are optimal as measured in terms of overhead storage requirements and are computationally simple.
2. Description of Related Art
As users have demanded increased reliability and capacity for computer storage subsystems, disk array storage systems have evolved as a solution to both needs. Disk array storage subsystems use multiple disk drives and distribute the data to be stored over the multiple disk drives. Distributing the data over multiple drives is a process commonly referred to as "striping." Striping the data over the disk drives enhances performance as compared to a single disk drive in that smaller amounts of data are written to or read from multiple disk drives in parallel. The total time to complete a particular read or write operation is therefore reduced because multiple disk drives perform the operation in parallel.
However, multiplying the number of disk drives used to store data increases the probability of a failure causing loss of data. The mean time between failures of multiple disk drives storing data is less than the mean time between failures of a single disk drive storing the same data. Storage arrays therefore provide additional (overhead) storage for redundancy information used to recover data lost due to failure of other disk drives. Such loss of data is often referred to as "erasure" of the data. The redundancy information is used in general for two purposes. First, the redundancy information is used to restore data to a failed disk drive after it is repaired or replaced. Second, the redundancy information is used to allow continued operation of the storage array system while the failed disk drive is undergoing repair or replacement. In other words, lost data on the failed disk drive may be regenerated in real time by the storage array system using the redundancy information.
Typically, a storage controller device is used to perform requisite management of the array storage and redundancy generation and checking. The storage array is made to appear to the attached computer systems to be a single, large capacity, highly reliable disk drive. The data to be stored on the array is "mapped" in that the storage capacity of the array may be addressed as a large linear vector of blocks addressed by a logical block address. The storage controller receives requests to manipulate data from an attached computing system which identifies affected blocks using the logical block addresses. The storage controller maps or translates such information into lower level I/O operations which direct the read and write operations to appropriate physical locations on appropriate disk drives of the array. In addition, the storage controller performs all requisite generation and checking of the redundancy information associated with the affected data. For example, where the computer system writes new or updated data to the storage array, the storage controller assures that affected redundancy information is also updated. Or for example, where the computer system requests a read of data and a disk drive in the array has failed, the requested data is retrieved from the remaining operational disk drives and any missing data lost due to failure of a disk drive is regenerated using the redundancy information. The storage controller therefore performs storage management so as to make the data distribution over the disk array and the redundancy information generation and use transparent to attached computer systems.
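The translation from a logical block address to a physical location can be illustrated with a minimal sketch. The layout assumed here (round-robin striping over the data disks, a fixed number of blocks per strip, no parity placement) and the names map_logical_block, num_data_disks and blocks_per_strip are hypothetical and chosen only for illustration; an actual storage controller may use a different mapping.

```python
def map_logical_block(lba, num_data_disks, blocks_per_strip=1):
    """Map a logical block address to (disk index, block offset on that disk).

    Assumes simple round-robin striping with no parity rotation; this is an
    illustrative layout, not the mapping of any particular controller.
    """
    if lba < 0 or num_data_disks <= 0 or blocks_per_strip <= 0:
        raise ValueError("arguments must be non-negative / positive")
    # Which strip the block falls in, and where inside that strip.
    strip_index, offset_in_strip = divmod(lba, blocks_per_strip)
    # Strips are dealt out to the disks in turn; a full pass is one stripe.
    stripe_index, disk_index = divmod(strip_index, num_data_disks)
    physical_block = stripe_index * blocks_per_strip + offset_in_strip
    return disk_index, physical_block
```

With four data disks and one block per strip, logical block 5 lands on disk 1 at physical block 1, so consecutive logical blocks are spread over different drives and can be accessed in parallel.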
RAID management is a storage management technique commonly used in present storage subsystems. RAID is an acronym for Redundant Array of Inexpensive Disks. The 1987 publication by David A. Patterson, et al., from University of California at Berkeley entitled A Case for Redundant Arrays of Inexpensive Disks (RAID), reviews the fundamental concepts of RAID technology. There are several "levels" of standard geometries defined in the Patterson publication. The simplest array which provides for redundancy information, a RAID level 1 system, comprises one or more disks for storing data and an equal number of additional "mirror" disks for storing copies of the information written to the data disks. Other RAID levels, identified as RAID levels 2-4, segment the data into portions for storage across several data disks and use Exclusive-OR (XOR) parity as redundancy information to enhance the reliability of the array. An additional disk is used to store the XOR parity information. RAID level 5 also distributes the parity information over the entire array.
RAID level 5, for example, imposes a penalty on write operations in that an update to the data in the storage array requires an additional operation to update the associated parity block. RAID storage controllers provide a number of techniques to lessen the impact of this so-called write penalty.
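The cost of a small write can be seen in a short sketch of a read-modify-write parity update. It relies on the XOR identity that the new parity equals the old parity XOR the old data XOR the new data. The function names xor_blocks and update_parity are hypothetical, and blocks are modeled as Python bytes objects purely for illustration.

```python
def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity, old_data, new_data):
    """Return the parity block after one data block changes.

    The controller must read the old data and the old parity before it can
    write the new data and the new parity: the extra I/O is the write penalty.
    """
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)
```

In this model a single-block update costs two reads and two writes rather than one write. A scheme with two redundancy blocks per stripe must update both, which is consistent with the larger penalty described for RAID level 6.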
XOR parity as provided in RAID 2-5 guards against the loss of data from failure of a single disk drive in the array. When a single disk drive fails, the data lost on that disk drive is reconstructed by performing an XOR of the related blocks of the data in a corresponding stripe on the remaining operational disk drives. As noted above, the lost data may be reconstructed in real time to continue operation of the array despite the loss of a single disk and may be reconstructed at the time of replacement of the failed disk drive with a replacement or repaired disk.
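A minimal sketch of single-erasure recovery follows. The names xor_parity and reconstruct_lost_block are hypothetical, and each block is modeled as a bytes object of equal length; the sketch shows only the XOR relationship and not any particular controller's implementation.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR a list of equal-length blocks together, byte by byte."""
    if len({len(b) for b in blocks}) > 1:
        raise ValueError("blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct_lost_block(surviving_blocks):
    """Regenerate the one missing block of a stripe.

    surviving_blocks holds every remaining block of the stripe, including
    the parity block. XORing them cancels the known data and leaves the
    lost block.
    """
    return xor_parity(surviving_blocks)
```

For a stripe of three data blocks plus parity, passing the two surviving data blocks and the parity block to reconstruct_lost_block returns the missing data block. The same routine cannot recover two lost blocks, which is why a second, independent redundancy scheme is needed for double failures.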
RAID level 6 is a further development wherein a second drive having redundancy information is used to guard against failure of two disk drives in the storage array. The second redundancy scheme is independent of the first to assure recovery from a two-disk failure. The write penalty noted above is similar in RAID level 6 but greater in magnitude (twice the effect). As with RAID level 5 noted above, most RAID level 6 systems utilize techniques within the storage controller to mitigate the effects of the write penalty. Such mitigating techniques, however, tend merely to defer the timing of those effects. The write penalty therefore remains an important factor in the overall performance of an array storage system.
A number of techniques have been applied to implement RAID level 6 by providing two independent redundancy information schemes. Each presently known approach has certain strengths and offsetting weaknesses.
U.S. Pat. No. 5,579,475 to Blaum et al. discloses a method for encoding and rebuilding the data contents of up to two unavailable disks in a disk array (see also Blaum et al., "EVENODD: An Optimal Scheme for Tolerating Double Disk Failures in RAID Architectures", IEEE Pub. No 1053-6997/94). This method, referred to as "EVENODD" coding, utilizes only XOR operations, and requires fewer and simpler computations than previous schemes. The EVENODD method (as explained below), however, is not optimally efficient for applications involving a high frequency of small write operations. In such applications, the update of a small portion of data often requires that more than two XOR parity blocks be updated. In other words, the EVENODD method may impose a significant performance penalty (write penalty) on applications which frequently perform small write operations.
Another previously known method for implementing RAID level 6 is based on Reed-Solomon coding (often referred to as Reed-Solomon PQ) and requires finite field computations. Such finite field computations are substantially more complex than simple XOR parity computations. In view of the computational complexity, most RAID 6 systems using Reed-Solomon PQ encoding have special custom electronic circuits (ASICs) to assist in the requisite computations for redundancy. Use of such special ASIC devices adds complexity to the storage controller and also precludes use of existing controller devices which are devoid of such Reed-Solomon computational assist circuits. In other words, existing RAID storage subsystems (e.g., RAID level 5 systems) would not be capable of implementing RAID level 6 due to the lack of Reed-Solomon computational assistance.
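To give a sense of the added complexity, the following sketch computes a P value (plain XOR) and a Q value (a weighted sum in a finite field) for one byte position of a stripe. The field GF(2^8), the reduction polynomial 0x11D, the generator 2, and the names gf_mul and pq_syndromes are assumptions made for illustration; they reflect one common choice and are not taken from any specific product described above.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two elements of GF(2^8) (assumed reduction polynomial 0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly        # reduce back into 8 bits
        b >>= 1
    return result


def pq_syndromes(data_bytes):
    """Return (P, Q) for one byte taken from each data disk of a stripe.

    P is the XOR of the bytes; Q weights the byte from disk i by 2**i,
    computed in the field. The generator 2 is an illustrative assumption.
    """
    p, q, coeff = 0, 0, 1
    for d in data_bytes:
        p ^= d
        q ^= gf_mul(coeff, d)
        coeff = gf_mul(coeff, 2)
    return p, q
```

Each Q byte needs a shift-and-reduce loop (or table lookups) per data byte, whereas P needs a single XOR. That difference is the computational burden that motivates dedicated assist circuits.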
Still another technique referred to as 2-D Parity (as in two dimensional parity) applies simple XOR parity computations to both the rows of an array and to the columns of the array. The disk array in 2-D Parity encoding techniques is a two dimensional array of disk drives consisting of R×C data disks and R+C redundancy disks. The R×C data disks are arranged in a rectangular matrix of R rows and C columns. For each of the R rows of data disks there is a corresponding redundancy disk storing redundancy information (e.g., XOR parity information). Likewise, for each of the C columns, there is a corresponding redundancy disk storing redundancy information (e.g., XOR parity information). This technique provides for computational simplicity in that it uses only XOR computations. However, the 2-D Parity technique does not achieve optimal results in terms of storage capacity required for storage of redundancy information. Rather, the storage capacity (overhead storage capacity) required for 2-D Parity is R+C drives which exceeds that of other techniques. Further, the 2-D Parity encoding technique provides less protection for partial disk failures as compared to other techniques.
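The row and column parity computation can be sketched as follows. For brevity each data disk is modeled as a single integer, and the names two_d_parity and redundancy_disk_count are hypothetical; the sketch illustrates only the arrangement described above.

```python
from functools import reduce
from operator import xor


def two_d_parity(grid):
    """Return (row_parity, column_parity) for an R x C grid of data values.

    grid is a list of R rows, each a list of C integers standing in for the
    contents of one data disk.
    """
    if not grid or len({len(row) for row in grid}) != 1:
        raise ValueError("grid must be a non-empty rectangle")
    row_parity = [reduce(xor, row) for row in grid]
    column_parity = [reduce(xor, column) for column in zip(*grid)]
    return row_parity, column_parity


def redundancy_disk_count(rows, columns):
    """Number of redundancy disks 2-D Parity needs: one per row and column."""
    return rows + columns
```

For a 4×4 arrangement of 16 data disks, redundancy_disk_count gives 8 redundancy disks, compared with the two redundancy disks of a scheme that stores only two independent redundancy blocks per stripe.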
It can therefore be seen that existing methods for error recovery in RAID level 6 systems, in particular, present problems in a number of areas. It is therefore desirable to provide an improved method which requires minimal storage capacity for storage of redundancy information, is computationally simple using only XOR computations, and minimizes the write penalty suffered in applications which frequently perform small write operations.