RAID stands for Redundant Array of Independent Disks and denotes a taxonomy of redundant disk array storage schemes. These schemes define a number of ways of configuring and using multiple computer disk drives to achieve varying levels of availability, performance, capacity and cost while appearing to the software application as a single large-capacity drive. Typical RAID storage subsystems can be implemented in either hardware or software. In the former instance, the RAID algorithms are packaged into a separate hardware controller coupled to the computer input/output (“I/O”) bus; although this adds little or no central processing unit (“CPU”) overhead, the additional hardware nevertheless adds to the overall system cost. Software implementations, on the other hand, incorporate the RAID algorithms into system software executed by the main processor together with the operating system, obviating the need for and cost of a separate hardware controller, yet adding to CPU overhead.
Various RAID levels have been defined from RAID-0 to RAID-6, each offering tradeoffs in the previously mentioned factors. RAID-0 is nothing more than traditional striping, in which user data is broken into chunks that are stored onto the stripe set by being spread across multiple disks with no data redundancy. RAID-1 is equivalent to conventional “shadowing” or “mirroring” techniques and is the simplest method of achieving data redundancy: each disk is paired with another disk containing the same data, and writes are made to both disks simultaneously. The combination of RAID-0 and RAID-1 is typically referred to as RAID-0+1 and is implemented by striping shadow sets, resulting in the relative performance advantages of both RAID levels. RAID-2, which utilizes a Hamming code written across the members of the RAID set, is not now considered to be of significant importance.
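The RAID-0 striping just described can be sketched in a few lines of Python; the disk count, chunk size, and data values here are arbitrary illustrative choices, not part of any particular implementation:

```python
def stripe(data: bytes, ndisks: int, chunk: int):
    """Split data into fixed-size chunks and distribute them
    round-robin across ndisks (RAID-0 style, no redundancy)."""
    disks = [bytearray() for _ in range(ndisks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % ndisks] += data[i:i + chunk]
    return disks

disks = stripe(b"ABCDEFGH", ndisks=2, chunk=2)
# Chunks "AB" and "EF" land on disk 0; "CD" and "GH" on disk 1.
```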
In RAID-3, data is striped across a set of disks with the addition of a separate dedicated drive to hold parity data. The parity data is calculated dynamically as user data is written to the other disks to allow reconstruction of the original user data if a drive fails without requiring replication of the data bit-for-bit. Error detection and correction codes (“ECC”) such as Exclusive-OR (“XOR”) or more sophisticated Reed-Solomon techniques may be used to perform the necessary mathematical calculations on the binary data to produce the parity information in RAID-3 and higher level implementations. While parity allows the reconstruction of the user data in the event of a drive failure, the speed of such reconstruction is a function of system workload and the particular algorithm used.
As with RAID-3, the RAID scheme known as RAID-4 consists of N data disks and one parity disk wherein the parity disk sectors contain the bitwise XOR of the corresponding sectors on each data disk. This allows the contents of the data in the RAID set to survive the failure of any one disk. RAID-5 is a modification of RAID-4 which stripes the parity across all of the disks in the array in order to statistically equalize the load on the disks.
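The bitwise-XOR parity scheme described above can be sketched as follows; the disk contents are hypothetical two-byte sector values chosen only for illustration. The parity sector is the XOR of the corresponding data sectors, and any one failed disk is rebuilt by XORing the survivors with the parity:

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length sectors."""
    return bytes(x ^ y for x, y in zip(a, b))

# Three hypothetical data "disks" (one sector each).
data_disks = [bytes([0x11, 0x22]), bytes([0x33, 0x44]), bytes([0x55, 0x66])]

# Parity disk sectors: bitwise XOR of the corresponding data sectors.
parity = reduce(xor_bytes, data_disks)

# Simulate losing disk 1 and rebuilding it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = reduce(xor_bytes, survivors)
assert rebuilt == data_disks[1]
```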
The designation RAID-6 has been used colloquially to describe RAID schemes that can withstand the failure of two disks without losing data, through the use of two parity drives (commonly referred to as the “P” and “Q” drives) for redundancy and sophisticated ECC techniques. Although the term “parity” is used to describe the codes used in RAID-6 technologies, the codes are more correctly a type of ECC code rather than simply a parity code. Data and ECC information are striped across all members of the RAID set, and write performance is generally lower than with RAID-5 because three separate drives must each be accessed twice during writes. However, the principles of RAID-6 may be used to recover from a number of drive failures, depending on the number of “parity” drives that are used.
Some RAID-6 implementations are based upon Reed-Solomon algorithms, which depend on Galois Field arithmetic. A complete explanation of Galois Field arithmetic and the mathematics behind RAID-6 can be found in a variety of sources and, therefore, only a brief overview is provided below as background. The Galois Field arithmetic used in these RAID-6 implementations takes place in GF(2^N). This is the field of polynomials with coefficients in GF(2), modulo some generator polynomial of degree N. All the polynomials in this field are of degree N−1 or less, and their coefficients are all either 0 or 1, which means they can be represented by a vector of N coefficients all in {0,1}; that is, these polynomials “look” just like N-bit binary numbers. Polynomial addition in this Field is simply N-bit XOR, which has the property that every element of the Field is its own additive inverse, so addition and subtraction are the same operation. Polynomial multiplication in this Field, however, can be performed with table lookup techniques based upon logarithms or with simple combinational logic.
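As one illustration of the combinational-logic approach, GF(2^8) multiplication can be sketched as a shift-and-XOR loop. The generator polynomial x^8+x^4+x^3+x^2+1 (0x11D) is an assumption here; it is a common choice in RAID-6 implementations but is not mandated by the discussion above:

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two elements of GF(2^8), each represented as an 8-bit
    integer, reducing modulo an assumed generator polynomial
    x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a          # polynomial addition is just XOR
        b >>= 1
        carry = a & 0x80         # would the shift overflow degree 7?
        a = (a << 1) & 0xFF
        if carry:
            a ^= poly & 0xFF     # reduce by the generator polynomial
    return result

# Addition is XOR, so every element is its own additive inverse:
assert 0x53 ^ 0x53 == 0
```

In hardware, the same shift-and-reduce step maps naturally onto a small amount of combinational logic; software implementations more often precompute logarithm and antilogarithm tables and multiply by adding logs.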
Each RAID-6 check code (i.e., P and Q) expresses an invariant relationship, or equation, between the data on the data disks of the RAID-6 array and the data on one or both of the check disks. If there are C check codes and a set of F disks fail, F ≤ C, the failed disks can be reconstructed by selecting F of these equations and solving them simultaneously in GF(2^N) for the F missing variables. In the RAID-6 systems implemented or contemplated today there are only 2 check disks—check disk P, and check disk Q. It is worth noting that the check disks P and Q change for each stripe of data and parity across the array such that parity data is not written to a dedicated disk but is, instead, striped across all the disks.
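A minimal sketch of solving these invariant equations, assuming the common convention that P is the XOR of the data and Q is the XOR of the data scaled by powers of a generator g = 2 in GF(2^8) (an assumption; the text above does not fix the particular equations). With F = 2 failed data disks and C = 2 check codes, the two equations are solved simultaneously, here by exhaustive search over one unknown byte:

```python
def gf_mul(a: int, b: int) -> int:
    """GF(2^8) multiply, assumed generator polynomial 0x11D."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a = ((a << 1) ^ 0x1D) & 0xFF if a & 0x80 else (a << 1) & 0xFF
    return r

def gf_pow2(i: int) -> int:
    """g^i for generator g = 2."""
    v = 1
    for _ in range(i):
        v = gf_mul(v, 2)
    return v

# One byte stands in for each disk; four hypothetical data disks.
data = [0xDE, 0xAD, 0xBE, 0xEF]
P, Q = 0, 0
for i, d in enumerate(data):
    P ^= d                       # P equation: XOR of all data
    Q ^= gf_mul(gf_pow2(i), d)   # Q equation: XOR of g^i * data_i

# Suppose data disks 1 and 3 fail (F = 2 = C).  Fold the surviving
# data into the two equations, leaving two unknowns.
x, y = 1, 3
known_p, known_q = P, Q
for i, d in enumerate(data):
    if i not in (x, y):
        known_p ^= d
        known_q ^= gf_mul(gf_pow2(i), d)

# Solve both equations simultaneously; the solution is unique because
# g^x != g^y for distinct disk indices.
for dx in range(256):
    dy = known_p ^ dx            # from the P equation
    if gf_mul(gf_pow2(x), dx) ^ gf_mul(gf_pow2(y), dy) == known_q:
        break
assert (dx, dy) == (data[x], data[y])
```

A production implementation would invert the 2x2 system directly with Galois Field division rather than searching, but the exhaustive search makes the "solve F equations for F unknowns" idea concrete.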
Even though RAID-6 has been implemented with varying degrees of success in different ways in different systems, there remains an ongoing need to improve the efficiency and cost of providing RAID-6 protection for data storage. The mathematics of implementing RAID-6 involve calculations that are both complicated and repetitive. Accordingly, efforts to improve the simplicity, cost and efficiency of the circuitry needed to implement RAID-6 remain a priority today and in the future.
For example, one drawback with conventional RAID-6 designs relates to the throughput of parity updates in such designs due to comparatively higher buffer requirements for performing such updates. A parity update, in this context, refers to updating the parity information stored in a given parity stripe in a disk array in response to a change in the data stored in the parity stripe.
By way of comparison, in a RAID-5 design, a parity update operation typically requires, first, that the old data to be updated be retrieved from the appropriate disk and compared to the new data to calculate a delta value Δ, e.g., by performing an exclusive-OR (XOR) operation with the old and new data. This delta value Δ is then used to update the parity, e.g., by performing an XOR operation with the old parity value and the delta value Δ.
Of note, the two XOR operations performed during a RAID-5 parity update can be implemented using only two buffers. Specifically, a first buffer can be used to initially store the new data to be written, with a second buffer used to store the delta value Δ generated from the XOR of the new data stored in the first buffer and old data retrieved from the appropriate disk (which can be fed directly to XOR logic, and thus does not need to be buffered). Once the new data is written to the appropriate disk, the first buffer can then be reused to store the result of the XOR operation between the delta value Δ stored in the second buffer and the old parity value retrieved from the appropriate disk (which also can be fed directly to XOR logic without buffering). After this second XOR operation, the first buffer stores the new parity value, which can then be read out of the first buffer and written to the appropriate disk.
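The two-buffer sequence described above might be sketched as follows, with tiny hypothetical sector values; the buffer names and the four-byte sector size are illustrative only. Old data and old parity are modeled as being fed straight into the XOR logic without occupying a buffer:

```python
def xor_into(dst: bytearray, src: bytes) -> None:
    """Feed src through XOR logic into dst (dst ^= src, in place)."""
    for i in range(len(dst)):
        dst[i] ^= src[i]

# Hypothetical old on-disk contents and incoming new data.
old_data   = bytes([1, 2, 3, 4])
old_parity = bytes([9, 8, 7, 6])
new_data   = bytes([5, 5, 5, 5])

buf1 = bytearray(new_data)   # buffer 1: new data to be written
buf2 = bytearray(new_data)   # buffer 2: will hold the delta value
xor_into(buf2, old_data)     # delta = new data XOR old data (old data unbuffered)

# ... the new data in buffer 1 is written to the data disk here ...

buf1[:] = buf2               # buffer 1 reused for the second XOR
xor_into(buf1, old_parity)   # new parity = delta XOR old parity
# buffer 1 now holds the new parity value, ready to write to disk
assert bytes(buf1) == bytes(a ^ b ^ c
                            for a, b, c in zip(new_data, old_data, old_parity))
```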
On the other hand, in a RAID-6 environment, where two disks include parity data for each parity stripe, parity update operations typically require at least one additional buffer to store interim data associated with such operations. In particular, while RAID-6 parity updates still require the calculation of a delta value Δ, updating both of the parity values for a given parity stripe requires that the delta value Δ be multiplied or scaled by different constants that are respectively associated with the different parity stripe equations relating the parity values to the data in the parity stripe. These different constants are conventionally designated as constants K1 and K2; as such, one parity value P for a parity stripe is typically calculated by performing an XOR of the old parity value P and the product of constant K1 and delta value Δ (i.e., K1Δ), while the other parity value Q for the same parity stripe is typically calculated by performing an XOR of the old parity value Q and the product of constant K2 and delta value Δ (i.e., K2Δ).
In a conventional RAID-6 implementation, three or more buffers are used, with a first buffer initially storing the new data to be written, and with a second buffer used to store the delta value Δ generated from the XOR of the new data stored in the first buffer and old data retrieved from the appropriate disk. Since the delta value Δ is required for both parity values, a third buffer is used to store the product K1Δ of the delta value Δ stored in the second buffer and constant K1. Also, similar to a RAID-5 implementation, after the new data is written to the appropriate disk, the first buffer is reused to store the result of an XOR operation between the product K1Δ stored in the third buffer and the old parity value P retrieved from the appropriate disk, which result is then written back out to the appropriate disk as the new parity value P. To update parity value Q, after the parity value P is updated, the third buffer is then reused to store the product K2Δ of the delta value Δ stored in the second buffer and constant K2, and the first buffer is then reused to store the result of an XOR operation between the product K2Δ now stored in the third buffer and the old parity value Q retrieved from the appropriate disk. This result, now stored in the first buffer, is then written back out to the appropriate disk as the new parity value Q.
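The three-buffer sequence might be sketched as follows. The constants K1 and K2, the single-byte "sectors", and the buffer values are hypothetical, and the GF(2^8) multiply assumes generator polynomial 0x11D; the point is only the order in which the three buffers are filled and reused:

```python
def gf_mul(a: int, b: int) -> int:
    """GF(2^8) multiply, assumed generator polynomial 0x11D."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a = ((a << 1) ^ 0x1D) & 0xFF if a & 0x80 else (a << 1) & 0xFF
    return r

K1, K2 = 0x01, 0x02              # hypothetical per-stripe constants
old_data, new_data = 0x3C, 0xA5  # one byte stands in for a sector
old_p, old_q = 0x77, 0x1B        # old P and Q parity values

buf1 = new_data                  # buffer 1: new data (written to the data disk)
buf2 = buf1 ^ old_data           # buffer 2: delta (old data fed straight to XOR)
buf3 = gf_mul(K1, buf2)          # buffer 3: K1 * delta
buf1 = buf3 ^ old_p              # buffer 1 reused: new P, written to the P disk
new_p = buf1
buf3 = gf_mul(K2, buf2)          # buffer 3 reused: K2 * delta
buf1 = buf3 ^ old_q              # buffer 1 reused again: new Q, written to the Q disk
new_q = buf1
```

Note that the delta value in buffer 2 must survive until both products have been formed, which is exactly why a third buffer is needed where RAID-5 gets by with two.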
Utilizing three buffers for a RAID-6 parity update, however, increases buffer requirements by 50% over the two buffers required for RAID-5 designs. As a result, in situations where the number of buffers is constrained, a RAID-6 design typically can have only two-thirds the number of parity update operations in progress in comparison to a RAID-5 design, thus reducing throughput and limiting overall performance.