RAID (“redundant array of independent disks”) is a well-known storage technology combining multiple disk drive components into a single logical unit in order to provide, among other things, data redundancy and error correction. The term RAID designates a number of methods for creating and maintaining error recovery information. In general, RAID operates on the principle of storing data, together with error recovery data derived from the stored data, across a number of different independent drives. Should any one of the drives fail, the data stored thereon can be reconstructed or recovered from the error recovery data and the data stored on the remaining drives using a particular method. In typical implementations, the method may include simple exclusive-or (“XOR”) operations or more complex Reed-Solomon codes or even more general erasure codes.
For example, a particular RAID scheme may employ four data drives for storing data, plus a fifth parity drive for storing parity information. The data to be stored in the system may be divided into four segments, with a quarter of the data being stored on each data drive. When the data is stored, parity information, ‘P’, is determined from the data by calculating the XOR sum of the four data segments, ‘A’, ‘B’, ‘C’, ‘D’, as follows:
A^B^C^D=P  (1)
Since XOR is commutative and associative, the order of presentation of the four data segments is irrelevant, and any particular segment may be determined from the parity information along with the remaining data segments, as follows:
A=B^C^D^P  (2)
This result follows from the fact that anything XOR'ed with itself equals zero.
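The four-drive parity example can be illustrated with a minimal Python sketch. The helper name xor_blocks and the four sample segments are assumptions chosen for illustration only; they do not come from any particular RAID implementation.

```python
from functools import reduce

def xor_blocks(*blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Four hypothetical data segments, one per data drive.
a, b, c, d = b"RAID", b"data", b"segm", b"ents"

# Parity is the XOR sum of the four data segments.
p = xor_blocks(a, b, c, d)

# If the drive holding segment A fails, A is rebuilt from the
# three surviving segments plus the parity.
rebuilt_a = xor_blocks(b, c, d, p)
assert rebuilt_a == a
```

The same call rebuilds any one lost segment, since the order of the operands does not matter.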
Updating a single segment consists of removing an old data image and replacing it with a new data image. This is accomplished by recognizing that XOR of parity information, P, with any data segment, e.g. A, removes that data segment from the parity information. In other words, if:
A^B^C^D=P  (3)
then
P^A=B^C^D  (4)
again, since anything XOR'ed with itself equals zero. If new data A′ is then XOR'ed with the result, the resulting parity P′ is correct with respect to all segments, including A′. In other words, generating new parity P′ is done as follows:
P′=P^A^A′  (5)
This expression may be verified by substituting P^A with B^C^D, yielding:
P′=B^C^D^A′  (6)
which is expected based on the original form of Equation (1) above. In general, data update is carried out according to Equation (5) rather than Equation (6), because the former is more efficient: it requires fetching only the data member being updated along with the parity member.
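The update rule can be checked with a short Python sketch that compares the incremental parity update against a full recomputation over all segments. The helper xor_bytes and the sample segment values are hypothetical and serve only as an illustration.

```python
def xor_bytes(x, y):
    """Return the byte-wise XOR of two equally sized byte strings."""
    return bytes(i ^ j for i, j in zip(x, y))

# Hypothetical stored segments and their current parity.
a, b, c, d = b"RAID", b"data", b"segm", b"ents"
p = xor_bytes(xor_bytes(a, b), xor_bytes(c, d))

new_a = b"NEWA"  # new image replacing segment A

# Incremental update: only the old A, the new A and the parity are read.
new_p = xor_bytes(xor_bytes(p, a), new_a)

# Full recomputation: every other data segment must be read as well.
full_p = xor_bytes(xor_bytes(b, c), xor_bytes(d, new_a))

assert new_p == full_p
```

The incremental form touches two drives regardless of how many data drives the array contains, whereas the full recomputation must read every data drive.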
As is known in the art, parity information comprising XOR sums is but one example of a more general class of erasure codes, which also includes XOR sums computed using Galois fields, Reed-Solomon codes, and specialized versions of these such as Cauchy-Reed-Solomon and Vandermonde-Reed-Solomon codes. In all such methods, an original message is transformed into a longer message whereby the original message can be calculated from a subset of the longer message.
Erasure codes allow for recovery from up to n failures by encoding n error correction symbols for k symbols of data. The total space needed is then n+k symbols. In essence, erasure codes employ the principle that if there are no more than n unknowns, a unique solution can be obtained because there are n independent equations. As long as the number of independent equations is greater than or equal to the number of unknowns, linear algebraic methods can be used to solve for any set of unknowns.
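The "n equations, n unknowns" principle can be illustrated with a toy Python sketch for k = 2 data symbols and n = 2 check symbols. As a simplifying assumption, arithmetic modulo the prime 257 stands in for the Galois-field arithmetic that practical codes such as Reed-Solomon use; the function names and coefficients are hypothetical.

```python
# Illustrative only: arithmetic modulo the prime 257 stands in for the
# Galois-field arithmetic a practical erasure code would use.
PRIME = 257

def encode(d1, d2):
    """Return two check symbols for two data symbols (k = 2, n = 2)."""
    p1 = (d1 + d2) % PRIME        # first equation:  d1 + d2   = p1
    p2 = (d1 + 2 * d2) % PRIME    # second equation: d1 + 2*d2 = p2
    return p1, p2

def recover_both(p1, p2):
    """Solve the two equations for d1 and d2 when both data symbols are lost."""
    d2 = (p2 - p1) % PRIME        # subtract the first equation from the second
    d1 = (p1 - d2) % PRIME        # substitute d2 back into the first equation
    return d1, d2

d1, d2 = 200, 123                 # hypothetical data symbols (byte values)
p1, p2 = encode(d1, d2)
assert recover_both(p1, p2) == (d1, d2)
```

Here two unknowns are recovered from two independent equations. Losing a third symbol would leave more unknowns than equations, and the data could no longer be determined uniquely.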
In general, a RAID system has three requirements, as follows.
Firstly, a RAID implementation requires transactional controls. RAID operations produce coherent updates to discrete devices by ensuring that said updates are performed in isolation. Should coordination of updates break down, the stored erasure codes cannot be guaranteed to be correct with respect to the data.
Secondly, the implementation must enable the discrete operations themselves. RAID performs a series of algebraic functions to create and maintain erasure codes.
Thirdly, the implementation must provide some space for the RAID operations. Some scratchpad buffers are necessary to work through the creation and maintenance of erasure codes.
In view of the requirements, RAID has been implemented primarily with a specialized controller. The controller, in addition to providing translation from host to device protocols, provides transactional controls and memory space for RAID operations.
The provision and use of a RAID controller requires, however, the expenditure of resources, and further complicates and retards data throughput between the storage drives and a host accessing the RAID system. It would be desirable, therefore, to provide a RAID implementation which does not require a separate controller, but which provides all of the necessary functionality.