Field
Aspects of embodiments of the present invention are directed toward an accelerated erasure coding system and method.
Description of Related Art
An erasure code is a type of error-correcting code (ECC) useful for forward error-correction in applications like a redundant array of independent disks (RAID) or high-speed communication systems. In a typical erasure code, data (or original data) is organized in stripes, each of which is broken up into N equal-sized blocks, or data blocks, for some positive integer N. The data for each stripe is thus reconstructable by putting the N data blocks together. However, to handle situations where one or more of the original N data blocks are lost, erasure codes also encode an additional M equal-sized blocks (called check blocks or check data) from the original N data blocks, for some positive integer M.
The N data blocks and the M check blocks are all the same size. Accordingly, there are a total of N+M equal-sized blocks after encoding. The N+M blocks may, for example, be transmitted to a receiver as N+M separate packets, or written to N+M corresponding disk drives. For ease of description, all N+M blocks after encoding will be referred to as encoded blocks, though some (for example, N of them) may contain unencoded portions of the original data. That is, the encoded data refers to the original data together with the check data.
The M check blocks build redundancy into the system in a very efficient manner, in that the original data (as well as any lost check data) can be reconstructed if any N of the N+M encoded blocks are received by the receiver, or if any N of the N+M disk drives are functioning correctly. Note that such an erasure code is also referred to as “optimal.” For ease of description, only optimal erasure codes will be discussed in this application. In such a code, up to M of the encoded blocks can be lost (e.g., up to M of the disk drives can fail), and the original data (as well as the check data) can still be reconstructed so long as any N of the N+M encoded blocks are received successfully by the receiver. N/(N+M) is thus the code rate of the erasure code encoding (i.e., the fraction of the encoded data that is taken up by the original data). Erasure codes for select values of N and M can be implemented on RAID systems employing N+M (disk) drives by spreading the original data among N “data” drives, and using the remaining M drives as “check” drives. Then, when any N of the N+M drives are correctly functioning, the original data can be reconstructed, and the check data can be regenerated.
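By way of illustration only, the following minimal sketch shows the simplest such arrangement: the single-parity case M=1, in which the one check block is the byte-wise exclusive-or (XOR) of the N data blocks. It is not the accelerated method of this application, and it does not extend to M greater than 1. The function names (split_stripe, xor_blocks, encode, reconstruct, code_rate) are hypothetical and chosen only for this example, and the stripe length is assumed to be an exact multiple of N.

```python
from functools import reduce


def split_stripe(stripe: bytes, n: int) -> list[bytes]:
    """Break a stripe into n equal-sized data blocks."""
    if n <= 0 or len(stripe) % n != 0:
        raise ValueError("stripe length must be a positive multiple of n")
    size = len(stripe) // n
    return [stripe[i * size:(i + 1) * size] for i in range(n)]


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of a list of equal-sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(stripe: bytes, n: int) -> list[bytes]:
    """Return the N+1 encoded blocks: N data blocks followed by 1 check block."""
    data_blocks = split_stripe(stripe, n)
    return data_blocks + [xor_blocks(data_blocks)]


def reconstruct(encoded: list[bytes], lost: int) -> bytes:
    """Rebuild the single lost block (data or check) from the N survivors."""
    survivors = [block for i, block in enumerate(encoded) if i != lost]
    return xor_blocks(survivors)


def code_rate(n: int, m: int) -> float:
    """Fraction of the encoded data taken up by the original data."""
    return n / (n + m)


if __name__ == "__main__":
    # Hypothetical 12-byte stripe split across N=4 data blocks, M=1 check block.
    encoded = encode(b"ABCDEFGHIJKL", 4)
    # Any one of the N+M = 5 blocks can be lost and rebuilt from the other 4.
    for lost in range(len(encoded)):
        assert reconstruct(encoded, lost) == encoded[lost]
    print(code_rate(4, 1))
```

In this sketch, any single block, whether a data block or the check block, is recoverable from the remaining N blocks, which is the "any N of the N+M" property described above for M=1. Tolerating more than one lost block requires a more general code, such as the Reed-Solomon codes discussed in the references cited below.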
Erasure codes (or more specifically, erasure coding systems) are generally regarded as impractical for values of M larger than 1 (e.g., RAID5 systems, such as parity drive systems) or 2 (RAID6 systems), that is, for more than one or two check drives. For example, see H. Peter Anvin, “The mathematics of RAID-6,” the entire content of which is incorporated herein by reference, p. 7, “Thus, in 2-disk-degraded mode, performance will be very slow. However, it is expected that that will be a rare occurrence, and that performance will not matter significantly in that case.” See also Robert Maddock et al., “Surviving Two Disk Failures,” p. 6, “The main difficulty with this technique is that calculating the check codes, and reconstructing data after failures, is quite complex. It involves polynomials and thus multiplication, and requires special hardware, or at least a signal processor, to do it at sufficient speed.” In addition, see also James S. Plank, “All About Erasure Codes: —Reed-Solomon Coding—LDPC Coding,” slide 15 (describing computational complexity of Reed-Solomon decoding), “Bottom line: When n & m grow, it is brutally expensive.” Accordingly, there appears to be a general consensus among experts in the field that erasure coding systems are impractical for RAID systems for all but small values of M (that is, small numbers of check drives), such as 1 or 2.
Modern disk drives, on the other hand, are much less reliable than those envisioned when RAID was proposed. This is due to their capacity growing out of proportion to their reliability. Accordingly, systems with only a single check disk have, for the most part, been discontinued in favor of systems with two check disks.
In terms of reliability, a higher check disk count is clearly more desirable than a lower check disk count. If the count of error events on different drives is larger than the check disk count, data may be lost that cannot be reconstructed from the correctly functioning drives. Error events extend well beyond the traditional measure of advertised mean time between failures (MTBF). A simple, real-world example is a service event on a RAID system where the operator mistakenly replaces the wrong drive or, worse yet, replaces a good drive with a broken drive. In the absence of any generally accepted methodology to train, certify, and measure the effectiveness of service technicians, these types of events occur at an unknown rate, but they do occur. The foolproof solution for protecting data in the face of multiple error events is to increase the check disk count.