Stored data may be protected against storage media failures or other loss by storing extra copies, by storing additional redundant information, or in other ways. One type of redundancy-based protection involves using erasure coding. Erasure coding uses additional redundant data to produce erasure codes (EC) that protect against ‘erasures’. An erasure may be an error whose location is known a priori. Erasure codes allow data portions that are lost to be reconstructed from the surviving data. Erasure codes may typically have been applied to data storage for the purpose of recovering data in the face of failures of the hardware elements storing the data.
An erasure code is a forward error correction scheme for storage applications. An erasure code transforms a message of k symbols into a longer message. The longer message may be referred to as a code-word with n=k+p symbols such that the original message can be recovered from any available k symbols. Reed-Solomon (RS) codes are optimal erasure codes because any available k symbols will suffice for successful reconstruction. However, their encoding and decoding complexity grows at least quadratically with the block length and the number of parity symbols generated. Cauchy RS codes, which are in the Reed-Solomon family of codes, may possess reduced complexity encoding and/or decoding methods, but still suffer from costly Galois field computations and do not scale well with increasing code-word length.
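The property that any k of the n=k+p code-word symbols suffice for reconstruction can be illustrated with a minimal Reed-Solomon style sketch. The sketch below is illustrative only and makes several assumptions not found in the text above: it works over the small prime field of integers modulo 257 rather than the Galois field GF(2^8) typically used in practice, it uses plain Lagrange interpolation, and the names PRIME, encode and decode are hypothetical. It requires Python 3.8 or later for the modular inverse form of pow.

```python
from itertools import combinations

PRIME = 257  # assumption: small prime field; practical codes usually use GF(2^8)


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (position, value) points, modulo PRIME."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, -1, PRIME) is the modular inverse of den
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


def encode(message, n_parity):
    """Systematic encoding: positions 0..k-1 hold the message symbols,
    positions k..k+p-1 hold parity symbols. Requires k + p <= PRIME."""
    k = len(message)
    points = list(enumerate(message))
    parity = [_interpolate(points, x) for x in range(k, k + n_parity)]
    return list(message) + parity


def decode(survivors, k):
    """Recover the k message symbols from any k surviving
    (position, value) pairs of the code-word."""
    if len(survivors) < k:
        raise ValueError("need at least k surviving symbols")
    points = survivors[:k]
    return [_interpolate(points, x) for x in range(k)]


# Usage: k = 5 message symbols, p = 3 parity symbols, n = 8 in total.
message = [72, 101, 108, 108, 111]
codeword = encode(message, 3)
# Every choice of 5 surviving positions out of 8 reconstructs the message.
recovered_ok = all(
    decode([(i, codeword[i]) for i in keep], len(message)) == message
    for keep in combinations(range(len(codeword)), len(message))
)
```

Each interpolation in this sketch costs on the order of k squared field operations, which gives a feel for why complexity becomes a concern as the block length and the number of parity symbols grow.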
In an encoding agent, k segments of useful data are used to compute a number p of parity segments. For convenience, the segment sizes are conventionally chosen to be equal. If a coding agent can compensate for any combination of p segment losses out of the k+p total segments, the coding agent is said to have an optimal overhead of zero. RS codes are known to be optimal in terms of the number of erasures that can be corrected given the coded block and message block lengths. However, the computational complexity of RS codes may be prohibitive for large block lengths, where k+p includes a large number p of parity segments (also called chunks or symbols), which may reduce the efficiency of a storage system. Thus, as the coded block length increases, RS codes become less than optimal. RS codes may also become less than optimal when multiple RS encodings are used for a large file, due to increased interleaving overhead.
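The relationship between k equal-sized data segments and the parity computed from them can be sketched for the simplest case of p=1, using a single XOR parity segment. This is an illustrative sketch only, not the encoding agent described above: the function names are hypothetical, and zero-padding the data so that it divides evenly into k segments is an assumption made for convenience.

```python
def split_into_segments(data, k):
    """Split data into k equal-sized segments, zero-padding the tail."""
    seg_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(seg_len * k, b"\x00")
    return [padded[i * seg_len:(i + 1) * seg_len] for i in range(k)]


def xor_segments(segments):
    """Byte-wise XOR of equal-length segments."""
    out = bytearray(len(segments[0]))
    for seg in segments:
        for i, b in enumerate(seg):
            out[i] ^= b
    return bytes(out)


def encode_single_parity(data, k):
    """Return k data segments followed by one XOR parity segment (p = 1)."""
    segments = split_into_segments(data, k)
    return segments + [xor_segments(segments)]


def recover_missing(stored, missing_index):
    """Rebuild the one lost segment (data or parity) from the k survivors."""
    survivors = [s for i, s in enumerate(stored) if i != missing_index]
    return xor_segments(survivors)


# Usage: k = 4 data segments plus p = 1 parity segment.
stored = encode_single_parity(b"erasure coding", 4)
rebuilt = recover_missing(stored, 2)  # pretend segment 2 was lost
```

Because any single lost segment out of the k+1 stored segments can be rebuilt from the remaining k, this single-parity scheme compensates for any combination of p=1 losses, which is the zero-overhead condition stated above for that case.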
Additionally, the number of parity symbols p adversely affects the encoding and decoding complexity. Thus, RS codes may only be practical for short block length storage applications where RAID5 or RAID6 type of protection is intended. Furthermore, conventional approaches using RS codes may be prohibitively complex due to the Galois field arithmetic operations required for large block lengths and large numbers of parity symbols. On the other hand, conventional approaches that try to overcome the problems of RS codes by using Fountain erasure codes may suffer from prohibitive overhead inefficiency due to the short block lengths used by Fountain erasure codes.
In a storage system, reliability and efficiency are two main concerns. One of the main objectives of distributed storage or cloud storage is to ensure the reliable protection of user data. However, reliability and efficiency are often conflicting goals. Greater reliability may be achieved at the cost of reduced efficiency. Higher efficiency may be attained at the cost of reduced reliability. Storage devices are subject to wear, damage, and component aging. Storage device reliability decreases over time. An erasure code, or its associated parameters, selected at one point in a storage device's lifespan may not be the most efficient erasure code or associated parameters for a different period in the storage device's lifespan.
The nature of data and associated storage methods vary dramatically depending on the application. For example, delay sensitive data delivery applications may require short block lengths. Transactional data is one example of such delay sensitive, short block length data. For example, electronic mail usually involves short message lengths, while high definition video may impose larger message length constraints. Large block sizes, as used for storing high definition video, may lead to large blocks being pushed to kernel modules to be stored on physical devices. Overhead inefficiency is a particularly important problem for storage applications in which user capacity maximization is one of the outstanding goals given the scale of the overall storage system.