Erasure-resilient codes enable lossless data recovery notwithstanding loss of information during storage or transmission. Erasure-resilient codes, which are derived from error correction codes, are designed to be tolerant of data loss. They add redundant information to the stored or transmitted data: an erasure-resilient code takes an original message and encodes it into a plurality of coded messages, each of which is a mathematical combination of the original message. If one or more of the coded messages are lost, it is still possible to recover the original message in a lossless manner, provided that enough coded messages survive. In general, adding more coded messages allows lossless recovery of the original message at higher error or data loss rates. However, this also reduces the transmission or storage efficiency of the associated system.
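By way of illustration only, the simplest erasure-resilient code appends a single parity message, formed as the byte-wise XOR of the original messages, so that any one lost message can be rebuilt from the survivors. The following is a minimal sketch under that assumption; the function names (`xor_bytes`, `encode_with_parity`, `recover_erasure`) are hypothetical, and the sketch is not the Reed-Solomon scheme discussed later.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length messages."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_with_parity(messages: List[bytes]) -> List[bytes]:
    """Return the original messages followed by one XOR parity message."""
    parity = reduce(xor_bytes, messages)
    return list(messages) + [parity]


def recover_erasure(coded: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the original messages when at most one coded message is None."""
    missing = [i for i, m in enumerate(coded) if m is None]
    if len(missing) > 1:
        raise ValueError("a single parity message can repair only one erasure")
    repaired = list(coded)
    if missing:
        # XOR of all surviving messages equals the missing one.
        survivors = [m for m in coded if m is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired[:-1]  # drop the trailing parity message


original = [b"abcd", b"efgh", b"ijkl"]
coded = encode_with_parity(original)
received = list(coded)
received[1] = None  # simulate an erasure
restored = recover_erasure(received)
```

Here one extra coded message tolerates one erasure at a storage overhead of one message in four; tolerating more erasures requires more coded messages, which is the efficiency trade-off noted above.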
High-rate erasure-resilient codes are block error correction codes with a large coded message space. In other words, high rate means that the number of coded messages is much larger than the number of original messages. This allows high-rate erasure-resilient codes to be used in applications with high error or data loss rates, especially content distribution and backup applications. By way of example, one application is the digital fountain paradigm, where a server multicasts or broadcasts erasure coded messages non-stop to a plurality of clients. Each client may tune in from time to time to receive the coded messages that are being sent at that moment. Another application is distributed backup, where a file is erasure encoded and stored in a large number of storage units, either locally or in a distributed fashion. During the recovery process, the client attempts to restore the file from the accessible storage units. Yet another application is distributed content hosting, streaming, or both. In this application, a file or media is distributed to a number of hosting servers, each of which may elect to host a portion of the file or media in erasure coded form. During the retrieval process, the client locates the hosting servers that are willing to serve and retrieves the erasure coded file or media simultaneously from those servers. In each of the above applications, a file, media, or both are encoded into a large number of distinct coded messages. During retrieval, the client attempts to recover the file or media from a minimum number of coded messages, which is equal to or slightly larger than the number of original messages. The client usually does not have control over which coded messages are available. As a result, the process of distributed content broadcast, backup, hosting, or streaming can be considered as passing the coded messages through an erasure channel with heavy loss, and recovering the original messages afterwards.
A number of error correction codes can serve as high-rate erasure-resilient codes. These codes include randomly generated linear codes (RLC), low-density parity check (LDPC) codes, turbo codes, LT codes, and Reed-Solomon codes. Among these error correction codes, the Reed-Solomon codes stand out with a number of unique properties. For instance, Reed-Solomon codes are maximum distance separable (MDS) codes. They achieve the maximum channel coding efficiency, and are able to decode the original messages from a number of received coded messages exactly equal to the number of original messages. Because the generator matrix of the Reed-Solomon codes is structured, a Reed-Solomon coded message can be identified by its row index in the generator matrix, which reduces the overhead needed to identify the coded message. Reed-Solomon codes can be applied to messages with small access granularity (short messages and a small number of original messages), and are suitable for on-demand distributed content hosting/streaming applications.
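By way of illustration only, the MDS property and the row-index identification described above can be sketched with a Vandermonde generator matrix. The sketch makes several assumptions not found in the text: it works over the prime field GF(257) rather than the GF(2^8) or GF(2^16) fields commonly used in practice (so a coded symbol may take the value 256), it treats each message as a single symbol, and the names `generator_row`, `encode`, and `decode` are hypothetical.

```python
from typing import Dict, List

P = 257  # prime modulus, assumed for simplicity; row indices must lie in 0..P-1


def generator_row(index: int, k: int) -> List[int]:
    """Row `index` of a Vandermonde generator matrix: [1, x, ..., x^(k-1)], x = index."""
    return [pow(index, j, P) for j in range(k)]


def encode(original: List[int], index: int) -> int:
    """Generate, on demand, the coded symbol identified by one row index."""
    row = generator_row(index, len(original))
    return sum(g * m for g, m in zip(row, original)) % P


def decode(received: Dict[int, int], k: int) -> List[int]:
    """Recover k original symbols from any k (row index -> coded symbol) pairs
    using Gauss-Jordan elimination modulo P."""
    if len(received) < k:
        raise ValueError("need at least k coded symbols")
    items = list(received.items())[:k]
    # Augmented matrix [selected generator rows | coded symbols].
    rows = [generator_row(i, k) + [c] for i, c in items]
    for col in range(k):
        # Distinct row indices guarantee a nonzero pivot exists.
        pivot = next(r for r in range(col, k) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)  # modular inverse via Fermat's little theorem
        rows[col] = [v * inv % P for v in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[col])]
    return [rows[r][k] for r in range(k)]


original = [72, 105, 33]                            # k = 3 original symbols
coded = {i: encode(original, i) for i in range(20)}  # n = 20 coded symbols
subset = {i: coded[i] for i in (17, 4, 9)}           # any 3 survivors suffice
recovered = decode(subset, 3)
```

Each coded symbol travels with nothing more than its row index, and any three of the twenty coded symbols suffice to rebuild the three originals, because any three distinct Vandermonde rows form an invertible matrix.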
Reed-Solomon erasure-resilient codes are typically used in low-rate applications (such as satellite communications). These codes are not designed to operate in high-rate environments. Although the implementation of Reed-Solomon error correction codes has been extensively investigated, there are relatively few works on the efficient implementation of Reed-Solomon codes for high-rate erasure-resilient coding applications, which have unique characteristics. For example, in a high-rate erasure-resilient coding scenario, the coded messages are generated on demand, just prior to message distribution. This is different from the error correction coding application, where all coded messages are generated at the same time. Another difference is that many more parity messages are generated and received in high-rate erasure-resilient coding. Yet another difference is that in low-rate applications the number of original messages, k, is much larger than the number of parity messages (n−k). Also, in low-rate applications each original or coded message typically does not consist of a long string of symbols or vectors. In contrast, in the high-rate applications described above, each original or coded message usually consists of a long string of symbols whose length is approximately an order of magnitude larger than k. Thus, existing implementations of Reed-Solomon error correction codes may not work efficiently and quickly in high-rate erasure-resilient encoding/decoding operations.
Both low-rate and high-rate Reed-Solomon codes use a generator matrix defined over a Galois field. However, the composition of the low-rate generator matrix differs considerably from that of the high-rate generator matrix. For example, when used in a low-rate application (such as satellite communications), the generator matrix has a large identity matrix as its top part. The generator matrix for low-rate applications is therefore not nearly as full as the generator matrix for high-rate applications. Many techniques used in low-rate applications for fast Reed-Solomon encoding/decoding (such as the Fourier transform) do not apply to high-rate applications.
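By way of illustration only, the difference in fullness can be quantified by counting nonzero entries. The sketch below assumes that both generator matrices are written in systematic form (a k-by-k identity matrix on top of n−k parity rows) and relies on the fact that every parity entry of a systematic MDS generator matrix is nonzero. The function name `systematic_density` and the parameter pairs (255, 223) and (4096, 16) are illustrative choices, not values taken from the text.

```python
def systematic_density(n: int, k: int) -> float:
    """Fraction of nonzero entries in an n-by-k systematic MDS generator matrix."""
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    identity_nonzeros = k           # one nonzero entry per identity row
    parity_nonzeros = (n - k) * k   # parity rows of an MDS code are fully dense
    return (identity_nonzeros + parity_nonzeros) / (n * k)


low_rate = systematic_density(255, 223)   # few parity rows: mostly identity
high_rate = systematic_density(4096, 16)  # many parity rows: almost fully dense
```

Under these assumptions the density reduces to (n−k+1)/n, which is about 0.13 for the low-rate parameters and about 0.996 for the high-rate parameters, so shortcuts that exploit a dominant identity block offer little benefit in the high-rate case.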
Therefore, what is needed is a system and method that generates and implements Reed-Solomon erasure-resilient codes in high-rate applications in an efficient manner. What is also needed is a system and method that provides tuning of high-rate Reed-Solomon erasure-resilient codes such that they are designed to operate and be used in high-rate applications.