In communication and storage systems the loss of data during transmission is a common problem. A widely applied technique for compensating data errors that occur during transmission is forward error correction (FEC). With FEC, the sender adds redundant data to the messages it sends, thereby allowing the receiver to detect and correct errors without requesting a retransmission.
Typically, a message consisting of a given number of blocks is transformed into a forward error corrected message that comprises the original blocks plus added overhead. The overhead allows the original message to be recovered even if only a subset of the blocks of the forward error corrected message is correctly received. In many FEC codes the ratio of message data to overhead is fixed, so that different codes with different overhead sizes are typically used depending on the expected error rate.
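As an illustration of a fixed-overhead scheme, the following minimal sketch appends a single XOR parity block to a message, which allows any one lost block to be rebuilt. It is a deliberately simple example and not a code named in this text; the function names `xor_blocks`, `encode` and `recover` are hypothetical, and all blocks are assumed to have equal length.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(blocks):
    """Append one parity block: a fixed overhead of one block per message."""
    return list(blocks) + [xor_blocks(blocks)]


def recover(received):
    """Return the original blocks; at most one entry of `received` may be None (lost)."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("a single parity block can repair only one lost block")
    blocks = list(received)
    if missing:
        # The XOR of all surviving blocks equals the missing block.
        survivors = [block for block in received if block is not None]
        blocks[missing[0]] = xor_blocks(survivors)
    return blocks[:-1]  # drop the parity block


message = [b"abcd", b"efgh", b"ijkl"]
coded = encode(message)
damaged = list(coded)
damaged[1] = None  # simulate the loss of one block
restored = recover(damaged)
```

Because the overhead here is fixed at one block, a channel that is expected to lose more than one block per message would need a different code with a larger overhead.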
So-called rateless codes are more flexible, since they have the property of generating a potentially limitless sequence of encoding symbols from a given set of source symbols, such that the original source symbols can be recovered from any subset of the encoding symbols whose size is equal to or slightly larger than the number of source symbols. Today, rateless codes are well-known tools used to transmit information, i.e. input symbols, as encoded symbols over lossy communication channels in order to protect the information from channel losses. Rateless codes are also used to store and replicate information on storage devices, where the original input symbols are stored as encoded symbols and can be extracted from the encoded symbols when needed. When some of the encoded symbols are corrupted or otherwise unavailable due to storage device and/or read errors, the input symbols can still be extracted from the encoded symbols provided that an adequate number of encoded symbols is available to the decoding host.
The key to the successful functioning of rateless codes is that the information in the input symbols is spread across the encoded symbols. Then, given a sufficient number of encoded symbols, all the input symbols can be decoded from these encoded symbols, irrespective of the particular encoded symbols available to the decoding host. Therefore, irrespective of any particular encoded symbol(s) lost in the communication channel or storage device, the input symbols can be recovered as long as an adequate number of encoded symbols is available to the decoding host.
These codes are called rateless codes because the encoding host can produce a practically unending stream of encoded symbols as required, so that no code rate has to be fixed in advance. Rateless codes sometimes impose a small overhead because the number of encoded symbols required at the decoding host is slightly larger than the total number of input symbols. In addition, there is the small overhead of transmitting the meta-information needed for recovering the input symbols at the decoding host.
Random linear codes are a class of rateless codes known to have a very low communication overhead. In fact, they can be designed so that the number of encoded symbols required for decoding is almost always equal to the number of input symbols. However, they are computationally expensive to encode and decode and are therefore impractical for a large number of input symbols.
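The behaviour of a random linear code can be sketched as follows. The example is a simplification that works over GF(2), where each encoded symbol is the XOR of a random non-empty subset of the input symbols and decoding is Gaussian elimination. The names `encode_symbol` and `decode`, the representation of symbols as Python integers, and the use of a bitmask as the transmitted meta-information are assumptions made for illustration. A binary field needs a few more symbols on average than a code over a larger field would.

```python
import random


def encode_symbol(source, rng):
    """Return (mask, value); value is the XOR of the source symbols selected by mask."""
    k = len(source)
    mask = rng.randrange(1, 1 << k)  # random non-zero coefficient vector over GF(2)
    value = 0
    for i in range(k):
        if (mask >> i) & 1:
            value ^= source[i]
    return mask, value


def decode(symbols, k):
    """Gaussian elimination over GF(2); returns the k source symbols, or None if rank < k."""
    pivots = {}  # highest set bit of a reduced mask -> (mask, value)
    for mask, value in symbols:
        while mask:
            top = mask.bit_length() - 1
            if top not in pivots:
                pivots[top] = (mask, value)  # new linearly independent equation
                break
            pivot_mask, pivot_value = pivots[top]
            mask ^= pivot_mask  # cancel the leading coefficient
            value ^= pivot_value
    if len(pivots) < k:
        return None  # not enough independent equations yet
    solved = [0] * k
    for bit in range(k):  # substitute from the lowest pivot upwards
        mask, value = pivots[bit]
        for j in range(bit):
            if (mask >> j) & 1:
                value ^= solved[j]
        solved[bit] = value
    return solved


rng = random.Random(1)
source = [rng.randrange(256) for _ in range(16)]
received = []
result = None
while result is None and len(received) < 200:
    received.append(encode_symbol(source, rng))  # keep collecting symbols
    result = decode(received, len(source))
```

The elimination step touches every stored equation for every incoming symbol, which is the source of the computational cost mentioned above when the number of input symbols grows.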
U.S. Pat. No. 6,373,406 B2 describes LT codes (Luby transform codes), which have recently become popular because of their low encoding and decoding complexity. This low complexity, however, is achieved at the cost of a slightly larger communication overhead than that of random linear codes. LT codes, when enhanced with an outer code, form the basis for Raptor codes, which were recently proposed for large-scale information distribution in wireless networks and are described in “Raptor Codes” by A. Shokrollahi, IEEE Transactions on Information Theory, 52(6), 2551-2567, 2006.
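The low decoding complexity of LT-style codes comes from an iterative "peeling" decoder in place of full Gaussian elimination. The following is a minimal sketch under stated assumptions and is not the method of the cited patent: it draws symbol degrees from the ideal soliton distribution for simplicity (practical LT codes use a robust variant), represents symbols as Python integers, and uses the hypothetical names `ideal_soliton`, `lt_encode_symbol` and `lt_decode`.

```python
import random


def ideal_soliton(k):
    """Weights for degrees 1..k: rho(1) = 1/k, rho(d) = 1/(d*(d-1)) for d >= 2."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]


def lt_encode_symbol(source, rng):
    """Return (neighbours, value): the XOR of `degree` randomly chosen source symbols."""
    k = len(source)
    degree = rng.choices(range(1, k + 1), weights=ideal_soliton(k))[0]
    neighbours = set(rng.sample(range(k), degree))
    value = 0
    for i in neighbours:
        value ^= source[i]
    return neighbours, value


def lt_decode(symbols, k):
    """Peeling decoder: returns the k source symbols, or None if decoding stalls."""
    pending = [[set(n), v] for n, v in symbols]  # working copies
    solved = {}
    progress = True
    while progress and len(solved) < k:
        progress = False
        for entry in pending:
            neighbours = entry[0]
            # Remove the contribution of source symbols that are already known.
            for i in [i for i in neighbours if i in solved]:
                neighbours.discard(i)
                entry[1] ^= solved[i]
            # A symbol with exactly one unknown neighbour reveals that neighbour.
            if len(neighbours) == 1:
                i = neighbours.pop()
                solved[i] = entry[1]
                progress = True
    if len(solved) < k:
        return None
    return [solved[i] for i in range(k)]


rng = random.Random(7)
source = [rng.randrange(256) for _ in range(8)]
received = []
result = None
while result is None and len(received) < 5000:
    received.append(lt_encode_symbol(source, rng))  # collect until decoding succeeds
    result = lt_decode(received, len(source))
```

The peeling decoder only performs XOR operations along the edges between encoded and source symbols, but it can stall where Gaussian elimination would succeed, which is why it typically needs somewhat more encoded symbols.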
However, all of the above-described codes for forward error correction are designed for transmitting data from a transmitting host to a receiving host that has no prior information about the data to be transmitted. In particular, the described codes are not adapted for transmitting incremental data changes.