Transmission of data between a sender and a recipient over a communications channel has been the subject of much literature. Preferably, but not exclusively, a recipient desires to receive an exact copy of data transmitted over a channel by a sender with some level of certainty. Where the channel does not have perfect fidelity (which is the case for most, if not all, physically realizable systems), one concern is how to deal with data lost or garbled in transmission. Lost data (erasures) are often easier to deal with than corrupted data (errors) because the recipient cannot always tell when received data has been corrupted. Many error-correcting codes have been developed to detect and/or correct erasures and/or errors. Typically, the particular code used is chosen based on some information about the infidelities of the channel through which the data is being transmitted and the nature of the data being transmitted. For example, where the channel is known to have long periods of infidelity, a burst error code might be best suited for that application. Where only short, infrequent errors are expected, a simple parity code might be best.
Data transmission between multiple senders and/or multiple receivers over a communications channel has also been the subject of much literature. Typically, data transmission from multiple senders requires coordination among the multiple senders to allow the senders to minimize duplication of efforts. In a typical multiple sender system sending data to a receiver, if the senders do not coordinate which data they will transmit and when, but instead just send segments of the file, it is likely that a receiver will receive many useless duplicate segments. Similarly, where different receivers join a transmission from a sender at different points in time, a concern is how to ensure that all data the receivers receive from the sender is useful. For example, suppose the sender wishes to transmit a file, and is continuously transmitting data about the same file. If the sender just sends the segments of the original file and then repeats, and some segments are lost, it is likely that a receiver will receive many useless duplicate segments before receiving one copy of each segment in the file. Similarly, if a segment is received in error multiple times, then the amount of information conveyed to the receiver is much less than the cumulative information of the received garbled data. Often this leads to undesirable inefficiencies of the transmission system.
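The cost of such duplicates can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that every segment a receiver obtains is equally likely to be any of the n segments of the file, independently of earlier receptions; the function name is hypothetical. Under that assumption the expected number of receptions needed to hold each segment at least once is the classical coupon-collector value, n times the n-th harmonic number.

```python
from fractions import Fraction


def expected_receptions(n: int) -> Fraction:
    """Expected number of uniformly random segment receptions needed
    to collect all n distinct segments (coupon-collector result).

    Assumption for this sketch: each reception is an independent,
    uniformly random choice among the n segments.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Once i distinct segments are held, the next reception is new with
    # probability (n - i) / n, so the expected wait is n / (n - i).
    return sum(Fraction(n, n - i) for i in range(n))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        e = float(expected_receptions(n))
        print(f"{n} segments: about {e:.1f} receptions, "
              f"{e / n:.2f} times the number of segments")
```

Under this model the overhead grows roughly with the logarithm of the number of segments, which is one way to quantify the inefficiency described above.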
Often data to be transmitted over a communications channel is partitioned into equal size input symbols. The “size” of an input symbol can be measured in bits, whether or not the input symbol is actually broken into a bit stream, where an input symbol has a size of M bits when the input symbol is selected from an alphabet of 2^M symbols or other alphabet with other than 2^M symbols for an integer M.
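As an illustration of this partitioning, the sketch below splits a byte string into input symbols of equal size. The helper name, the zero padding of the final symbol, and measuring the symbol size in bytes (so that M is 8 times the byte count) are assumptions made for the example rather than details taken from the description above.

```python
from typing import List


def partition_into_symbols(data: bytes, symbol_bytes: int) -> List[bytes]:
    """Split data into equal-size input symbols of symbol_bytes bytes each.

    Assumption for this sketch: a short final symbol is padded with zero
    bytes. A real system would also need to convey the original length.
    """
    if symbol_bytes < 1:
        raise ValueError("symbol_bytes must be at least 1")
    # Round the length up to the next multiple of the symbol size.
    padded_len = -(-len(data) // symbol_bytes) * symbol_bytes
    padded = data.ljust(padded_len, b"\x00")
    return [padded[i:i + symbol_bytes]
            for i in range(0, padded_len, symbol_bytes)]


if __name__ == "__main__":
    symbols = partition_into_symbols(b"example content", 4)
    # Each symbol is drawn from an alphabet of 2^M values with M = 32.
    print(len(symbols), "symbols of", 8 * 4, "bits each")
```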
A coding system may produce output symbols from the input symbols. Output symbols are elements from an output symbol alphabet. The output symbol alphabet may or may not have the same characteristics as the alphabet for the input symbols. Once the output symbols are created, they are transmitted to the receivers.
The task of transmission may include post-processing of the output symbols so as to produce symbols suitable for the particular type of transmission. For example, where transmission constitutes sending the data from a wireless provider to a wireless receiver, several output symbols may be lumped together to form a frame, and each frame may be converted into a wave signal in which the amplitude or the phase is related to the frame. The operation of converting a frame into a wave is often called modulation, and the modulation is further referred to as phase or amplitude modulation depending on whether the information of the wave signal is stored in its phase or in its amplitude. Nowadays, this type of modulated transmission is used in many applications, such as wireless transmission, satellite transmission, cable modems, Digital Subscriber Lines (DSL), and many others.
A transmission is called reliable if it allows the intended recipient to recover an exact copy of the original data even in the face of errors and/or erasures during the transmission. Recovery of erased information has been the subject of much literature and very efficient coding methods have been devised in this case.
One solution that has been proposed to solve the transmission problem is to use Forward Error-Correction (FEC) codes, such as Reed-Solomon codes, Tornado codes, or, more generally, LDPC (low-density parity-check) codes, or Turbo codes to increase reliability. With such coding, the output symbols that are sent are generated from the content, rather than being only the input symbols that constitute the content. Traditional error correcting codes, such as Reed-Solomon, LDPC, or Turbo codes, generate a fixed number of output symbols for content of a fixed length. For example, for K input symbols, N output symbols might be generated. These N output symbols may comprise the K original input symbols and N-K redundant symbols. If storage permits, the sender can compute the set of output symbols for each piece of data only once and transmit the output symbols using a carousel protocol.
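The structure of K original input symbols plus N-K redundant symbols can be sketched with the simplest possible case, a single XOR parity symbol (N = K + 1). This is not Reed-Solomon, LDPC, or Turbo coding; it is a deliberately minimal stand-in chosen for illustration, and the function names are hypothetical. It can recover at most one erased output symbol.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two symbols (assumed to have equal length)."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(input_symbols: List[bytes]) -> List[bytes]:
    """Return N = K + 1 output symbols: the K inputs plus one parity symbol."""
    parity = reduce(xor_bytes, input_symbols)
    return list(input_symbols) + [parity]


def decode(received: List[Optional[bytes]]) -> List[bytes]:
    """Recover the K input symbols; None marks an erased output symbol."""
    missing = [i for i, s in enumerate(received) if s is None]
    if len(missing) > 1:
        raise ValueError("more erasures than this code was provisioned for")
    symbols = list(received)
    if missing:
        # The XOR of all N output symbols is zero, so the erased symbol
        # equals the XOR of the N - 1 symbols that did arrive.
        present = [s for s in received if s is not None]
        symbols[missing[0]] = reduce(xor_bytes, present)
    return symbols[:-1]  # drop the parity symbol


if __name__ == "__main__":
    sent = encode([b"ab", b"cd", b"ef"])
    sent[1] = None  # simulate one erasure on the channel
    print(decode(sent))
```

Because the N output symbols are fixed once the content is known, they can be computed a single time and cycled through repeatedly, as in the carousel approach mentioned above.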
One problem with some FEC codes is that they require excessive computing power or memory to operate. Another problem is that the number of output symbols must often be determined in advance of the coding process. This can lead to inefficiencies if the error rate of the symbols is overestimated and can lead to failure if the error rate is underestimated. As a result, traditional FEC schemes often require a mechanism to estimate the reliability of the communications channel on which they operate. For example, in a wireless transmission system, the sender and the receiver might need to probe a communications channel so as to obtain an estimate of the noise and hence of the reliability of the channel. In such a case, this probing has to be repeated quite often, since the actual noise is a moving target due to rapid and transient changes in the quality of the communications channel.
For traditional FEC codes, the number of possible output symbols that can be generated is of the same order of magnitude as the number of input symbols the content is partitioned into. Typically, but not exclusively, most or all of these output symbols are generated in a preprocessing step before the sending step. These output symbols have the property that all the input symbols can be regenerated from any subset of the output symbols that, in aggregate, carries the same amount of information as the original content.
As discussed above, one problem with many error-correcting codes is that they require excessive computing power or memory to operate. One coding scheme recently developed for communications applications that is somewhat efficient in its use of computing power and memory is the LDPC coding scheme. LDPC codes are similar to Reed-Solomon codes in that input data is represented by K input symbols and is used to determine N output symbols, where N is fixed before the encoding process begins. Encoding with LDPC codes is generally much faster than encoding with Reed-Solomon codes, as the average number of arithmetic operations required to create the N LDPC output symbols is proportional to N (on the order of tens of assembly code operations times N) and the total number of arithmetic operations required to decode the entire data is also proportional to N.
LDPC codes have speed advantages over Reed-Solomon codes. However, both LDPC and Reed-Solomon codes have several disadvantages. First, the number of output symbols, N, must be determined in advance of the coding process. This leads to inefficiencies if the error rate of symbols is overestimated, and can lead to failure if the error rate is underestimated. This is because an LDPC decoder requires reception of a certain number of output symbols to decode and restore the original data; if the number of erased symbols is greater than what the code was provisioned for, the original data cannot be restored. This limitation is generally acceptable for many communications problems, so long as the rate of the code is selected properly, but this requires an advance guess at the rate at which symbols are lost or corrupted on the channel.
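The sensitivity to this advance guess can be made concrete with a short calculation. The sketch below assumes an idealized erasure code in which any K of the N output symbols suffice to decode, and assumes each output symbol is erased independently with the same probability; both are simplifying assumptions for illustration, and the function name and the example values of K and N are hypothetical.

```python
from math import comb


def decode_success_probability(k: int, n: int, erasure_rate: float) -> float:
    """Probability that at least k of n output symbols arrive.

    Assumptions for this sketch: symbols are erased independently with
    probability erasure_rate, and any k received symbols allow decoding.
    """
    keep = 1.0 - erasure_rate
    # Binomial tail: sum over the number r of symbols that arrive.
    return sum(comb(n, r) * keep**r * erasure_rate**(n - r)
               for r in range(k, n + 1))


if __name__ == "__main__":
    k, n = 100, 120  # code provisioned to tolerate up to 20 erasures
    for rate in (0.05, 0.15, 0.25):
        p = decode_success_probability(k, n, rate)
        print(f"erasure rate {rate:.2f}: decode succeeds with probability {p:.4f}")
```

When the true erasure rate is below what the code was provisioned for, the redundant symbols are largely wasted; when it is above, decoding mostly fails, and no additional distinct symbols are available because N was fixed in advance.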
Another disadvantage of LDPC codes is that they require the encoder and decoder to agree in some manner on a graph structure. LDPC codes require a pre-processing stage at the decoder where this graph is constructed, a process that may slow the decoding substantially. Furthermore, a graph is specific to a data size, so a new graph needs to be generated for each data size used. Moreover, the graphs needed by the LDPC codes are sometimes complicated to construct, and require different custom settings of parameters for different sized data to obtain the best performance. These graphs may be of significant size and may require a significant amount of memory for their storage at both the sender and the recipient.
In addition, LDPC codes generate exactly the same output symbol values with respect to a fixed graph and input data. These output symbols may comprise the K original input symbols and N-K redundant symbols. Furthermore, values of N greater than a small multiple of K, such as 3 or 4 times K, are not practical. Thus, it is very likely that a recipient obtaining output symbols generated from the same input data using the same graph from more than one sender will receive a large number of duplicate output symbols, which would not be information additive. That is because 1) the N output symbols are fixed ahead of time, 2) the same N output symbols are transmitted from each transmitter each time the symbols are sent, 3) the same N symbols are received by a receiver and 4) N cannot practically exceed some small multiple of K. In effect, if uncoordinated output symbols are received from a number of transmitters, the probability that some output symbol has already been received is of the order of 1/sqrt(N), where sqrt(N) denotes the square root of N. Where K is on the order of N and K output symbols are needed, as more output symbols are received it becomes less and less likely that the next received output symbol would be information additive, which would not be the case if the number of possible output symbols were much larger than the number of output symbols needed to be received to decode the data.
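The diminishing usefulness of each additional symbol can be sketched numerically. The model below is an assumption introduced for illustration, not the analysis given above: every received symbol is taken to be an independent, uniformly random choice among the N possible output symbols, as might loosely describe uncoordinated senders. The function names are hypothetical.

```python
def prob_next_is_new(n: int, received: int) -> float:
    """Probability that the next symbol differs from all symbols received
    so far, assuming independent uniform draws from n possible symbols."""
    return (1.0 - 1.0 / n) ** received


def expected_distinct(n: int, received: int) -> float:
    """Expected number of distinct symbols after the given number of
    receptions, under the same uniform-draw assumption."""
    return n * (1.0 - (1.0 - 1.0 / n) ** received)


if __name__ == "__main__":
    k = 1000
    for n in (2 * k, 4 * k, 1000 * k):
        print(f"N = {n}: after {k} receptions, "
              f"about {expected_distinct(n, k):.0f} distinct symbols; "
              f"next one is new with probability {prob_next_is_new(n, k):.3f}")
```

In this model, duplicates become negligible only when the number of possible output symbols is far larger than the number that must be received, which matches the contrast drawn at the end of the paragraph above.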
Even though the output symbols from different transmitters may be corrupted in different ways, the total amount of information they convey to the system is not the sum of their respective amounts of information. For example, suppose that the symbols are one bit long, and the same LDPC code bit is received by a receiver from two different sources (such as two satellites), and that both bits have a probability p of being corrupt. Further suppose that one of the bits is received as 0, while the other one is received as 1. Then, assuming the original bit was equally likely a priori to be 0 or 1, the two received bits together do not give any information about the original LDPC bit, since the state of that bit is 0 or 1 each with probability of 50%. Each individual bit, taken alone, gives some information about the original bit, but this information is not additive.
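The two-satellite example can be checked with a direct application of Bayes' rule. The sketch below assumes that the two copies are corrupted independently, each flipped with probability p strictly between 0 and 1, and that the original bit is equally likely a priori to be 0 or 1; the function name is hypothetical.

```python
from typing import Sequence


def posterior_one(observations: Sequence[int], p: float,
                  prior_one: float = 0.5) -> float:
    """Posterior probability that the transmitted bit was 1.

    Assumptions for this sketch: each observation is an independent copy
    of the bit, flipped with probability p (0 < p < 1).
    """
    weight_one = prior_one          # running weight for "bit was 1"
    weight_zero = 1.0 - prior_one   # running weight for "bit was 0"
    for y in observations:
        weight_one *= (1.0 - p) if y == 1 else p
        weight_zero *= p if y == 1 else (1.0 - p)
    return weight_one / (weight_one + weight_zero)


if __name__ == "__main__":
    p = 0.1
    print("one copy received as 1:      ", posterior_one([1], p))
    print("two copies received as 1, 1: ", posterior_one([1, 1], p))
    print("two copies received as 0, 1: ", posterior_one([0, 1], p))
```

With conflicting copies the two likelihood terms cancel, so the posterior returns to the prior of one half regardless of p, even though either copy alone would shift the posterior away from one half.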
Therefore, what is needed is a simple error-correcting code that does not require excessive computing power or memory at a sender or recipient to implement, and that can be used to efficiently distribute data in a system with one or more senders and/or one or more recipients without necessarily needing coordination between senders and recipients.