Transmission of data through impaired networks has been the subject of much investigation. On many networks of computers, such as the Internet, or any other packet-based network, data is transmitted by first subdividing it into packets and then routing the packets independently through the network to the destination. In such a network there is often an expectation of loss of packets. Packets might be lost due to errors on the physical layer of transmission, due to overflow at a router or other network point causing equipment to drop packets, etc. To ensure that data is received completely, mechanisms are often used to protect the data from such losses. In general, the unit of loss is the packet, in that a packet is either received properly or it is deemed lost and steps are taken to deal with the loss of the entire packet. Thus, if bits of a packet are received but the packet is not completely received correctly, the entire packet is deemed lost. Loss can be in the form of missing a packet entirely or can be in the form of determining that there are errors in the packet creating unreliable bits, i.e., erasures and errors.
Recently, two types of codes were suggested to protect the data when there is an expectation that data would be lost during the transmission: chain reaction codes and multi-stage chain reaction codes. For a given content with k symbols, these codes produce an effectively unlimited stream of output symbols such that recovery of the original k symbols is possible from reception of any set of distinct output symbols whose cumulative number is roughly equal to k. Unless otherwise indicated, it should be understood that references to a “chain reaction code” or “chain reaction codes” as used herein could apply to chain reaction codes, such as those described in Luby I and/or elsewhere, and could also apply to multi-stage chain reaction codes, such as those described in Shokrollahi I.
With chain reaction codes, the number of output symbols possible for a given set of k input symbols is said to be “effectively unlimited” because, in nearly all cases, the number of output symbols that actually get generated or are used for input symbol recovery is much less than the number of possible output symbols. For example, if input symbols code for 10,000 bits and the typical expected transmissions are files or streams up to 10 gigabits in size, an encoder should be designed to handle inputs of k=1,000,000 symbols. Such an encoder might be configured to be able to generate up to 2^32 (about 4 billion) output symbols without having to repeat. If that is not enough, the encoder can be configured to be able to generate more output symbols without having to repeat. Of course, since all physically realizable systems are finite, an encoder will eventually reach a state where it repeats, but that state can always be designed such that, for any expected transmission and error rate, the number of output symbols without repeating is effectively unlimited.
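The arithmetic in the example above can be checked with a short sketch. The variable names are illustrative only, and "10 gigabit" is assumed here to mean 10^10 bits.

```python
# Illustrative check of the figures quoted above (names are hypothetical).
BITS_PER_SYMBOL = 10_000          # each input symbol codes for 10,000 bits
MAX_CONTENT_BITS = 10 * 10**9     # assumption: 10 gigabit = 10^10 bits

# Number of input symbols the encoder should be designed to handle.
k = MAX_CONTENT_BITS // BITS_PER_SYMBOL

# Number of distinct output symbols available with a 32-bit symbol identifier.
possible_outputs = 2**32

# How many times larger the output symbol space is than the input size.
headroom = possible_outputs / k

print(k, possible_outputs, round(headroom))
```

Under these assumptions the space of possible output symbols is several thousand times larger than k, which is the sense in which the stream is "effectively unlimited".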
Herein, packets can carry one symbol or multiple symbols. While it is not required, the number of bits coded for in an input symbol and the number of bits coded for in an output symbol can be the same.
In some embodiments, these codes encode the data by performing XOR's on the input symbols and they decode by performing XOR's on the received symbols, but other operations might be used as well or instead. XOR is a useful operation, as it is quick and reversible. Other operations might also provide these advantages.
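A minimal sketch of why XOR is reversible follows. It is not the encoding rule of any particular chain reaction code: the choice of which input symbols to combine, and the helper name `xor_bytes`, are assumptions made only for illustration.

```python
def xor_bytes(*blocks):
    """Return the bytewise XOR of equal-length byte strings."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all symbols must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Three hypothetical 4-byte input symbols.
a = bytes([0x01, 0x02, 0x03, 0x04])
b = bytes([0x10, 0x20, 0x30, 0x40])
c = bytes([0xAA, 0xBB, 0xCC, 0xDD])

# An output symbol formed as the XOR of the three input symbols.
output_symbol = xor_bytes(a, b, c)

# If b is lost but a, c and the output symbol arrive, XOR'ing them recovers b.
recovered_b = xor_bytes(output_symbol, a, c)
```

Because XOR'ing a value with itself yields zero, applying the same operation to the output symbol and the known input symbols cancels them and leaves the missing one.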
These codes solve the problem of distributing data from one or more senders to one or more receivers on an impaired network in which the loss rate is unknown to the sender or to the receiver. One reason for this is that, with the large number of output symbols possible relative to the number of input symbols, a receiver would, with overwhelming odds, not duplicate the packets sent by another receiver even without any coordination among the receivers. This property is referred to as the receivers being “information additive”.
In some cases, it may not be necessary or desirable to produce an effectively unlimited number of output symbols from the given content. For example, where a receiver is time constrained, it may not have the luxury of waiting for additional symbols to arrive after a given time interval. Such is the case, for example, when a live movie is sent to one or multiple receivers. Due to the nature of the live transmission, it may be impossible to always wait for enough encoding data to arrive at the receiver, because the receiver's feed has to be synchronized with that of the sender and cannot be interrupted indefinitely. In such cases, where there is expectation of loss, the sender may add a fixed additional amount of redundant symbols to the content, and transmit the content together with the redundant symbols. If the amount of loss during the transmission of the content is no larger than the number of redundant symbols, then there is an expectation of recovery of the lost data at the receiver.
This problem can also be solved with chain reaction codes. In such cases, the encoder only generates a fixed amount of encoded data, rather than an effectively unlimited stream. However, in some cases a different solution may be preferable. For example, due to the probabilistic nature of the decoding processes for chain reaction codes, these processes may incur some additional overhead for very small content sizes.
Reed-Solomon codes (“RS codes”) are a class of codes that have been used to deal with transmission or storage of data that is subject to erasures between a coder output and a decoder input. Throughout this disclosure, it should be understood that coding is not limited to transmission, but extends to any representation of original data at an encoder that is separated in time, place, etc., from a decoder by a channel that might exhibit erasures and/or errors as the encoded data passes through the channel. RS codes have been extensively studied by a large number of researchers for many conditions, data and channels, and they are known to have certain properties.
One such property is what is described as an “optimality condition”. RS codes do not operate on binary fields but rather operate on larger Galois Fields. One of the basic properties of RS codes is that they satisfy an optimality condition such that when k symbols are encoded with an RS code, yielding n>k symbols for storage or transmission, the original k symbols can be recovered with certainty from any possible combination of k distinct received symbols of the encoded n symbols. Since the original k symbols cannot be recovered from fewer than k distinct received symbols, the number of received symbols is thus considered “optimal”.
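The optimality condition can be illustrated with a toy code built on polynomial evaluation. This is a sketch under stated assumptions, not the construction used in practice: for brevity it works over the prime field of integers modulo 257 rather than a Galois Field GF(2^A), it is non-systematic, and the names `rs_encode` and `rs_decode` are hypothetical. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
from itertools import combinations

P = 257  # assumption: a small prime field stands in for GF(2^A)

def rs_encode(data, n):
    """Treat the k data symbols as polynomial coefficients and
    evaluate the polynomial at x = 1..n, giving n (x, y) pairs."""
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(data)) % P)
            for x in range(1, n + 1)]

def rs_decode(points, k):
    """Recover the k coefficients from any k distinct (x, y) pairs
    by Lagrange interpolation modulo P."""
    if len(points) != k:
        raise ValueError("exactly k points are required")
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(points):
        basis = [1]   # coefficients of prod_{m != j} (x - xm), lowest degree first
        denom = 1     # prod_{m != j} (xj - xm)
        for m, (xm, _) in enumerate(points):
            if m == j:
                continue
            nxt = [0] * (len(basis) + 1)
            for i, b in enumerate(basis):
                nxt[i] = (nxt[i] - xm * b) % P
                nxt[i + 1] = (nxt[i + 1] + b) % P
            basis = nxt
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % P
    return coeffs

data = [5, 0, 200, 17]             # k = 4 hypothetical symbols, each below P
encoded = rs_encode(data, 7)       # n = 7 encoded symbols

# Every choice of k of the n encoded symbols recovers the original data.
all_ok = all(rs_decode(list(subset), 4) == data
             for subset in combinations(encoded, 4))
```

A polynomial of degree less than k is determined by any k of its values, which is why every subset of size k succeeds and no smaller subset can.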
This optimality comes at a price, in that the number of operations required for encoding is large and grows larger with longer codes (i.e., with larger Galois Fields). With RS codes, a maximal block length, n, is determined ahead of time, where the block length is the number of output symbols generated from the original k input symbols. Note that if more than n-k output symbols are lost, the original k input symbols cannot be recovered. The block length, n, cannot be arbitrarily lengthened to deal with any expected condition, as computation becomes more difficult for larger block lengths and is impractical for very large block lengths.
It can be shown that, for a Reed-Solomon code defined over the Galois Field GF(2^A) with block length n and dimension k, the number of XOR's of symbols to produce an output symbol is, on average, equal to k*(n−k)*A/(2*n). Using such a Reed-Solomon code, k input symbols are used to produce in total n output symbols, where typically the k input symbols are included among the n output symbols and n is greater than k. In contrast, when using a chain reaction code, the average number of XOR's of symbols to produce an output symbol is equal to a constant independent of k or the number of produced output symbols. Similar results also hold for the decoder.
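To give a feel for how this cost grows, the formula can be evaluated for a few parameter choices. The function name and the parameter values below are chosen only for illustration.

```python
def rs_avg_xors_per_output(k, n, A):
    """Average number of symbol XOR's per output symbol for a
    Reed-Solomon code over GF(2^A), per the formula k*(n-k)*A/(2*n)."""
    return k * (n - k) * A / (2 * n)

# Illustrative parameters: rate 2/3 codes over GF(2^8).
small = rs_avg_xors_per_output(k=32, n=48, A=8)     # 32*16*8/96
large = rs_avg_xors_per_output(k=128, n=192, A=8)   # 128*64*8/384

print(f"k=32,  n=48:  {small:.2f} XOR's per output symbol")
print(f"k=128, n=192: {large:.2f} XOR's per output symbol")
```

At a fixed code rate the cost per output symbol grows linearly with k, whereas the text states that the corresponding cost for a chain reaction code stays constant.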
The length n of the Reed-Solomon code cannot exceed 2^A+1. This latter condition, together with the fact that A is often chosen to be a power of two, may slow down the encoding and the decoding process considerably at times. For example, suppose that the original content is 32 KB in size (where 1 KB=1024 bytes), each packet encodes for 1 KB of input data, and a total of 48 packets are to be sent. In this example, the content might be partitioned into thirty-two 1 KB chunks (each allocated to one packet to be sent), and then each chunk might be further subdivided into X input symbols. The Reed-Solomon coding process can then be applied in parallel X times, each time operating on one input symbol from each chunk (such as operating on all of the first input symbols of each chunk, then the second input symbol of each chunk, etc.), meaning that each operation takes into account thirty-two input symbols. Suppose this produces sixteen additional output symbols for each of the X positions, and each group of X output symbols is placed together to produce 16 additional packets that are to be sent, each of length 1 KB. In this example, the smallest acceptable A that is a power of 2 would be A=8, because for A=4 we would have 2^A+1=17, which is less than 48. The Reed-Solomon code in this case operates in the field GF(256), and thus each symbol is one byte long and X=1024. As shown by this example, while these codes might satisfy the optimality condition, they require considerable computation and have constraints on the length of codes possible.
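The parameter selection in the example above can be sketched as follows. The helper names are hypothetical, and the search assumes, as the text does, that A is restricted to powers of two.

```python
def smallest_power_of_two_A(n):
    """Smallest A that is a power of two and satisfies n <= 2^A + 1."""
    A = 1
    while (1 << A) + 1 < n:
        A *= 2
    return A

def symbols_per_chunk(chunk_bytes, A):
    """Number of A-bit symbols (X in the text) in a chunk of the given size."""
    return chunk_bytes * 8 // A

n = 48                                  # total packets to be sent
A = smallest_power_of_two_A(n)          # A=4 allows only 17, so A=8 is needed
X = symbols_per_chunk(1024, A)          # 1 KB chunk split into one-byte symbols

print(A, 2**A, X)
```

With these values the code operates in GF(2^8) = GF(256) and each 1 KB chunk holds 1024 one-byte symbols, matching the figures in the example.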
A few concepts of coding bear introduction. Transmission granularity refers to the size of the objects transmitted and received as a unit. For example, packet networks send and receive data in packets. Even if only some of the bits of a packet are erased or corrupted, the whole packet is discarded and mechanisms (forward error correction, request for resend, etc.) are activated to recover the packet as a whole. Thus, such objects are either received error-free or are erased in their entirety. In some applications, the object size could be the size of the transmission packets or could be smaller. Where there is an expectation of correlation of loss between transmission packets, the transmission granularity can be larger than the packet size. In other applications, the transmission granularity could be smaller than the packet size.
Computational granularity refers to the size of the objects operated upon in encoders and/or decoders. Thus, if the basic operation of an encoder is XOR'ing 128-byte units, then that is the computational granularity. A symbol (which might be a packet, for example) comprising 1024 bytes sub-divided into 128-byte subsymbols would be a symbol divided into eight subsymbols (if all of the subsymbols are of the same size, which might not be required, but is simpler) and XOR's are performed on these subsymbols. The computational granularity is thus 128-bytes.
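A minimal sketch of this arrangement follows, assuming equal-size subsymbols. The helper names are hypothetical, and the XOR of each 128-byte unit is done through Python integers purely for convenience.

```python
SUBSYMBOL_BYTES = 128  # the computational granularity in the example

def split_symbol(symbol, unit=SUBSYMBOL_BYTES):
    """Split a symbol into equal-size subsymbols of `unit` bytes."""
    if len(symbol) % unit:
        raise ValueError("symbol length must be a multiple of the unit size")
    return [symbol[i:i + unit] for i in range(0, len(symbol), unit)]

def xor_unit(a, b):
    """XOR two equal-length subsymbols as single units."""
    value = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return value.to_bytes(len(a), "big")

def xor_symbols(s1, s2, unit=SUBSYMBOL_BYTES):
    """XOR two symbols one subsymbol at a time."""
    pairs = zip(split_symbol(s1, unit), split_symbol(s2, unit))
    return b"".join(xor_unit(a, b) for a, b in pairs)

# Two hypothetical 1024-byte symbols.
s1 = bytes(range(256)) * 4
s2 = bytes([0xAA]) * 1024

subsymbols = split_symbol(s1)        # eight 128-byte subsymbols
combined = xor_symbols(s1, s2)       # eight unit-level XOR operations
```

Here the transmission unit is the full 1024-byte symbol, while each basic operation touches one 128-byte subsymbol.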
One of the reasons for the optimality of Reed-Solomon codes is in a relation between their transmission granularity and their computational granularity. An example will illustrate this point.
Consider a Reed-Solomon code over the field GF(256) that is used to encode a given file and transmit the encoded information through a channel in packets of size 1024 bytes each. The computational granularity in this case could be equal to 128 bytes (1024 bytes divided by 8), whereas the transmission granularity equals 1024 bytes. In this case, basic operations such as the XOR of sequences of bits are performed on 128 byte units as a whole.
Typically, efficiency of encoding and decoding varies with computational granularity. Efficiency can be measured in many ways, but one way of measuring it is by the average number of operations to encode or decode units of data. Often, encoding and decoding are less efficient for finer computational granularity and more efficient for coarser computational granularity. However, codes with finer computational granularity can provide better reception overhead, i.e., the excess of the number of symbols that need to be received to ensure correct decoding over the number of symbols representing the data provided to the encoder can be kept very small. As a result, there is a trade-off between coding efficiency and transmission overhead for a given code.
Reed-Solomon codes are at one end of this coding trade-off, as their computational granularity is small enough that optimal recovery of data in the face of erasures is guaranteed (upon receipt of as much data as was encoded). At the other end, codes defined over the binary alphabet (such as those used for transmission over packet networks) have a computational granularity as large as the transmission granularity, but might be inefficient in the reception overhead required to ensure complete decoding.
As mentioned above, Reed-Solomon codes require that a maximal error rate be determined in advance, i.e., if k symbols are encoded into n RS-symbols, an error rate of greater than (n−k)/n would cause a decoder to fail to recover the transmitted data. Thus, in a transmission system that is measured by the final probability of unsuccessful recovery of the transmitted data, Reed-Solomon codes exhibit a positive failure probability despite their optimality. This is because there is a positive probability that the amount of data received by the receiver is genuinely smaller than the transmitted data. As a result, in the end, a coding system might have less efficient coding and still have a failure probability that needs to be lowered.
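The positive failure probability can be made concrete with a short calculation. This sketch assumes, purely for illustration, that each of the n symbols is lost independently with the same probability p; real channels often show correlated loss, and the function name is hypothetical.

```python
from math import comb

def rs_failure_probability(n, k, p):
    """Probability that more than n-k of n symbols are lost, assuming
    independent losses with probability p (an illustrative model)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j)
               for j in range(n - k + 1, n + 1))

# Illustrative parameters: k=32 symbols encoded into n=48.
for p in (0.1, 0.2, 0.3):
    print(f"loss rate {p:.1f}: failure probability "
          f"{rs_failure_probability(48, 32, p):.3e}")
```

Under this model the failure probability is never exactly zero for any positive loss rate, and it rises sharply as the loss rate approaches (n−k)/n.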
What is therefore needed is a coding system and methods for encoding and decoding data sent through a channel wherein computational effort and overhead efficiency can be traded off as needed for particular applications, available processing power and data sets.