Transmission of files and streams between a sender and a recipient over a communications channel has been the subject of much literature. Preferably, a recipient desires to receive an exact copy of data transmitted over a channel by a sender with some level of certainty. Where the channel does not have perfect fidelity (which covers almost all physically realizable systems), one concern is how to deal with data lost or garbled in transmission. Lost data (erasures) are often easier to deal with than corrupted data (errors) because the recipient cannot always tell when received data has been corrupted. Many error-correcting codes have been developed to correct for erasures and/or for errors. Typically, the particular code used is chosen based on some information about the infidelities of the channel through which the data is being transmitted and the nature of the data being transmitted. For example, where the channel is known to have long periods of infidelity, a burst error code might be best suited for that application. Where only short, infrequent errors are expected, a simple parity code might be best.
Data transmission is straightforward when a transmitter and a receiver have all of the computing power and electrical power needed for communications and the channel between the transmitter and receiver is clean enough to allow for relatively error-free communications. The problem of data transmission becomes more difficult when the channel is in an adverse environment or the transmitter and/or receiver has limited capability.
One solution is the use of forward error correcting (FEC) techniques, wherein data is coded at the transmitter such that a receiver can recover from transmission erasures and errors. Where feasible, a reverse channel from the receiver to the transmitter allows the receiver to communicate about errors to the transmitter, which can then adjust its transmission process accordingly. Often, however, a reverse channel is not available or feasible. For example, where the transmitter is transmitting to a large number of receivers, the transmitter might not be able to handle reverse channels from all those receivers. As a result, communication protocols often need to be designed without a reverse channel and, as such, the transmitter may have to deal with widely varying channel conditions without a full view of those channel conditions.
The problem of data transmission between transmitters and receivers is made more difficult when the receivers need to be low-power, small devices that might be portable or mobile and need to receive data at high bandwidths. For example, a wireless network might be set up to deliver files or streams from a stationary transmitter to a large or indeterminate number of portable or mobile receivers either as a broadcast or as multicast where the receivers are constrained in their computing power, memory size, available electrical power, antenna size, device size, and other design constraints.
In such a system, considerations to be addressed include having little or no reverse channel, limited memory, limited computing cycles, mobility, and timing. Preferably, the design should minimize the amount of transmission time needed to deliver data to a potentially large population of receivers, where individual receivers might be turned on and off at unpredictable times, move in and out of range, or incur losses due to link errors, cell changes, or congestion in cells that forces lower priority file or stream packets to be temporarily dropped.
In the case of a packet protocol used for data transport, a file, stream, or other data to be transmitted over a packet network is partitioned into blocks larger than the payload of a packet. The blocks can be divided into equal size input symbols to be encoded. The “size” of an input symbol can be measured in bits, whether or not the input symbol is actually broken into a bit stream, where an input symbol has a size of M bits when the input symbol is selected from an alphabet of 2^M symbols. In such a packet-based communication system, a packet oriented coding scheme might be suitable. A file transmission is called reliable if it allows the intended recipient to recover an exact copy of the original file even in the face of erasures in the network. A stream transmission is called reliable if it allows the intended recipient to recover an exact copy of each part of the stream in a timely manner even in the face of erasures in the network. Both file transmission and stream transmission can also be somewhat reliable, in the sense that some parts of the file or stream are not recoverable or, for streaming, some parts of the stream are not recoverable in a timely fashion. Packet loss often occurs because sporadic congestion causes the buffering mechanism in a router to reach its capacity, forcing it to drop incoming packets. Protection against erasures during transport has been the subject of much study.
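As a minimal sketch of the partitioning described above, the following Python divides one block into equal size input symbols. The function name, the byte-oriented symbol size, and the zero padding of the final symbol are assumptions made for illustration; the text does not specify how a block whose length is not a multiple of the symbol size is handled.

```python
from typing import List


def split_into_symbols(block: bytes, symbol_size: int) -> List[bytes]:
    """Divide a block into equal size input symbols of symbol_size bytes.

    Assumption: the last symbol is padded with zero bytes when the block
    length is not a multiple of symbol_size.
    """
    if symbol_size <= 0:
        raise ValueError("symbol_size must be positive")
    pad = (-len(block)) % symbol_size
    padded = block + b"\x00" * pad
    return [padded[i:i + symbol_size] for i in range(0, len(padded), symbol_size)]


def alphabet_size(symbol_size: int) -> int:
    """Number of distinct symbols: 2^M, where M = 8 * symbol_size bits."""
    return 2 ** (8 * symbol_size)


symbols = split_into_symbols(b"abcdefg", 3)
# symbols == [b"abc", b"def", b"g\x00\x00"]
```

With a symbol size of 3 bytes, M is 24 bits and each symbol is drawn from an alphabet of 2^24 possible values.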
It is known to use chain reaction codes to allow for generation of an arbitrary number of encoded symbols (also called output symbols) from the input symbols of a file or stream. This has many uses, including the generation of encoded symbols in an information additive way, as opposed to an information duplicative way, wherein the latter is where a receiver receives additional data that duplicates data the receiver already knows. Novel techniques for generating, using and operating chain reaction codes are shown, for example, in U.S. Pat. No. 6,307,487 entitled “Information Additive Code Generator and Decoder for Communication Systems” issued to Luby (“Luby II”), U.S. Pat. No. 6,320,520 entitled “Information Additive Group Code Generator and Decoder for Communication Systems” issued to Luby et al. (hereinafter “Luby III”), and U.S. Pat. No. 7,068,729 entitled “Multi-Stage Code Generator and Decoder for Communication Systems” issued to Shokrollahi et al. (hereinafter “Shokrollahi”). The entire disclosures of these patents are herein incorporated by reference for all purposes.
One property of the encoded symbols (also called output symbols) produced by a chain reaction encoder is that a receiver is able to recover the original file or block of the original stream as soon as enough encoded symbols have been received. Specifically, to recover the original K input symbols with a high probability, the receiver needs approximately K+O encoded symbols. The ratio O/K is called the “relative reception overhead.” The relative reception overhead depends on the number K of input symbols, and on the reliability of the decoder. For example, in one specific embodiment, and where K is equal to 60,000, a relative reception overhead of 5% ensures that the decoder successfully decodes the input file or block of the stream with a probability of at least 1−10^−8; and where K is equal to 10,000, a relative reception overhead of 15% ensures the same success probability of the decoder. In one embodiment, the relative reception overhead of chain reaction codes can be computed as (13*sqrt(K)+200)/K, where sqrt(K) is the square root of the number of input symbols K. In this embodiment, the relative reception overhead of chain reaction codes tends to be larger for small values of K.
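The overhead formula of that embodiment can be evaluated directly. The short Python sketch below does so; the function names are hypothetical, and rounding the overhead O up to a whole number of symbols is an assumption made for illustration.

```python
import math


def relative_reception_overhead(k: int) -> float:
    """Relative reception overhead O/K = (13*sqrt(K) + 200) / K."""
    if k <= 0:
        raise ValueError("k must be a positive number of input symbols")
    return (13 * math.sqrt(k) + 200) / k


def symbols_to_collect(k: int) -> int:
    """Approximate K + O, with O rounded up to a whole symbol (assumption)."""
    return k + math.ceil(13 * math.sqrt(k) + 200)


print(relative_reception_overhead(10_000))  # 0.15, i.e. 15%
print(relative_reception_overhead(60_000))  # about 0.056
print(symbols_to_collect(10_000))           # 11500
```

For K equal to 10,000 the formula reproduces the 15% figure given above, and the overhead shrinks as K grows, which is the trend the paragraph describes.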
As another example, the RaptorQ code described in IETF RFC 6330 has the property that if the reception overhead is 0, 1 or 2 symbols, then the decode failure probability of the code is 0.01, 0.0001, and 0.000001, respectively.
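To illustrate how these figures might be used, the sketch below picks the smallest reception overhead that meets a target decode failure probability. The table holds only the three values stated above; the function name is hypothetical, and no value is extrapolated beyond an overhead of 2 symbols.

```python
from typing import Optional

# Decode failure probability by reception overhead (in symbols), as stated
# in the text for the RaptorQ code. Only these three points are given.
FAILURE_PROBABILITY = {0: 1e-2, 1: 1e-4, 2: 1e-6}


def min_overhead_for(target_failure_probability: float) -> Optional[int]:
    """Smallest listed overhead whose failure probability meets the target.

    Returns None when the target is stricter than any listed value, rather
    than guessing at overheads the text does not cover.
    """
    for overhead in sorted(FAILURE_PROBABILITY):
        if FAILURE_PROBABILITY[overhead] <= target_failure_probability:
            return overhead
    return None


print(min_overhead_for(1e-3))  # 1
```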
Luby II, Luby III, and Shokrollahi provide teachings of systems and methods that can be employed in certain embodiments described in this disclosure. It is to be understood, however, that the Luby II, Luby III, and Shokrollahi systems and methods are not required of the apparatus, systems, and methods described in this disclosure, and many other variations, modifications, or alternatives can also be used.
It is also known to use multi-stage chain reaction (“MSCR”) codes, such as those described in Shokrollahi and developed by Digital Fountain, Inc. under the trade name “Raptor” codes. Multi-stage chain reaction codes are used, for example, in an encoder that receives input symbols from a source file or source stream, generates intermediate symbols therefrom and encodes the intermediate symbols using chain reaction codes. More particularly, a plurality of redundant symbols is generated from an ordered set of input symbols to be transmitted. A plurality of encoded symbols are generated from a combined set of symbols including the input symbols and the redundant symbols, wherein the number of possible encoded symbols is much larger than the number of symbols in the combined set of symbols, wherein at least one encoded symbol is generated from more than one symbol in the combined set of symbols and from less than all of the symbols in the combined set of symbols, and such that the ordered set of input symbols can be regenerated to a desired degree of accuracy from any predetermined number of the encoded symbols.
For some applications, other variations of codes might be more suitable or otherwise preferred. For example, Reed-Solomon codes might be suitable for some applications, as they provide optimal recovery of a source block from received encoded symbols without any data reception overhead, i.e., reception of any k encoded symbols for a source block with k source symbols allows recovery when a Reed-Solomon code is used, and thus the symbol overhead is zero. FEC codes, such as Reed-Solomon codes, that have zero symbol reception overhead are called “Maximum Distance Separable” (MDS) codes. However, MDS codes such as Reed-Solomon codes can become computationally intensive when used with larger values of k, e.g., values of k that are a few tens or larger, and thus may not be suitable for usage in applications that require larger values of k. On the other hand, MSCR codes such as RaptorQ codes can be used with larger values of k, e.g., values of k that range up to many tens of thousands. However, these are not MDS codes, and applications using non-MDS FEC codes can experience larger than desired data reception overhead if the non-MDS FEC codes are used in a straightforward way.
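The MDS property, recovery from any k encoded symbols, can be shown with a toy polynomial code in the spirit of Reed-Solomon. This is a sketch under stated assumptions, not a production Reed-Solomon implementation: it works over the prime field of size 257 rather than the binary extension fields usually used in practice, each symbol is a single integer in the range 0 to 256, and all names are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

P = 257  # prime field size; an assumption chosen for illustration

Point = Tuple[int, int]  # (position x, symbol value y)


def lagrange_eval(points: List[Point], x: int) -> int:
    """Value at x of the unique polynomial of degree < len(points) through points, mod P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % P
                den = den * (xj - xm) % P
        # pow(den, P - 2, P) is the modular inverse of den (Fermat's little theorem).
        total = (total + yj * num * pow(den, P - 2, P)) % P
    return total


def encode(source: List[int], n: int) -> List[Point]:
    """Produce n <= P encoded symbols; the first k equal the source symbols."""
    points = list(enumerate(source))
    return [(x, lagrange_eval(points, x)) for x in range(n)]


def decode(received: List[Point], k: int) -> List[int]:
    """Recover the k source symbols from any k distinct encoded symbols."""
    return [lagrange_eval(received[:k], x) for x in range(k)]


source = [10, 20, 30]
encoded = encode(source, 6)
# Every choice of k = 3 encoded symbols recovers the source: zero reception overhead.
ok = all(decode(list(subset), 3) == source for subset in combinations(encoded, 3))
print(ok)  # True
```

Each decoded symbol here costs on the order of k^2 field operations, which hints at why such codes become computationally intensive as k grows.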