Information spreading plays a central role in human society. In the information age, how to efficiently and reliably spread information with low delay may be critical for numerous activities involving humans and machines, e.g., the spread of tweets in Twitter, the dissemination of data collected by wireless sensor networks, and delivery of Internet TV. With information spreading over lossy communication channels (also known as erasure channels), a problem exists wherein packets may be lost/discarded due to bit errors or buffer overflow. For wired networks over optical fiber channels, the bit error rate can be as low as 10^-12, and so packet loss is mainly due to overflow of the buffers at routers rather than bit errors. For wireless channels, packet loss is mainly due to uncorrectable bit errors, which may be caused by fading, shadowing, interference, path loss, noise, etc. [17]. At the transmitter, most physical layer schemes encode messages by both an error-detecting code such as cyclic redundancy check (CRC) and an error-correction code. At the receiver, a received packet is first decoded by the error-correction decoder. If the resulting packet has uncorrectable bit errors, it will not pass the check of the error-detecting module. Most physical layer designs will drop those packets that have uncorrectable bit errors.
To address the packet loss problem, three approaches can be used [12, 22]. The first approach is retransmission or Automatic Repeat reQuest (ARQ). The Transmission Control Protocol (TCP) uses ARQ for packet loss recovery. An advantage of ARQ is that it adapts to time-varying channel conditions (e.g., network congestion status or signal-to-interference-plus-noise ratio (SINR)) in the following manner: when the channel condition is very good (e.g., no congestion or very high SINR) and induces no packet loss, no retransmission is needed; the poorer the channel condition, the more packets are lost, and the more packets are retransmitted. Hence, ARQ is suitable for the case in which the transmitter has little knowledge of the channel condition. A disadvantage of ARQ is that it is not suitable for reliable multicast communication due to the feedback implosion problem.
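The way ARQ adapts to the channel can be illustrated with a minimal sketch. It assumes an idealized erasure channel with independent losses and instantaneous, loss-free feedback; the function name `arq_transmissions` and the loss probabilities are illustrative assumptions, not taken from any cited protocol.

```python
import random

def arq_transmissions(num_packets, loss_prob, rng):
    """Count transmissions needed to deliver every packet when each lost
    packet is retransmitted until it gets through.

    Assumes independent losses, perfect feedback, and loss_prob < 1.
    """
    total = 0
    for _ in range(num_packets):
        delivered = False
        while not delivered:
            total += 1
            # The packet survives the channel with probability 1 - loss_prob.
            delivered = rng.random() >= loss_prob
    return total

rng = random.Random(0)  # fixed seed so runs are repeatable
for p in (0.0, 0.1, 0.5):
    sent = arq_transmissions(1000, p, rng)
    print(f"loss={p:.1f}: {sent} transmissions for 1000 packets")
```

With no loss, the count equals the number of packets; as the loss probability grows, the number of retransmissions grows with it, without the transmitter knowing the loss probability in advance.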
The second approach is Forward Error Correction (FEC). FEC is achieved by channel coding, which includes Reed-Solomon codes, convolutional codes, turbo codes, Low-Density Parity-Check (LDPC) codes, and fountain codes (or rateless codes). FEC is suitable for the case in which the transmitter has good knowledge of the channel condition; in this case, the transmitter chooses a channel code with an appropriate code rate (which depends on network congestion status or the SINR) so that the receiver can recover all the native packets without retransmission. If the transmitter has little knowledge of the channel condition, it does not know how much redundancy (parity bits) should be added to the coded packets: adding too much redundancy wastes communication resources, while adding too little redundancy makes lost packets unrecoverable by the receiver. An advantage of FEC is that it is well suited for reliable multicast applications, since FEC does not need feedback and hence does not suffer from the feedback implosion problem as ARQ does.
The third approach is hybrid ARQ, which combines certain features of ARQ and FEC. Under hybrid ARQ, the transmitter does not need to have knowledge of the channel condition. For non-erasure error-prone channels, there are two types of hybrid ARQ: Type I and Type II. Under Type I hybrid ARQ, the transmitter transmits a coded packet with error correction and error detection capability; if a packet has uncorrectable errors, the receiver sends a retransmission request to the transmitter and the transmitter transmits another coded packet.
Under Type II hybrid ARQ, the transmitter first transmits a coded packet with error detection capability only. If the packet has errors, the receiver sends a retransmission request and the transmitter transmits a new coded packet. Combining the previously coded packet and the newly coded packet forms a longer codeword, which has better error correction capability than a single packet. The receiver jointly decodes the previously received packet and newly received packet. If the errors are still uncorrectable, the receiver sends another retransmission request, and this process continues until the errors are corrected.
However, Type I and Type II hybrid ARQ are not suitable for erasure channels since lost packets cannot be treated as correctable errors in received packets. For erasure channels, the transmitter can use a fountain code and keep transmitting coded packets until the receiver is able to decode all the native packets of a file or a batch (here, a batch is a group of native packets, specified by a user or an application). The transmitter does not need knowledge of the channel condition, and it is able to adapt to the channel condition. Since a fountain code has no fixed code rate, it is called a rateless code and is well suited for time-varying wired/wireless channel conditions and multicast applications with heterogeneous receivers. Network coded TCP (CTCP) [10] can be regarded as a hybrid ARQ approach for erasure channels.
Related Work
Erasure Codes
Packets may be dropped due to 1) congestion at a router or 2) uncorrectable bit errors, which may be caused by fading, shadowing, interference, path loss, or noise in a wireless channel. The packet loss rate in some real-world wireless networks can be as high as 20-50% [1].
Erasure codes can be used to recover native packets without feedback and retransmission. Under linear erasure coding, the coded packets are generated by linearly combining the native packets with coefficients from a finite field (Galois field) F_q, where q = 2^i (i ∈ ℕ) and ℕ is the set of natural numbers (i.e., positive integers). Erasure codes can be nonlinear [16] (for example, triangular codes [15]). A nonlinear erasure code can reduce the computational complexity by using only binary addition and shift operations instead of more complicated finite field multiplications as in a linear erasure code. Hence, nonlinear erasure codes may be particularly suited for mobile phone applications, which require low computational complexity and low power consumption.
Under an erasure code, a receiver can recover K native packets from n coded packets (received by the receiver), where n = (1+ε)K and ε can be very small, e.g., ε can be as small as 10^-6 for RQ codes [19]. To recover the K native packets, it does not matter which packets the receiver has received; as long as it has received any K linearly independent packets, the receiver is able to decode the K native packets. FIG. 2 illustrates an exemplary erasure channel and erasure channel coding.
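The property that any K linearly independent coded packets suffice can be sketched with a random linear erasure code over GF(2), i.e., the case q = 2^1. This is a minimal illustration under stated assumptions, not the RQ code of [19]: packets are modeled as Python integers, each coded packet carries its coefficient vector as a bitmask, and the names `encode`, `decode`, and `send_until_decoded` are hypothetical.

```python
import random

def encode(native, rng):
    """Return (vector, payload): a random nonzero GF(2) combination.
    Bit i of `vector` is the coefficient of native[i]."""
    k = len(native)
    vector = rng.randrange(1, 1 << k)
    payload = 0
    for i in range(k):
        if (vector >> i) & 1:
            payload ^= native[i]  # addition in GF(2) is XOR
    return vector, payload

def decode(coded, k):
    """Gauss-Jordan elimination over GF(2).
    Returns the k native packets, or None if the rank is below k."""
    basis = {}  # pivot column -> (vector, payload), kept fully reduced
    for vector, payload in coded:
        # Clear every existing pivot column from the incoming row.
        for col, (bv, bp) in basis.items():
            if (vector >> col) & 1:
                vector ^= bv
                payload ^= bp
        if vector == 0:
            continue  # linearly dependent on what was already received
        col = vector.bit_length() - 1  # new pivot column
        # Clear the new pivot column from the existing rows.
        for c, (bv, bp) in list(basis.items()):
            if (bv >> col) & 1:
                basis[c] = (bv ^ vector, bp ^ payload)
        basis[col] = (vector, payload)
    if len(basis) < k:
        return None
    return [basis[i][1] for i in range(k)]

def send_until_decoded(native, loss_prob, rng, max_sent=10_000):
    """Keep sending coded packets over an erasure channel until decoding
    succeeds. Returns (recovered, packets_sent, packets_received)."""
    received = []
    for sent in range(1, max_sent + 1):
        packet = encode(native, rng)
        if rng.random() < loss_prob:
            continue  # packet erased by the channel
        received.append(packet)
        recovered = decode(received, len(native))
        if recovered is not None:
            return recovered, sent, len(received)
    return None, max_sent, len(received)

rng = random.Random(7)
native = [rng.getrandbits(32) for _ in range(8)]
recovered, sent, got = send_until_decoded(native, 0.3, rng)
print(recovered == native, sent, got)
```

The decoder never looks at which particular packets were erased; it only needs the received coefficient vectors to reach rank K. A binary field is chosen here for brevity; a larger field would make random vectors linearly dependent less often.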
Erasure codes include Reed-Solomon codes, LDPC codes, and fountain codes [16]. According to [19], an erasure code can be classified as a fountain code if it has the following properties:
(Ratelessness) The number of coded packets that can be generated from a given set of native packets should be sufficiently large. The code is called a fountain code because the encoder generates an essentially unlimited supply of codewords, in analogy to a water fountain, which produces essentially unlimited drops of water [16].
(Efficiency and flexibility) Irrespective of which packets the receiver has received, the receiver should be able to decode K native packets using any K linearly independent received coded packets. Just as an arbitrary collection of water drops will fill a glass of water and quench thirst, irrespective of which water drops were collected, a collection of any K linearly independent fountain-coded packets will be sufficient for the receiver to decode K native packets [16].
(Linear complexity) The encoding and decoding computation cost should be a linear function of the number of native packets K.
Network Coding
Simply forwarding packets may not be an optimal operation at a router from the perspective of maximizing throughput. Network coding was proposed to achieve maximum throughput for multicast communication [2]. Network coding techniques can be classified into two categories: intra-session (where coding is restricted to the same multicast or unicast session) [2, 7, 11] and inter-session (where coding is applied to packets of different sessions) [9, 18, 24]. The pioneering works on intra-session network coding include [2, 7, 11]; all these intra-session network coding techniques apply to multicast only. In [2], Ahlswede et al. showed that in a single-source multicast scenario, instead of simply forwarding the packets they receive, relay nodes can use network coding (i.e., mixing packets destined to different destinations) to achieve the multicast capacity given by the max-flow min-cut theorem, which is generally higher than the throughput achievable by forwarding alone. In [11], Li et al. proved that linear network coding is enough to achieve the multicast capacity for many cases. In [7], Ho et al. introduced random linear network coding for a distributed implementation of linear network coding with low encoding/decoding cost. Examples of inter-session network coding schemes include [6, 9, 18, 24].
For wireless communication, cross-next-hop network coding [9, 18] and intra-session network coding [13, 25, 26] are often used. Under cross-next-hop network coding, a relay node applies coding to packets destined to different next-hop nodes. Cross-next-hop network coding is a special type of inter-session network coding. Cross-next-hop network coding uses per-next-hop queueing at each relay node while inter-session network coding may use per-flow queueing at each relay node or add a very large global encoding vector to the header of each coded packet [6]. Hence, cross-next-hop network coding is more scalable than a general inter-session network coding. As such, for core routers, it may be desirable to use cross-next-hop network coding instead of a general inter-session network coding.
Cross-next-hop network coding has been heavily studied in the wireless networking area. The major works include [9, 18]. In [9], Katti et al. proposed an opportunistic network coding scheme for unicast flows, called COPE, which can achieve throughput gains from a few percent to several-fold depending on the traffic pattern, congestion level, and transport protocol. In [18], Rayanchu et al. developed a loss-aware network coding technique for unicast flows, called CLONE, which improves reliability of network coding by transmitting multiple copies of the same packet, similar to repetition coding [12].
Intra-session network coding has been used in combination with a random linear erasure code for unicast/multicast communication in [13, 25, 26]. Note that intra-session network coding should not be used alone at relay nodes (without the aid of erasure coding/decoding at the source/destination); otherwise, the performance will be very poor, i.e., the source needs to send many more redundant (duplicate) packets for the receiver to recover all the native packets, compared to joint erasure coding and intra-session network coding.
Joint Erasure Coding and Intra-Session Network Coding (JEN) and BATched Sparse (BATS)
The following Joint Erasure coding and intra-session Network coding (JEN) approach has been used for unicast/multicast communication in [13]: The source node uses random linear erasure coding (RLEC) to encode the native packets and adds a global encoding vector to the header of each coded packet. A relay node uses random linear network coding (RLNC) to re-code the packets it has received (i.e., the relay node generates a coded packet by randomly linearly combining the packets that it has received and stored in its buffer); the relay node also computes the global encoding vector of the re-coded packet and adds the global encoding vector to the header of the re-coded packet. A destination node can decode and recover K native packets as long as it receives enough coded packets that contain K linearly independent global encoding vectors. Note that under the JEN approach, a relay node does not decode the packets it receives; packets are only decoded by the destination node. Hence, JEN takes an end-to-end erasure coding approach, which is different from the hop-by-hop erasure coding approach that applies erasure encoding and decoding to each link/hop. It has been proved that JEN can achieve the multicast capacity for lossy networks in a wide range of scenarios [13, 23].
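The three roles in JEN (RLEC at the source, RLNC re-coding at a relay, decoding only at the destination) can be sketched for a two-hop path. The sketch assumes GF(2) coefficients, packets modeled as integers, independent losses on both links, and a relay that emits one re-coded packet per source transmission; the names `combine`, `decode`, and `simulate_jen` are hypothetical and the scheduling is an illustrative choice, not the protocol of [13].

```python
import random

def combine(packets, rng):
    """Random GF(2) combination of (vector, payload) pairs.
    The global encoding vector is updated alongside the payload."""
    vector, payload = 0, 0
    for v, p in packets:
        if rng.random() < 0.5:  # coefficient 1 with probability 1/2
            vector ^= v
            payload ^= p
    return vector, payload

def decode(coded, k):
    """Gauss-Jordan elimination over GF(2); None if rank is below k."""
    basis = {}  # pivot column -> (vector, payload), kept fully reduced
    for vector, payload in coded:
        for col, (bv, bp) in basis.items():
            if (vector >> col) & 1:
                vector ^= bv
                payload ^= bp
        if vector == 0:
            continue  # all-zero or linearly dependent packet
        col = vector.bit_length() - 1
        for c, (bv, bp) in list(basis.items()):
            if (bv >> col) & 1:
                basis[c] = (bv ^ vector, bp ^ payload)
        basis[col] = (vector, payload)
    if len(basis) < k:
        return None
    return [basis[i][1] for i in range(k)]

def simulate_jen(native, loss_prob, rng, max_rounds=10_000):
    """Source -> relay -> destination over two lossy links.
    Returns (recovered, rounds_used)."""
    k = len(native)
    # Native packet i has the unit global encoding vector e_i.
    source = [(1 << i, data) for i, data in enumerate(native)]
    relay_buf, dest_buf = [], []
    for rounds in range(1, max_rounds + 1):
        coded = combine(source, rng)           # RLEC at the source
        if rng.random() >= loss_prob:          # survives the first link
            relay_buf.append(coded)
        if relay_buf:
            recoded = combine(relay_buf, rng)  # RLNC at the relay, no decoding
            if rng.random() >= loss_prob:      # survives the second link
                dest_buf.append(recoded)
        recovered = decode(dest_buf, k)        # only the destination decodes
        if recovered is not None:
            return recovered, rounds
    return None, max_rounds

rng = random.Random(3)
native = [rng.getrandbits(32) for _ in range(8)]
recovered, rounds = simulate_jen(native, 0.3, rng)
print(recovered == native, rounds)
```

Because the relay XORs the global encoding vectors together with the payloads, every re-coded packet still describes itself in terms of the native packets, which is what lets the destination decode without the relay ever decoding.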
JEN has two control parameters: density and non-aggressiveness [21]. The ratio of the number of non-zero entries in the encoding vector to the total number of entries in the encoding vector is called the density of the code. Lower density corresponds to less computational complexity, but it also corresponds to a lower network coding gain. The ratio of the number of packets participating in computing a coded packet at a relay node to the total number of packets transmitted by the source node is called the non-aggressiveness (or patience) of the relay node. The smaller the value of non-aggressiveness/patience, the more aggressive the relay node is (or the less patient the relay node is). In other words, the relay node waits for a shorter time in buffering incoming packets for RLNC, which translates to a smaller end-to-end delay. However, a smaller value of non-aggressiveness may result in a lower network coding gain since fewer packets participate in computing a coded packet.
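The two ratios defined above can be written out directly. This is a small sketch with hypothetical function names and made-up example numbers; it only restates the definitions and does not model their effect on coding gain or delay.

```python
def density(encoding_vector):
    """Fraction of non-zero entries in an encoding vector."""
    nonzero = sum(1 for coeff in encoding_vector if coeff != 0)
    return nonzero / len(encoding_vector)

def non_aggressiveness(packets_combined, packets_sent_by_source):
    """Ratio of packets combined at the relay into one coded packet
    to the total number of packets transmitted by the source."""
    return packets_combined / packets_sent_by_source

# Example values are illustrative only.
print(density([1, 0, 0, 3]))        # 2 of 4 coefficients are non-zero
print(non_aggressiveness(8, 32))    # relay combines 8 of 32 source packets
```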
In practice, under JEN, the data to be transmitted is partitioned into multiple segments [21] (or generations [4], blocks [14], or batches [3]), and coding is restricted within the same segment. In doing so, the encoding vector is small enough to be put into the header of a coded packet. Silva et al. proposed a network coding technique with overlapping segments [20] to improve the performance of JEN with non-overlapping segments.
BATched Sparse (BATS) codes have also been proposed [25, 26]. A BATS code consists of an inner code and an outer code over a finite field F_q. The outer code is a matrix generalization of a fountain code. At a source node, the outer code encoder encodes native packets into batches, each of which contains M packets. When the batch size M is equal to 1, the outer code reduces to a fountain code. The inner code is an RLNC performed at each relay node. At each relay node, RLNC is applied only to the packets within the same batch of the same flow; hence the structure of the outer code is preserved.
Specifically, for unicast, a BATS coding approach works as follows. The source node uses a matrix-form fountain code as its outer code, where the matrix consists of a batch of M packets and each column of the matrix corresponds to a packet. This is in contrast to the original fountain code [19], which uses a vector form (i.e., a packet consisting of multiple symbols), and to JEN, which uses an RLEC. The source node encodes the native packets with this outer code and adds an encoding vector of M × log_2 q bits to the header of each coded packet (where q is the size of the finite field of the coding coefficients). A relay node uses RLNC to re-code all the received packets of the same batch, i.e., the relay node generates M coded packets by randomly linearly combining all the packets that it received for the same batch. The relay node also computes the encoding vector of each re-coded packet and adds it to the header of that packet. A destination node can decode and recover K native packets as long as it receives enough coded packets.
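The header cost of M × log_2 q bits can be checked with a short calculation. The sketch assumes q is a power of two, as in q = 2^i; the function name `encoding_vector_bits` and the example values M = 16 and q = 256 are illustrative assumptions.

```python
def encoding_vector_bits(num_coefficients, field_size):
    """Bits needed for an encoding vector with `num_coefficients` entries
    over a field of `field_size` elements (assumed to be a power of two)."""
    if field_size < 2 or field_size & (field_size - 1):
        raise ValueError("field_size must be a power of two >= 2")
    bits_per_coefficient = (field_size - 1).bit_length()  # log2(q)
    return num_coefficients * bits_per_coefficient

# Hypothetical example: batch size M = 16 over GF(256).
print(encoding_vector_bits(16, 256))
```

The header size depends on the batch size M and the field size q, not on the total number of native packets K, so it stays fixed as the file grows.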