The present invention relates to the problem of rapid transmission of data between end systems over a data communication network.
Many data communication systems and high-level data communication protocols offer the convenient abstraction of reliable data transport and also provide rate control, i.e., they automatically adjust their packet transmission rate based on network conditions. Their traditional underlying implementations in terms of lower-level packetized data transports, such as the ubiquitous Transmission Control Protocol (TCP), suffer when at least one of the following conditions occurs: (a) the connection between the sender(s) and the receiver(s) has a large round-trip time (RTT); (b) the amount of data is large and the network suffers from bursty and transient losses.
One of the most widely used reliable transport protocols today is the Transmission Control Protocol (TCP). TCP is a point-to-point packet control scheme with an acknowledgment mechanism. TCP works well for one-to-one reliable communications when there is little loss between the sender and the recipient and the RTT between them is small. However, the throughput of TCP drops drastically when there is even very little loss, or when the sender and the recipient are far apart.
Using TCP, a sender transmits ordered packets and the recipient acknowledges receipt of each packet. If a packet is lost, no acknowledgment will be sent to the sender and the sender will resend the packet. With protocols such as TCP, the acknowledgment paradigm allows packets to be lost without total failure, since lost packets can just be retransmitted, either in response to a lack of acknowledgment or in response to an explicit request from the recipient.
TCP provides both reliability control and rate control, i.e., it ensures that all of the original data is delivered to receivers and it automatically adjusts the packet transmission rate based on network conditions such as congestion and packet loss. With TCP, the reliability control protocol and the rate control protocol are intertwined and not separable. Moreover, TCP's throughput performance as a function of increasing RTT and packet loss is far from optimal.
Studies by many researchers have shown that, when using TCP, the throughput of the data transfer is inversely proportional to the product of the RTT and the square root of the loss rate on the end-to-end connection. For example, a typical end-to-end terrestrial connection between the U.S. and Europe has an RTT of 200 milliseconds and an average packet loss of 2%. Under these conditions, the throughput of a TCP connection is at most around 300-400 kilobits per second (kbps), no matter how much bandwidth is available end-to-end. The situation is more severe on a satellite link, where in addition to high RTTs, information is lost due to various atmospheric effects. A primary reason for TCP's poor performance under these conditions is that the rate control protocol used by TCP does not work well in them, and since the reliability control protocol and rate control protocol used by TCP are inseparable, the overall TCP protocol does not work well either. Furthermore, the requirements of different applications for transport vary, yet TCP is used fairly universally for a variety of applications in all network conditions, thus leading to poor performance in many situations.
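As a rough illustration of this scaling, the following Python sketch evaluates the commonly cited approximation in which TCP throughput is proportional to MSS/(RTT*sqrt(p)), where MSS is the segment size and p is the packet loss rate. The function name, the proportionality constant c=1.0, and the 1,200-byte segment size are assumptions chosen for illustration; none of them is specified in the text.

```python
import math

def tcp_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.0):
    """Approximate steady-state TCP throughput in bits per second.

    Evaluates c * MSS / (RTT * sqrt(p)). The constant c and the segment
    size are assumptions for illustration, not values from the text.
    """
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be positive and loss_rate in (0, 1]")
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

# Transatlantic example from the text: RTT = 200 ms, packet loss = 2%.
# The 1,200-byte segment size is a hypothetical choice.
estimate = tcp_throughput_bps(mss_bytes=1200, rtt_s=0.2, loss_rate=0.02)
print(f"estimated throughput: {estimate / 1000:.0f} kbps")
```

Under these assumed parameters the estimate falls in the few-hundred-kbps range regardless of the link capacity, and doubling the RTT or quadrupling the loss rate halves it.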
It would be desirable for the reliability control and rate control protocols used by the overall transport protocol to be independent, so that the same reliability control protocol could be used with a variety of different rate control protocols and the actual rate control protocol could be chosen based on application requirements and the network conditions in which the application is run. The paper “A Modular Analysis of Network Transmission Protocols”, Micah Adler, Yair Bartal, John Byers, Michael Luby, Danny Raz, Proceedings of the Fifth Israeli Symposium on Theory of Computing and Systems, June 1997 (hereinafter referred to as “Adler” and incorporated by reference herein), introduces a modular approach to building transport protocols that advocates partitioning a reliable transport protocol into independent reliability control and rate control protocols.
For any reliability control protocol, two primary measures of its performance are how much buffering it requires and its “goodput.” Buffering is introduced in a reliability control protocol at both the sender and receiver. Buffering at the sender occurs, for example, when data is buffered after it is initially sent until the sender has an acknowledgement that it has been received at the receiver. Buffering at the receiver occurs for similar reasons. Buffering is of interest for two reasons: (1) it directly impacts how much memory the sender and receiver reliability control protocol uses; (2) it directly impacts how much latency the sender and receiver reliability control protocol introduces. Goodput is defined as the size of the data to be transferred divided by the amount of sent data that is received at the receiver end system during the transfer. For example, goodput=1.0 if the amount of data sent in packets to transfer the original data equals the size of the original data; this can be achieved if no redundant data is ever transmitted.
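The goodput definition above can be written as a one-line computation. The following Python sketch is only an illustration; the function and parameter names are hypothetical.

```python
def goodput(original_size, received_size):
    """Goodput = size of the original data / amount of sent data received.

    Both sizes must use the same unit (bytes, packets, ...).
    """
    if original_size <= 0 or received_size <= 0:
        raise ValueError("sizes must be positive")
    return original_size / received_size

# No redundant data reaches the receiver: goodput is 1.0.
print(goodput(3_000, 3_000))
# 3,000 units of original data, 4,000 units received in total.
print(goodput(3_000, 4_000))
```

Any redundant data that reaches the receiver increases the denominator and therefore pushes the goodput below 1.0.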
Adler outlines a reliability control protocol that is largely independent of the rate control protocol used, which is hereafter referred to as the “No-code reliability control protocol”. The No-code reliability control protocol is in some ways similar to the reliability control protocol embedded in TCP, in the sense that the original data is partitioned into blocks and each block is sent in the payload of a packet, and then an exact copy of each block needs to be received to ensure a reliable transfer. An issue with the No-code reliability control protocol is that, although the goodput is optimal (essentially equal to one), the buffering that the No-code reliability control protocol introduces can be substantial when there is packet loss. Adler proves that the No-code reliability control protocol is within a constant factor of optimal among reliability control protocols that do not use coding to transport the data, in the sense that the protocol has optimal goodput and provably is within a constant factor of optimal in terms of minimizing the amount of buffering needed at the sender and receiver.
One solution that has been used in reliability control protocols is Forward Error-Correction (FEC) codes, such as Reed-Solomon codes, Tornado codes, or chain reaction codes (which are information additive codes). Using FEC codes, the original data is partitioned into blocks larger than the payload of a packet, encoding units are generated from these blocks, and the encoding units are sent in packets. One basic advantage of this approach versus reliability control protocols that do not use coding is that the feedback can be much simpler and less frequent, i.e., for each block the receiver need only indicate to the sender the quantity of encoding units received instead of a list of exactly which encoding units are received. Furthermore, the ability to generate and send more encoding units in aggregate than the length of the original data block is a powerful tool in the design of reliability control protocols.
Erasure correcting codes, such as Reed-Solomon or Tornado codes, generate a fixed number of encoding units for a fixed length block. For example, for a block comprising B input units, N encoding units might be generated. These N encoding units may comprise the B original input units and N-B redundant units. If storage permits, then the sender can compute the set of encoding units for each block only once and transmit the encoding units using a carousel protocol.
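To make the idea of a fixed set of N encoding units concrete, the following Python sketch uses the simplest possible erasure code: the B original input units plus a single XOR parity unit, so that N=B+1 and any one lost unit can be rebuilt. This toy code is an assumption chosen for brevity; it is far weaker than Reed-Solomon or Tornado codes and is not taken from the text.

```python
from functools import reduce

def xor_all(values):
    """XOR together an iterable of integers."""
    return reduce(lambda a, b: a ^ b, values, 0)

def encode_with_parity(input_units):
    """Return N = B + 1 encoding units: the B inputs plus one parity unit."""
    return list(input_units) + [xor_all(input_units)]

def recover(received):
    """Rebuild the B input units from N received slots.

    A lost unit is marked with None. One parity unit can repair at most
    one erasure, so more than one None is an error.
    """
    missing = [i for i, unit in enumerate(received) if unit is None]
    if len(missing) > 1:
        raise ValueError("a single parity unit can repair only one erasure")
    units = list(received)
    if missing:
        # The XOR of all N units is zero, so the missing unit equals
        # the XOR of the N - 1 units that did arrive.
        units[missing[0]] = xor_all(u for u in received if u is not None)
    return units[:-1]  # drop the parity unit

encoded = encode_with_parity([7, 42, 99])   # B = 3, N = 4
encoded[1] = None                           # simulate one lost packet
print(recover(encoded))
```

Because N is fixed when the block is encoded, a second loss in the same block cannot be repaired here, which previews the difficulty of choosing the number of encoding units in advance.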
One problem with some FEC codes is that they require excessive computing power or memory to operate. Another problem is that the number of encoding units needed must be determined in advance of the coding process. This can lead to inefficiencies if the loss rate of packets is overestimated, and can lead to failure if the loss rate of packets is underestimated.
For traditional FEC codes, the number of possible encoding units that can be generated is of the same order of magnitude as the number of input units a block is partitioned into. Typically, but not exclusively, most or all of these encoding units are generated in a preprocessing step before the sending step. These encoding units have the property that all the input units can be regenerated from any subset of the encoding units equal in length to the original block or slightly longer in length than the original block.
Chain reaction coding, as described in U.S. Pat. No. 6,307,487 (hereinafter “Luby I” and incorporated by reference herein), can provide a form of forward error-correction that addresses the above issues. For chain reaction codes, the pool of possible encoding units that can be generated is orders of magnitude larger than the number of the input units, and a randomly or pseudo-randomly selected encoding unit from the pool of possibilities can be generated very quickly. For chain reaction codes, the encoding units can be generated on the fly on an “as needed” basis concurrent with the sending step. Chain reaction codes allow all input units of the content to be regenerated from a subset of a set of randomly or pseudo-randomly generated encoding units slightly longer in length than the original content.
Other documents such as U.S. Pat. Nos. 6,320,520, 6,373,406, 6,614,366, 6,411,223, 6,486,803, and U.S. Patent Publication No. 20030058958 (hereafter referred to as “Shokrollahi I”), describe various chain reaction coding schemes and are incorporated herein by reference.
A sender using chain reaction codes can continuously generate encoding units for each block being sent. The encoding units may be transmitted via the User Datagram Protocol (UDP) Unicast, or if applicable UDP Multicast, to the recipients. Each recipient is assumed to be equipped with a decoding unit, which decodes an appropriate number of encoding units received in packets to obtain the original blocks.
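The following Python sketch gives a rough feel for this style of operation: the sender keeps producing encoding units on demand, each one the XOR of a pseudo-randomly chosen subset of the input units, and the receiver decodes by repeatedly resolving units that have exactly one unknown input. It is a toy in the spirit of chain reaction codes, not the scheme of Luby I: the key-to-subset mapping, the ad hoc degree choices, and all function names are assumptions made for illustration.

```python
import random

def neighbors(key, num_inputs):
    """Choose which input units the encoding unit with this key combines."""
    rng = random.Random(key)
    # Ad hoc degree choices, for illustration only (an assumption).
    degree = min(rng.choice([1, 1, 2, 2, 2, 3, 4]), num_inputs)
    return rng.sample(range(num_inputs), degree)

def encode(inputs, key):
    """Generate one encoding unit on the fly as an XOR of input units."""
    value = 0
    for i in neighbors(key, len(inputs)):
        value ^= inputs[i]
    return value

def peel_decode(num_inputs, units):
    """Recover inputs from (key, value) pairs; unrecovered slots stay None."""
    recovered = [None] * num_inputs
    pending = [(set(neighbors(k, num_inputs)), v) for k, v in units]
    progress = True
    while progress:
        progress = False
        for n, (idx, value) in enumerate(pending):
            # Remove inputs that are already known from this unit.
            for i in [i for i in idx if recovered[i] is not None]:
                idx.discard(i)
                value ^= recovered[i]
            pending[n] = (idx, value)
            if len(idx) == 1:  # exactly one unknown left: it is resolved
                recovered[next(iter(idx))] = value
                progress = True
    return recovered

def transfer(inputs, loss_rate, seed=0, max_units=10_000):
    """Send encoding units until the receiver can decode the whole block."""
    channel = random.Random(seed)
    received = []
    for key in range(max_units):
        unit = (key, encode(inputs, key))
        if channel.random() < loss_rate:
            continue  # packet lost; the sender simply keeps generating
        received.append(unit)
        if len(received) >= len(inputs):
            decoded = peel_decode(len(inputs), received)
            if all(v is not None for v in decoded):
                return decoded, len(received)
    raise RuntimeError("block not decoded within max_units")

block = list(range(10, 18))
decoded, units_used = transfer(block, loss_rate=0.3)
print(decoded == block, units_used)
```

The sender never needs to know which particular units were lost; it only needs to keep generating new ones until the receiver has collected enough. A practical code relies on a carefully designed degree distribution so that the number of units needed stays only slightly above the block length, which this toy makes no attempt to achieve.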
One of the several transports available in the Transporter Fountain™ network device available from Digital Fountain is a reliable transport protocol that uses a simple FEC-based reliability control protocol that can be combined with a variety of rate control protocols. This simple FEC-based reliability control protocol is hereinafter referred to as the “TF reliability control protocol”. The TF reliability control protocol transmits encoding units for a given block of data until receiving an acknowledgement from the receiver that enough encoding units have been received to recover the block, and then the sender moves on to the next block.
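The send-until-acknowledged behavior described above can be sketched as a small discrete-time simulation in Python. This is a simplified model under stated assumptions, not the actual TF implementation: one tick is one packet slot, any `block_packets` received encoding units suffice to recover the block, the acknowledgement is never lost, and the acknowledgement triggered by a packet sent at tick t reaches the sender at the end of tick t + `rtt_ticks`. All names are hypothetical.

```python
import random

def simulate_tf_block(block_packets, rtt_ticks, loss_rate=0.0, seed=0):
    """Simulate sending one block; returns (packets_sent, packets_received)."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    channel = random.Random(seed)
    sent = received = 0
    ack_tick = None  # tick at which the acknowledgement reaches the sender
    tick = 0
    # The sender keeps transmitting until the acknowledgement has arrived.
    while ack_tick is None or tick <= ack_tick:
        sent += 1
        if channel.random() >= loss_rate:
            received += 1
            if received == block_packets and ack_tick is None:
                # The receiver can now recover the block and acknowledges.
                ack_tick = tick + rtt_ticks
        tick += 1
    return sent, received

sent, received = simulate_tf_block(block_packets=3_000, rtt_ticks=1_000)
print(sent, received, 3_000 / received)
```

In this model every packet sent while the acknowledgement is in flight is useless to the receiver, so the number of wasted packets grows with the round-trip time measured in packet slots.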
Let RTT be the number of seconds it would take from when the sender sends a packet until the sender has received an acknowledgement from the receiver that the packet has arrived, let R be the current sending rate of the sender in units of packets/second, and let B be the size of a block in units of packets. Using the TF reliability control protocol, the number of useless packets containing encoding units for a block sent subsequent to the last packet needed to recover the block is N=R*RTT. Thus, a fraction f=N/(B+N) of the packets sent are wasted, and the goodput is at most 1−f. For example, if R=1,000 packets/second, RTT=1 second, and B=3,000 packets, then N=1,000 packets and f=1,000/4,000=0.25, i.e., 25% of the packets sent are wasted. Thus, the goodput in this example is a meager 0.75 (compared to a maximum possible goodput of 1.0).
Note also in this example that the size of a block B together with the rate R implies that the latency introduced by the TF reliability control protocol is at least (B+N)/R=4 seconds (each block is transmitted for 4 seconds total), and that the protocol requires buffering at least one block, i.e., 3,000 packets of data. Furthermore, increasing the goodput requires a larger block size and hence more buffering, and conversely, decreasing the buffering lowers the goodput.
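The trade-off between goodput and buffering can be tabulated directly from the quantities defined above (N=R*RTT, f=N/(B+N), and a per-block transmission time of (B+N)/R). The following Python sketch does so for several block sizes; the function name and the block sizes other than 3,000 packets are chosen here for illustration.

```python
def tf_costs(block_packets, rate_pps, rtt_s):
    """Costs of the send-until-acknowledged scheme for one block."""
    wasted = rate_pps * rtt_s           # N = R * RTT
    total = block_packets + wasted      # B + N packets sent per block
    fraction_wasted = wasted / total    # f = N / (B + N)
    return {
        "goodput_bound": 1 - fraction_wasted,
        "latency_s": total / rate_pps,  # time spent sending each block
        "buffer_packets": block_packets,
    }

print(f"{'B':>7} {'goodput':>8} {'latency (s)':>12}")
for b in (1_000, 3_000, 9_000, 99_000):
    costs = tf_costs(b, rate_pps=1_000, rtt_s=1.0)
    print(f"{b:>7} {costs['goodput_bound']:>8.3f} {costs['latency_s']:>12.1f}")
```

With R=1,000 packets/second and RTT=1 second, the row for B=3,000 reproduces the goodput of 0.75 and the latency of 4 seconds from the example, while pushing the goodput to 0.99 requires B=99,000 packets of buffering and 100 seconds per block.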
In view of the above, improvements in reliability control are desirable.