Inter-network nodes, or Internet-based nodes, are communication endpoints on a packet switched network of an inter-network and include, but are not limited to, computers, network routers, intermediate nodes on a network or inter-network, satellites, spacecraft, or any appliance or device that interfaces to a network or inter-network.
An amount of time required for data to propagate from a source node on an inter-network to the destination node on the inter-network is referred to as latency. Since some internal nodes operate as ‘store and forward’ devices, the end-to-end latency for a message can be much greater than a simple sum of point-to-point propagation times.
Examples of existing transfer protocols include Transmission Control Protocol (TCP) over Internet Protocol (IP). TCP is an explicitly windowed protocol. A sender and a receiver agree upon buffer sizes and then the sender tries to fill the receiver buffer without overflowing it. The sender's messages are based upon active acknowledgments of the receiver which indicate an amount of free space remaining in the receiver's buffer. The active acknowledgment also provides a form of flow control.
Another example of a similar transfer protocol is Apple's AppleTalk Data Stream Protocol (ADSP), which is also an explicitly windowed protocol. Additionally, Novell's SPX (Sequenced Packet Exchange) is a popular NetWare transport protocol. It was derived from the Xerox Network Systems (XNS) Sequenced Packet Protocol (SPP).
With existing explicitly windowed inter-network protocols, which include but are not limited to TCP/IP, AppleTalk, and IPX, latency may limit the data transfer rate between any two nodes on the inter-network because the protocol requires the sender node to frequently stop sending data until some or all data sent thus far are acknowledged by the receiver node as successfully received. The acknowledgment is sent on the inter-network from the receiving node to the sending node and is itself subject to the network latency. The greater the network latency, the less data is sent per unit time, the lower the average utilization of the channel, and the lower the throughput of the protocol. Latency may limit the throughput of protocols of this type to significantly less than the capacity of the network.
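The relationship between window size, latency, and throughput described above can be illustrated with a minimal sketch. It assumes a simplified model in which the sender can have at most one full window of data outstanding per round trip; the function name and parameters are hypothetical and are not part of any protocol implementation.

```python
def window_limited_throughput(window_bytes: int,
                              round_trip_seconds: float,
                              link_bits_per_second: float) -> float:
    """Upper bound on throughput (bits/second) under the simplified model.

    Assumption: at most one window of data is delivered per round trip,
    and throughput can never exceed the raw capacity of the link.
    """
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    window_rate = window_bytes * 8 / round_trip_seconds
    return min(window_rate, link_bits_per_second)


# Illustrative values: a 64 KiB window on a 100 megabit/second link.
link = 100e6
for rtt in (0.001, 0.070, 0.700):
    rate = window_limited_throughput(65536, rtt, link)
    print(f"RTT {rtt * 1000:6.0f} ms -> {rate / link:7.2%} of link capacity")
```

Under these assumptions, the same window that saturates the link at a 1 millisecond round trip uses well under 1 percent of the link at a 700 millisecond round trip.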
FIG. 1 illustrates the latency problem for the TCP protocol in a TCP throughput graph 100. The latency issue graphically shown in FIG. 1 is similarly applicable to other explicitly windowed protocols as well.
Another way to describe the current practice is that it is pessimistic. In other words, the protocols require positive acknowledgment to proceed. The requirement for positive acknowledgment is based upon historic memory and bandwidth limits that are no longer applicable. In contrast, the throughput of the novel protocol, which is the subject of the present invention described herein, is largely independent of latency because it never stops sending data until the source of the data is exhausted. The novel protocol is optimistic in that it assumes that everything is fine and then deals with any failures as a matter of cleanup.
As a contemporaneous inter-network becomes congested, it loses data. Congested inter-network routers and other devices become overwhelmed with inter-network data and are forced to discard data they are expected to transmit. A common case is a router connecting several similar networks. If suddenly 100% of the traffic on three networks requires forwarding to a single network, the net amount of traffic intended for the single network may be more than a carrying capability of the physical network. Since most network technologies do not implement (or depend upon) flow control at this layer, the router is simply forced to let some messages go over the network and the remainder are silently discarded. Due to the huge numbers of packets flowing through central Internet routers at any point in time, the scenario described above (along with other situations which result in packet loss) occurs many thousands, perhaps millions of times per day throughout the Internet.
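The router scenario above can be sketched as simple arithmetic over one time interval. This is only an illustration, assuming a router with no buffering that forwards up to the capacity of the output network and silently discards the rest; the function name and the traffic units are hypothetical.

```python
def forward_one_interval(offered_per_input, output_capacity):
    """Return (forwarded, discarded) traffic for one time interval.

    Assumption: no queueing and no flow control; anything beyond the
    output capacity is silently discarded.
    """
    offered = sum(offered_per_input)
    forwarded = min(offered, output_capacity)
    return forwarded, offered - forwarded


# Three fully loaded networks all forwarding to one similar network.
forwarded, discarded = forward_one_interval([100, 100, 100], 100)
print(f"forwarded={forwarded} discarded={discarded}")
```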
Existing protocols are forced to deduce that datagrams have been discarded by the fact that some of the expected traffic is not received. Also, existing protocols simply react to data loss by drastically slowing their rate of transmission and retransmitting lost data.
As noted above, TCP is the primary protocol for transmitting data long distance today. However, TCP has at least two major problems: (1) it is slow in the face of latency; and (2) it does not handle packet loss gracefully.
The reality is that no matter how much bandwidth is available, the mechanics of TCP are such that, once latency crosses a threshold, the transmission process experiences dead time (where no new transmission is taking place while the sender is waiting for acknowledgments of data receipt from the receiver) and repeated retreats from transmission aggressiveness.
An underlying reason why hitting this transmission threshold is such a problem relates to the way all variations of TCP respond to apparent transmission difficulties (e.g., data corruption and/or loss). Essentially, TCP implementers recognized that TCP, by its nature, fundamentally handles data corruption and loss extremely ungracefully. As long as transmission is proceeding cleanly and uninterrupted, data flow is consistent. The necessity to recover from a bad or missing transmission, however, involves temporary termination of transmission of new data while the system goes back and locates old data required by the receiver, retransmits these data, and waits for acknowledgment of receipt of that retransmission. This oversimplification of TCP behavior is mitigated by overlapping transmission of new and old data but it does capture the essence of the susceptibility of TCP to communication problems. In practice, any and all single disruptions of clean data flow in the network cause out-of-proportion disruptions of data flow internally in TCP behavior.
In light of this heavy impact of data corruption and loss, TCP does all it can to avoid such situations. Its primary defense mechanism is to slow down (or temporarily suspend) transmission to reduce exposure to problems. To accomplish this, a “congestion avoidance” mechanism precipitously decreases the rate data are injected into the network as soon as any resistance is met. As will be discussed in more detail below, TCP effectively behaves according to an expect-the-worst attitude whereby it starts with a slow transmission rate, accelerates as long as it meets no resistance in the form of data corruption or loss, and then retreats to its original rate (or some middle ground) upon such resistance.
Increasing bandwidth does, in fact, reduce the number of times dead time is suffered. On the other hand, sending more data at a time increases the transmission time of each block of data from the sender to the receiver. Similarly, using data compression of some type effectively increases the amount of data transmitted at any one time, but the fact remains that ineffective dead waiting time continues to be a major consumption of overall transmission time.
Compounding this dead waiting time issue is the algorithmic method that TCP uses to respond to network congestion or corruption that impacts the rate of data received by the receiver and the relative percentage of those data that are received intact and correct. Behaviorally, as the TCP receiver experiences success in receiving uncorrupted data and reports this success back to the sender in the form of its acknowledgments, the sender becomes more aggressive in its sending rate, arithmetically accelerating its rate of injecting data into the transmission stream. Once the receiver experiences a higher degree of failure than is allowed by its design, however, the result is precipitous. The increased failure rate is communicated to the sender both explicitly, in the form of notifications of corrupted datagrams, and implicitly, by the fact that the sender does not receive any acknowledgment of receipt of a datagram by the receiver (either successful or a notification of corruption). In response and by the design of TCP, the sender reduces its transmission rate geometrically and begins the arithmetic transmission acceleration process all over again.
One result of this additive-increase-multiplicative-decrease algorithm is that TCP transmission rates experience very distinct sawtooth behavior 203 as shown graphically in FIG. 2 in a data transmission rate graph 200. In contrast, various embodiments of a network protocol graph 201 of the present invention suffer much less of a dramatic impact of network congestion and corruption as graphically depicted.
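The sawtooth shape can be reproduced with a minimal simulation of additive increase and multiplicative decrease. The sketch below is an illustration only, not a model of any particular TCP implementation: it assumes loss is detected in any round in which the sending rate exceeds a fixed channel capacity, and the parameter names and default values are hypothetical.

```python
def aimd_trace(rounds, capacity, increase=1.0, decrease=0.5, start=1.0):
    """Return the sending rate used in each round under a simple AIMD rule.

    Assumptions: a round with rate above `capacity` suffers loss and the
    rate is multiplied by `decrease`; otherwise the rate grows by `increase`.
    """
    rate = start
    trace = []
    for _ in range(rounds):
        trace.append(rate)
        if rate > capacity:
            rate = max(start, rate * decrease)  # geometric retreat on loss
        else:
            rate += increase                    # arithmetic acceleration
    return trace


# Print a crude text plot of the sawtooth for a capacity of 10 units.
for round_number, rate in enumerate(aimd_trace(30, capacity=10.0)):
    print(f"{round_number:2d} {'#' * int(rate * 2)}")
```

In this sketch the rate climbs one unit per round, overshoots the capacity, is cut in half, and then climbs again, so the average rate stays below the capacity of the channel.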
In turn, actual communication throughput seldom approaches theoretical throughput. In fact, as shown in FIG. 1 above, the typical approach to increasing throughput, namely increasing bandwidth, actually experiences severe diminishing returns in the prior art (which exhibits the sawtooth behavior 203).
Attempts to move large amounts of data over merchant Internet connections with high latency and periodic packet loss can be frustrating and slow. Prior art protocols seldom achieve more than a few percent of a theoretical channel capacity between sender and receiver. Further, transfers often fail.
In an illustrative example, conventional TCP theory states that the optimal receive window size is the product of the channel bandwidth and the round trip time. For instance, if the channel's bandwidth is 100 megabits/second and the ping time (approximating the round trip time) is 700 milliseconds, then an optimal buffer size is
\[
\begin{aligned}
100\ \frac{\text{megabits}}{\text{second}} \times 700\ \text{milliseconds}
&= 100 \cdot 10^{6}\ \frac{\text{bits}}{\text{second}} \times 0.7\ \text{seconds} \\
&= 70 \times 10^{6}\ \text{bits} \\
&= 70 \times 10^{6}\ \text{bits} \times \frac{1\ \text{byte}}{8\ \text{bits}} \\
&= 8.75 \times 10^{6}\ \text{bytes}
\end{aligned}
\]
In other words, this configuration would require approximately 9 megabytes of buffering assuming a true, zero loss channel as described.
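The same arithmetic can be written as a small helper for trying other bandwidth and round trip values. This is a sketch only; the function name is hypothetical, and it assumes the zero loss channel described above.

```python
def optimal_window_bytes(bandwidth_bits_per_second: float,
                         round_trip_seconds: float) -> float:
    """Bandwidth multiplied by round trip time, converted from bits to bytes."""
    return bandwidth_bits_per_second * round_trip_seconds / 8


# The example from the text: 100 megabits/second and a 700 millisecond ping.
window = optimal_window_bytes(100e6, 0.7)
print(f"{window / 1e6:.2f} million bytes of buffering")
```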
If the buffer is too small, then the sender transmits until it believes that the receive buffer could be full (assuming no lost data), and then pauses for the acknowledgment. This results in ‘dead time’ or lost throughput.
If the buffer is too large, then the sender transmits as quickly as its timers allow until it begins to lose packets by overestimating the bandwidth of the channel. Then TCP begins to fluctuate its transmission rate as described below.
Another problem with TCP is inherent in all “sliding window” schemes. In essence, once a connection is created between the end points, the sender and receiver keep track of the amount of data which have been sent and how much space is left in the negotiated window. (The sender and receiver each reserve a memory buffer the size of the window.) The receiver acknowledges receipt of data which causes the window to progress through the data stream. However, at any point in time, there can never be more unacknowledged data than the window's size. In connections where the bandwidth is high in proportion to the latency (that is, where more data could be in transit than the window allows), the sender can send the entire window and then be forced to wait a period of time for the requisite acknowledgment to arrive from the receiver.
Further, TCP utilizes a congestion control system comprising “slow-start” and “congestion avoidance” aspects. In both of these areas, the traffic monitoring mechanism is dependent on the receipt of acknowledgments from the receiver that data have been successfully received. Failure to receive such an acknowledgment is interpreted to mean that data failed to reach the receiver. In fact, however, such failure to receive an acknowledgment (in the time allotted) may in actuality be due to the fact that the acknowledgment itself was lost or corrupted or simply delayed due to traffic congestion. In other words, the traffic monitoring and controlling system is subject to the same problems to which the data transmission itself is subject. This is particularly deleterious, of course, in more error-prone environments such as wireless systems.
At all times, TCP maintains a “congestion window” that contains all unacknowledged data that are currently in transit (i.e., have been sent by the sender and the sender has not yet received an acknowledgment of successful receipt from the receiver). This congestion window starts out small and is increased during slow-start and congestion avoidance in reaction to successful transmissions.
Slow-start algorithms are implemented to “feel the network out” to avoid over-loading a network with more data than it can gracefully handle. Such systems work by sending either a small amount of data, or data at a low rate, in the beginning and increasing the amount sent each time an acknowledgment is received until either an acknowledgment is not received or a threshold is reached. Once either of these events occurs, the system enters a congestion avoidance phase.
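The slow-start behavior described above can be sketched as follows. This is a simplified illustration rather than the algorithm of any specific TCP variant: it assumes the amount sent doubles after every acknowledged round, and the function name, the `loss_at_round` parameter, and the window units are hypothetical.

```python
def slow_start(threshold, loss_at_round=None, initial_window=1):
    """Grow the amount sent per round until a threshold or a missing ACK.

    Returns the window used in each round and the reason slow-start ended,
    at which point the sender would enter its congestion avoidance phase.
    Assumption: the window doubles after each fully acknowledged round.
    """
    window = initial_window
    history = []
    round_number = 0
    while True:
        history.append(window)
        if loss_at_round is not None and round_number == loss_at_round:
            return history, "acknowledgment not received"
        if window >= threshold:
            return history, "threshold reached"
        window *= 2
        round_number += 1


print(slow_start(threshold=16))
print(slow_start(threshold=64, loss_at_round=3))
```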
Different congestion avoidance schemes have been put in place over the years (e.g., “Tahoe,” “Reno,” and “New Reno”); all of these schemes are variations on the theme of retreating on transmission aggressiveness in the face of data corruption/loss. In all cases, the rate of retreat is rapid and subsequent recovery relatively slow, resulting in the sawtooth behavior 203 described with reference to FIG. 2, above.
In all implementations of TCP, transmission is very sensitive to data loss. Loss of a packet causes the receiver to time out and resend an acknowledgment in the hope that the sender will deduce that one or more messages have been lost and must be retransmitted. However, the loss of a single message can drastically slow a TCP transfer due to the internal timing of the protocol as described above with respect to the sawtooth behavior of TCP. Simply enlarging the window helps some, but ultimately real-world considerations, such as packet loss, reduce and ultimately nullify any gains.
The challenge of controlling the rate of injecting data into the transmission system would be daunting enough considering the factors described above, but another major disruption involves the continuously changing aspect of the effective bandwidth available for the transmission. In reality, the effective bandwidth available to a sophisticated transmission system, like many of those in the prior art in which alternative communication paths are accessed and traffic congestion is constantly in a state of flux, is not a fixed constant defined by the size of the transmission line accessed by the user. Further, and compounding the problem to a point of virtual unpredictability, is the fact that the rate of data corruption/loss is also constantly changing. Due to constant changes in traffic congestion and other factors, the rate of data corruption/loss on just one communication line can change constantly, quickly, and unpredictably. Use of multiple communication paths increases this variability.
The rate of change is so rapid, in fact, that controlled pacing of traffic can, in essence, be so far out-of-phase with the actual available effective bandwidth that the controlling algorithm can have a deleterious impact on true throughput. In other words, the controlling algorithm can be injecting high volumes of data into a network at the exact time the network is susceptible to high transmission failures and vice-versa. A net result is that the sender is not taking advantage of potentially high volume clean transmission environments. Arguably much worse, it can flood congested and dirty networks resulting in extremely high data corruption/loss which leads to algorithmic slowing down of the transmission rate. In essence, this behavior can actually pre-determine that a transmission will take place at a much less than optimum rate during periods of traffic contention and/or corruption.
In reality, the only consistent way of dealing with all the degrees of uncertainty summarized above is to simply ignore them. Therefore, a new data transmission scheme is required.
However, there are reasons to believe that merchant Internet connections will not be getting much better in the near future. Increased availability of satellite connections (with very long latencies) makes them an economical alternative for high bandwidth transfers. Centralization of routing and connectivity dictates that long distance communications will continue to traverse multiple core routers where memory shortages and other circumstances may force packets to be discarded. In other words, as the Internet becomes bigger and faster, its latency and loss characteristics are likely to persist and to remain roughly proportional to contemporary experience (or worse).
One fundamental source of latency delay in long distance data transmission is the fact that the transmitting device inputs an amount of data into the transmission network and then waits for an acknowledgment from the receiver that the data have been successfully received, received but corrupted in some way or, when no acknowledgment is ever received, were not received at all. Depending on the status of the transmission, either those data are retransmitted or the next amount of data in the source file is transmitted.
If less data are actually transmitted (for example, through data compression or delta processing), the number of times the sender must waste time waiting for acknowledgments is reduced. Several prior art schemes have taken advantage of this by merely transmitting less data.
However, merely transmitting less data will not solve the problem. The reliability of transmission networks decreases as the amount of congestion increases. The more a transmission source stresses the network by sending too much data too fast, the higher the rate of data corruption and loss (thereby demanding retransmission of data) becomes. Transmitting data at an artificially low volume and/or rate in order to protect the integrity of the transmission results in artificially low transmission rates. Several prior art schemes have addressed this problem by reactively reducing or increasing transmission volumes and rates in response to increased or decreased data loss and corruption. Other prior art schemes have addressed this problem by proactively reducing or increasing transmission volumes and rates by attempting to predict traffic congestion ahead of time and changing the rates accordingly.
Still other prior art methodologies have addressed this problem by identifying and taking advantage of multiple data transmission paths to the receiver and distributing the transmission among those paths.
The impact of the sender having to wait for receipt of an indication from the receiver that erroneous data have been received (or a lack of any notification from the receiver that a certain set of data has been received correctly or with errors) before sending subsequent sets of data is severe enough that it can be beneficial in prior art systems for the sender to actually send redundant data in the first place.
Under certain conditions, depending on the types of data corruption experienced and/or the amount of redundant data transmitted, data errors can be corrected by the receiver, thus reducing the amount of data retransmission required by the sender.
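One simple form of such redundancy is a single XOR parity block sent alongside a group of equal-length data blocks, which lets the receiver rebuild any one lost block without a retransmission. The sketch below is offered only as an illustration of the general idea; it is not drawn from any specific prior art system, and the function names are hypothetical.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    parity = bytearray(length)
    for block in blocks:
        for index, value in enumerate(block):
            parity[index] ^= value
    return bytes(parity)


def recover_missing(received, parity):
    """Rebuild the single block marked None in `received` from the parity."""
    if sum(block is None for block in received) != 1:
        raise ValueError("exactly one block must be missing")
    present = [block for block in received if block is not None]
    # XOR of the parity with every surviving block yields the lost block.
    return xor_parity(present + [parity])


blocks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(blocks)                 # redundant data sent with the group
received = [b"abcd", None, b"ijkl"]         # second block lost in transit
print(recover_missing(received, parity))
```

The cost of this scheme is one extra block per group, and it can repair only one loss per group; heavier loss still requires retransmission by the sender.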
World-wide networks used in standard data transmission include a “Quality of Service” (QOS) facility by which different kinds of network traffic are prioritized (e.g., voice over IP has a higher priority than e-mail). The QOS facility can be managed to assist in assuring maximum throughput for a communication.
However, whether reactive or proactive, none of the techniques copes with latency in the manner of the present invention. All of the prior art methodologies continue to depend on back-and-forth communication between sender and receiver periodically while data transmission is suspended awaiting that communication.