1. Technical Field
The present invention relates to throughput optimization of Internet protocol (IP) based applications, such as transmission control protocol (TCP), over links such as cellular, where long interruptions of the application traffic can be caused by error bursts or system design.
2. Discussion of Related Art
As shown in FIG. 1, a sending host 10 provides datagrams according to the Internet Protocol (IP) on a line 12 over an Internet 14 on a line 16 to a receiving host 18. Each of the hosts 10, 18, may be thought of as having protocol software on each machine stacked vertically into layers, each of which handles different functionalities in the communication of datagrams. The higher layers deal with end-to-end application issues while lower layers handle issues relating to transfer of packets or datagrams through the network. In traversing a network such as the Internet 14 shown in FIG. 1, the datagrams of a given message may traverse different routes through different routers from the host 10 to the host 18. The intermediate routers also have protocol stacks, but the datagrams do not need to pass up to the higher layers of the routers because only the lower layers are needed to receive, route and send datagrams.
For instance, the host 10 may send a datagram which passes up to the IP layer on intermediate routers on the way to the receiving host 18 but no higher. Only when the datagram reaches the receiving host 18 does IP extract the message and pass it up to higher levels of the protocol software.
In practice, delays and loss may occur between the end point hosts 10, 18 due to congestion in the routers and other devices of the Internet 14 as well as lack of storage space in the receiving host 18. Even severe delays can be caused by an overload of datagrams at one or more switching points, routers, or the like. When this happens, delays increase as routers begin to pile up datagrams until they are able to send them forward. But since the storage capacity of each router is not unlimited and since datagrams compete for that storage space, it is possible in an uncoordinated network such as the Internet that the number of datagrams arriving at a congested router will be too much for it to handle and it will be forced to drop datagrams.
If that happens, the hosts 10, 18 would not normally know the details of where this congestion has occurred or why. To them, an unexpected delay or loss is taken as an indication of congestion. This attribution is due to the fact that in wired networks, for which the Internet was designed, the transmission of datagrams between hosts and routers and between routers is very reliable, so that congestion is a good assumption as the cause of the delay and loss. For this reliable wired environment the Internet designers provided for certain responses to perceived congestion.
One of these is for the TCP layer to use a specialized sliding window mechanism as shown in FIG. 2 that is used for several purposes. This window makes it possible to send multiple segments (the unit of transfer between TCP layers on two machines is called a segment) from the host 10 before an acknowledgement arrives, so as to increase total throughput. It also has a flow control purpose that allows the receiving host 18 to restrict transmission until it has sufficient buffer space to accommodate more data. The window operates at the octet level, not at the segment or datagram level (TCP segments are encapsulated within IP datagrams). Octets are numbered sequentially as shown in FIG. 2. Whenever a sending host sends a TCP segment it puts the sequence number of the first octet in that segment and in return expects an acknowledgement from the receiver for the last octet the receiver has successfully received. The sending host 10 keeps three pointers associated with every connection. The first pointer marks the left of the sliding window to separate octets on the left (1, 2) that have been sent and acknowledged from octets yet to be acknowledged. A second pointer marks the right of the sliding window and defines the highest octet (9) in the sequence that can be sent before more acknowledgements are received. The third pointer marks the boundary inside the window that separates those octets that have already been sent (3-6) from those octets that have not yet been sent (7-9). The receiving host 18 maintains a similar window to piece the stream together again after a plurality of datagrams traverse the Internet 14, possibly over different routes using different routers and arriving out-of-sequence.
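The three sender-side pointers described above can be illustrated with a minimal sketch. The class name `SendWindow` and its field names are hypothetical and chosen only for illustration; the values reproduce the octet numbering of FIG. 2 as described in the text (octets 1-2 acknowledged, 3-6 sent, 7-9 not yet sent).

```python
from dataclasses import dataclass


@dataclass
class SendWindow:
    """Hypothetical model of the sender's three sliding-window pointers."""
    acked_through: int  # highest octet sent and acknowledged (left edge lies just past it)
    sent_through: int   # highest octet already sent (boundary inside the window)
    window_size: int    # number of octets the window spans

    @property
    def right_edge(self) -> int:
        # Highest octet that may be sent before more acknowledgements arrive.
        return self.acked_through + self.window_size

    def in_flight(self) -> range:
        # Octets sent but not yet acknowledged.
        return range(self.acked_through + 1, self.sent_through + 1)

    def sendable(self) -> range:
        # Octets inside the window that have not yet been sent.
        return range(self.sent_through + 1, self.right_edge + 1)

    def on_ack(self, last_octet_received: int) -> None:
        # An acknowledgement slides the left edge (and with it the right edge).
        if last_octet_received > self.acked_through:
            self.acked_through = min(last_octet_received, self.sent_through)


w = SendWindow(acked_through=2, sent_through=6, window_size=7)
print(list(w.in_flight()), list(w.sendable()), w.right_edge)
```

In this sketch an acknowledgement for octet 4 would move the left edge past octet 4 and the right edge from 9 to 11, which is the "sliding" behavior the text describes.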
TCP allows the window size to vary over time. It does that by having the receiving host 18 specify not only how many octets have been received but how many additional octets of data it is prepared to accept. This is carried out by a so-called window advertisement which can be thought of as specifying the receiving host's current buffer size. When the receiving host 18 causes an increase to its window advertisement, the sending host 10 increases the size of its sliding window. Likewise, when the receiving host 18 signals decreased buffer space with a decreased window advertisement, the sending host 10 decreases the size of its window and stops sending octets beyond the boundary. An advantage of all this is of course that it provides flow control as well as reliable transfer. If the receiving host 18 buffer begins to get full, it can send smaller window advertisements. It can even send a window size of zero to stop all transmissions. It can later advertise a non-zero window size to trigger the flow of data again once buffer space is available again.
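As a rough illustration of the window advertisement, the receiver's free buffer space can be modeled as follows. The class `ReceiveBuffer` and its method names are assumptions made for this sketch and do not correspond to any particular TCP implementation.

```python
class ReceiveBuffer:
    """Hypothetical receiver buffer whose free space is the advertised window."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # total buffer size in octets
        self.used = 0             # octets received but not yet read by the application

    def receive(self, octets: int) -> int:
        # Accept at most as many octets as there is free space.
        accepted = min(octets, self.capacity - self.used)
        self.used += accepted
        return accepted

    def consume(self, octets: int) -> None:
        # The application reads data, freeing buffer space.
        self.used -= min(octets, self.used)

    def advertised_window(self) -> int:
        # A value of zero tells the sender to stop all transmissions.
        return self.capacity - self.used


buf = ReceiveBuffer(capacity=8)
buf.receive(8)
print(buf.advertised_window())  # buffer is full, so the window is zero
buf.consume(3)
print(buf.advertised_window())  # non-zero window restarts the flow of data
```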
In addition to flow control, TCP maintains a second limit, called the congestion window limit or congestion window, to control congestion. The goal of the flow control window is to ensure that the sender does not send more data than the receiver can actually accommodate. However, in many cases it is the network which may not have enough space to accommodate all the data that the sender is sending. As alluded to before, in case of congestion the network buffer space may get exhausted and the data packets might get dropped. In addition, if there are a plurality of senders and none of them regulates the rate of data being sent, the network will never be able to make room for any single connection, and the congestion may persist for a very long time. To avoid this, TCP uses a congestion window that tries to estimate the amount of buffer space available in the network. To summarize, TCP does not solely rely on the window advertised by the receiving host in deciding how many packets to send. It instead takes the minimum of the advertised window and the congestion window to decide how much data can be sent into the network without waiting for an acknowledgement from the receiver.
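The rule of taking the minimum of the two windows can be sketched in a few lines. The function name `usable_window` and the `in_flight` parameter (data already sent but not yet acknowledged) are assumptions introduced for illustration only.

```python
def usable_window(advertised_window: int, congestion_window: int, in_flight: int) -> int:
    """Amount of new data the sender may inject without waiting for an acknowledgement.

    The effective limit is the smaller of the receiver's advertised window
    and the sender's congestion window; data already in flight counts against it.
    """
    limit = min(advertised_window, congestion_window)
    return max(0, limit - in_flight)


print(usable_window(advertised_window=8, congestion_window=4, in_flight=1))
```

Here the congestion window (4) is the binding limit, so with one unit already in flight the sender may send three more.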
When TCP experiences datagram loss, it adopts two different strategies to adjust its congestion window. At the start of the connection, when TCP has no information about the state of the network, it begins by sending one packet into the network and waiting for an acknowledgement from the receiver. An acknowledgement implies that the network had sufficient space to accommodate at least one packet from this connection. The sender then injects two new packets to see if the network can accommodate two packets and waits for the acknowledgement of these new packets. This process of probing the network for its buffer space, called slow-start, continues until the sender sees a packet loss. A packet loss indicates that the network has run out of capacity to enqueue data at a rate higher than this and therefore the sender must not try to be too aggressive in sending more packets. However, since the buffer space in the network keeps changing (e.g., another TCP connection ended and the buffer was freed up because of this), the sender still keeps trying to increase its congestion window for every acknowledgement received, but at a rate considerably slower than slow start. This slower phase is called congestion avoidance.
Anytime during the slow start or congestion avoidance, if a packet is lost (but the sender is able to still receive some acknowledgements) the sender reduces its window by one half. In other words, TCP reduces the volume of traffic exponentially (as well as the rate of retransmission). This is called multiplicative decrease. However, if the sender does not receive any acknowledgements within a certain time period, that indicates that the network was very heavily congested and no packets were moving. Under these circumstances the TCP sender again enters slow start and increases its window by one segment for every acknowledgement received. This allows routers enough time to clear datagrams already in their buffers.
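The two reactions to loss described above can be summarized in a small sketch. The function `window_after_loss` and its boolean parameter are hypothetical names; window sizes are counted in segments and integer division is an assumption of this sketch.

```python
def window_after_loss(cwnd: int, acks_still_arriving: bool) -> int:
    """Congestion window (in segments) after a loss event.

    If some acknowledgements are still arriving, the window is halved
    (multiplicative decrease). If none arrive before the timeout, the
    sender re-enters slow start from a single segment.
    """
    if acks_still_arriving:
        return max(1, cwnd // 2)
    return 1


print(window_after_loss(16, acks_still_arriving=True))
print(window_after_loss(16, acks_still_arriving=False))
```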
As the problem goes away, the sender will again reach an equilibrium state around which its congestion window will undulate. Whenever the traffic is suspended for a period of time more than the timeout period of the sender, the sender initiates the slow start algorithm where the congestion window is started at the size of a single segment and increased by one segment each time an acknowledgement arrives. When the acknowledgement to the initial segment arrives, TCP increases the congestion window to two, sends two segments and waits. When the two acknowledgements arrive, they each increase the congestion window by one, so TCP can send four segments. Acknowledgements for those will increase the congestion window to eight. Within four round-trip times, TCP can send sixteen segments, soon reaching a limit such as the receiver's advertised window, at which point the congestion window equals the advertised window and further increases cease, or another limit called the slow start threshold size. If the slow start threshold size is reached, the above described exponential increase is stopped and thereafter the increase is made linear. This is an additive increase, compared to slow-start's exponential increase. In other words, slow start dictates an exponential increase which ceases when the slow start threshold size is reached, after which congestion avoidance takes over.
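The growth pattern just described (1, 2, 4, 8, 16 segments, then linear growth past the slow start threshold) can be reproduced with a simplified per-round-trip model. The function name and the simplifications, namely that every segment is acknowledged within one round trip and that the window is capped at the threshold when it is reached, are assumptions of this sketch.

```python
def regular_slow_start(round_trips: int, ssthresh: int, advertised_window: int) -> list[int]:
    """Congestion window (in segments) at the start and after each round trip."""
    cwnd = 1
    history = [cwnd]
    for _ in range(round_trips):
        if cwnd < ssthresh:
            # One acknowledgement per segment, each adding one segment: doubling.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: additive increase of one segment per round trip.
            cwnd += 1
        # The advertised window is a hard upper limit.
        cwnd = min(cwnd, advertised_window)
        history.append(cwnd)
    return history


print(regular_slow_start(4, ssthresh=64, advertised_window=64))
print(regular_slow_start(6, ssthresh=8, advertised_window=64))
```

The first call shows the exponential phase reaching sixteen segments within four round-trip times; the second shows the switch to linear growth once a threshold of eight segments is reached.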
Slow start avoids swamping the Internet with additional traffic immediately after congestion clears, or when a new connection starts. However, the assumption of delays being caused by congestion may not always be correct. As shown in FIG. 3, a mobile host 30 is in communication over a radio link 32 with an access network 34 connected to an Internet 36 by a wired connection 38. The Internet 36 is populated with a plurality of routers, switches, and other devices similar to the situation explained above in connection with FIG. 1. A given packet may traverse the Internet 36 over a path which is different from other packets in the same message. Nonetheless, under ideal conditions, all of the packets of the message arrive at an intended recipient network host 40 on a wired line 42 emerging from the Internet 36. In a case such as shown in FIG. 3, the assumption made for wired networks (such as shown in FIG. 1) that all of the connections are highly reliable is not correct. The wireless link 32 is subjected to a much higher level of error than a wired connection, and the response to congestion designed into the TCP/internet protocol suite described above is not necessarily ideal. In case a lost datagram is due to a problem on the radio link 32 related to radio impairments, it would not be optimum for the mobile host to use the known slow-start recovery algorithm after a stoppage. Also not optimum would be a case of recovery after a deliberate suspension of service. Such might occur for instance in the GPRS suspend state described in European Patent publication number 1161036 entitled "Suspend State" of Kuusinen et al filed May 31, 2000 as EPA 01660087.6. This is a problem that needs to be solved to allow efficient deployment of packet-based services on mobile devices.
In the above-mentioned European Patent Publication No. 1161036, based on EPA 01660087.6 filed May 31, 2000, the problem of slow start caused by GPRS suspend was addressed wherein the inventors proposed a source quench method by advertising a null window.
In the IWTCP (TCP Performance over wireless links) Final Report by A. Gurtov et al published by the University of Helsinki Department of Computer Science, Oct. 24, 2000, it was proposed to use an accelerated slow start with k=2 to reach the congestion avoidance state faster. However, there will then be many packets lost at the end of the slow start. To attempt to address this problem, the authors have recommended using a smaller buffer size at the last hop router to avoid a large number of packet losses. But a reduced buffer would have a negative effect on the throughput in the congestion avoidance state.
There are proposals to split the original TCP connection into separate connections for the wired and wireless parts of the path. See "Implementation and Performance Evaluation of Indirect-TCP", by A. Bakre et al, IEEE Transactions on Computers, Vol. 46, No. 3, March 1997, pp. 260-78. On the wireless part, a protocol optimized for error recovery may be used. Drawbacks of that approach include violation of the end-to-end TCP semantics, since acknowledgements may reach the sender before the data reaches its destination, significant overhead caused by the back-to-back processing and considerable per-connection state maintenance.
An object of the present invention is to provide a solution to the above-mentioned problem of an inefficient slow-start procedure in a recovery following a non-congestion situation.
According to a first aspect of the present invention, a method, for increasing traffic from a sender to a receiver in a communications system after a period of packet delay or loss existing in a connection between said sender and said receiver by starting a congestion window at a size of a single segment and increasing the congestion window by one segment each time an acknowledgement arrives, is characterized by the sender accelerating the starting after a period of detrimental radio conditions existing on a radio link of the connection concurrent with the period of packet delay or loss existing in the connection.
Further according to the first aspect of the invention, the method is characterized by the sender accelerating the starting by at least one of starting the congestion window at a first size greater than a single segment, and increasing the congestion window by a second size greater than a single segment each time an acknowledgement arrives.
Further still according to the first aspect of the invention, the method is characterized by ceasing the increasing the congestion window by the second size greater than a single segment upon the congestion window reaching a size equal to or greater than a slow start threshold size. A transition may then be made from the sender accelerating the starting to congestion avoidance procedure.
Further in accord with the first aspect of the invention, the method is characterized by the sender accelerating the starting after a period of deliberate suspension of service over the connection between the sender and the receiver.
Yet further in accord with the first aspect of the invention, the method is characterized by the receiver comprising a mobile host, and by signaling existence of the detrimental radio condition from the mobile host to the sender.
Yet further still in accord with the first aspect of the invention, the method is characterized by the receiver comprising a mobile host, and by signaling existence of the detrimental radio condition from a radio access network (RAN) to the sender. The term "radio access network" is used generically to include the third generation (3G) RAN, as well as network nodes or elements that are not classified as part of the 3G RAN. Examples are GPRS SGSN or GGSN.
Further still according to the first aspect of the invention, the method is characterized by the receiver comprising a mobile host, by signaling existence of detrimental radio conditions from a radio access network to the mobile host, and by signaling from the mobile host to the sender via the radio access network.
Yet further still according to the first aspect of the invention, the method is characterized by the sender accelerating the starting by increasing the congestion window by a size greater than a single segment each time an acknowledgement arrives and by dividing the congestion window by a factor upon a timeout or packet loss to obtain a reduced slow-start threshold indicative of a point at which a transition from the accelerating the startup to congestion avoidance is needed.
According to a second aspect of the invention, a sending device able to recover from a condition of packet delay or loss existing in a connection between the sending device and a receiver by starting a congestion window at a selected size and increasing the congestion window each time an acknowledgement arrives, is characterized by means for identifying a period of a detrimental radio condition existing on a radio link of the connection, and by means for accelerating the starting after a period of the detrimental radio condition existing on the radio link of the connection concurrent with the period of packet delay or loss existing in the connection.
Further still according to the second aspect of the invention, the means for accelerating the slow start is characterized by a means for ceasing the increasing the congestion window by the second size greater than a single segment upon the congestion window reaching a size equal to or greater than a slow start threshold size.
Still further according to the second aspect of the invention, the means for accelerating the slow start is characterized by a means for transitioning from the sender accelerating the starting to a congestion avoidance procedure.
Further in accord with the second aspect of the invention, the sending device is characterized by the means for accelerating the starting after a period of deliberate suspension of service over the connection between the sending device and the receiver.
Still further in accord with the second aspect of the invention, the sending device is characterized by the receiver comprising a mobile host, and by the mobile host signaling the sending means the existence of the detrimental radio conditions.
Further still in accord with the second aspect of the invention, the sending device is characterized by the receiver comprising a mobile host, and by the existence of the detrimental radio conditions signaled from a radio access network to the sending device.
Yet further still in accord with the second aspect of the invention, the sending device is characterized by the receiver comprising a mobile host, by signaling existence of the detrimental radio conditions from a radio access network to the mobile host, and by signaling from the mobile host to the sending device via the radio access network.
In still further accord with the second aspect of the invention, the sending device is characterized by said receiver comprising a mobile host, by existence of the detrimental radio conditions signaled from a radio access network to the mobile host, and by the existence of the detrimental radio conditions signaled from the mobile host to the sending device via the radio access network.
Further still according to the second aspect of the invention, the sending device is characterized by means for the sender to accelerate the starting by increasing the congestion window by a size greater than a single segment each time an acknowledgement arrives and by means for dividing the congestion window by a factor upon timeout or packet loss to obtain a reduced slow start threshold indicative of a point at which a transition from the accelerating the starting to congestion avoidance is needed.
According to a third aspect of the invention, a communications system comprising a plurality of hosts able to communicate with datagrams sent via a network in between the hosts, wherein a sending host increases traffic to a receiving host after a period of packet delay or loss existing in a connection between the sending host and the receiving host by starting a congestion window at a selected size and increasing the congestion window each time an acknowledgement arrives from the receiving host is characterized by the sending host having means for accelerating the starting after a period of detrimental radio conditions existing on a radio link of the connection concurrent with the period of packet delay or loss in the connection.
Further according to the third aspect of the invention, the system is characterized by the sending host accelerating the starting after a period of deliberate suspension of service over the connection between the sending host and the receiving host.
Still further according to the third aspect of the invention, the system is characterized by the receiving host comprising a mobile host, and by means for signaling existence of the detrimental radio conditions from the mobile host to the sending host.
Further still according to the third aspect of the invention, the system is characterized by the receiving host comprising a mobile host, and by means for signaling existence of the detrimental radio conditions from a radio access network to the sending host.
Yet further still according to the third aspect of the invention, the system is characterized by the receiving host comprising a mobile host, and by means for signaling existence of the detrimental radio conditions from a radio access network to the mobile host, and by means for signaling from the mobile host to the sending host via the radio access network.
The invention helps the TCP sender increase its data rate faster in slow-start when it is known that radio conditions or a deliberate suspension by design, and not network congestion, caused the sender to go to slow-start. Since the congestion avoidance state yields much better throughput than slow-start, reaching it sooner improves overall TCP throughput.
Specifically, the TCP sender, upon notification that radio conditions (such as link losses or GPRS suspend) caused the interruption, leading to slow-start, will perform an accelerated slow-start instead of a regular slow-start. In the accelerated slow-start, for instance, the window is increased by k at every acknowledgement received, with k greater than 1. In addition, when there is a packet loss during slow start, the slow start threshold is recalculated in that it is set equal to the congestion window divided by L. L is suggested to be (k+1) and not two, as in the case of regular slow start. However, other values of L are possible. During the congestion avoidance phase, however, the window is still reduced by half irrespective of the mechanism used during slow start. Otherwise, the TCP sender performs regular slow-start, i.e., increasing the congestion window by one for every acknowledgement. Regular slow-start is also performed at the beginning of the TCP connection.
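The accelerated slow-start described above can be sketched as follows, with the window counted in segments. The class `SlowStartSender`, the helper `run_round_trips`, the integer division used for the threshold, and the assumption that every segment is acknowledged within one round trip are all illustrative choices, not part of the described method. The text does not state what happens to the congestion window itself on such a loss, so the sketch leaves it unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlowStartSender:
    """Hypothetical sender; k=1 is regular slow-start, k>1 is accelerated."""
    k: int = 1
    cwnd: int = 1
    ssthresh: int = 64
    _ca_acks: int = 0  # acknowledgements counted during congestion avoidance

    def on_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            # Slow-start: grow by k segments per acknowledgement.
            self.cwnd += self.k
        else:
            # Congestion avoidance: one segment per full window of acknowledgements.
            self._ca_acks += 1
            if self._ca_acks >= self.cwnd:
                self.cwnd += 1
                self._ca_acks = 0

    def on_loss_during_slow_start(self, divisor: Optional[int] = None) -> None:
        # Threshold = congestion window / L, with L = k + 1 suggested in the text.
        L = self.k + 1 if divisor is None else divisor
        self.ssthresh = max(1, self.cwnd // L)


def run_round_trips(sender: SlowStartSender, round_trips: int) -> list[int]:
    """Deliver one acknowledgement per segment in the window, once per round trip."""
    history = [sender.cwnd]
    for _ in range(round_trips):
        for _ in range(sender.cwnd):  # window size at the start of the round trip
            sender.on_ack()
        history.append(sender.cwnd)
    return history


regular = SlowStartSender(k=1, ssthresh=100)
accelerated = SlowStartSender(k=2, ssthresh=100)
print(run_round_trips(regular, 4))
print(run_round_trips(accelerated, 4))
```

Under these assumptions the window grows by a factor of (k+1) per round trip, which suggests why dividing by L = k+1 on a loss returns the threshold to roughly the window size of the last round trip that completed without loss, just as dividing by two does for regular slow-start.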
The present invention preserves the cautious probing of TCP slow-start when slow-start is caused by network congestion, but speeds up the transition from slow-start to the above-described congestion avoidance (multiplicative decrease) when slow-start is caused by radio conditions (link losses or GPRS suspend, for instance). No change is required at the TCP receiver, and interoperability is fully preserved. No state maintenance is needed in the network, and the end-to-end semantics of TCP is preserved. The principle is broadly applicable to various access networks, including EDGE, CDMA 2000 and WCDMA.
Although there is the disadvantage of needing to implement an accelerated slow-start at the TCP sender, which is a new requirement compared to standard TCP, this is a small cost to pay for the increased efficiency and the solution to the problem presented above, given that no other change is required in the Internet or in the receiving host. There is a very slight risk that network congestion could occur shortly after the radio condition leading to slow-start. In that case, accelerated slow-start may not be as optimal as regular slow-start, but the negative effect on network congestion is mitigated, since there is still a gradual and cautious increase in the data rate during accelerated slow-start. Furthermore, the chance of network congestion happening at roughly the same time as detrimental radio conditions is small.
These and other objects, features and advantages of the present invention will become more apparent in light of the following detailed description of a best mode embodiment thereof, as illustrated in the accompanying drawing.