The present invention relates generally to the field of telecommunications. More specifically, the present invention relates to congestion control mechanisms in an internetwork using window adaptation.
Congestion occurs in a packet network when the requirements of the source(s) exceed the transport capability of the network or the reception capability of the receiver. For example, congestion occurs when multiple senders transmit packets to a network switch faster than the switch can forward the packets. Congestion results in the loss of packets due to a lack of buffering space in the switch.
Congestion control in a packet network is often performed using a window-based protocol between the source and destination. For example, congestion control in the current Internet is performed primarily by the Transmission Control Protocol (TCP). The TCP protocol uses a flow control algorithm between the sender and the receiver, where acknowledgments (acks) from the receiver are used to adjust a sliding window for the sender. Rather than sending a packet and waiting for an ack from the receiver before sending another packet, the sender keeps track of the total number of unacknowledged packets sent and continues to transmit packets as long as the number of unacknowledged packets does not exceed a specified window size. This window size determines the maximum number of packets that a receiver permits the sender to send to the receiver so that in the worst case, even if all the packets arrived at the receiver from the sender at once, the packets would not be lost at the receiver. The overall network capacity, however, can impose a lower limit on the amount of traffic that can be handled. The sender estimates the capacity of the network by probing it and adjusts the window size accordingly. As long as there is no loss, the window size is gradually increased. When a loss occurs, the window size is reduced. Subsequently, the window size is expanded slowly. The sender can identify that a packet has been lost due to congestion either by the arrival of duplicate acks indicating a loss or by the absence of an ack within a timeout interval. This entire process of controlling the window size to limit congestion is known as flow control.
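The sliding-window rule described above can be illustrated with a minimal sketch. It is not TCP itself: it counts whole packets rather than bytes, and the class name `WindowedSender` and its methods are hypothetical names chosen for illustration only.

```python
class WindowedSender:
    """Illustrative sender that keeps at most `window_size` packets unacknowledged."""

    def __init__(self, window_size):
        self.window_size = window_size  # limit granted by the receiver (assumed fixed here)
        self.next_seq = 0               # sequence number of the next new packet
        self.unacked = set()            # packets sent but not yet acknowledged

    def send_available(self, total_packets):
        """Send packets until the window is full; return the sequence numbers sent."""
        sent = []
        while len(self.unacked) < self.window_size and self.next_seq < total_packets:
            self.unacked.add(self.next_seq)
            sent.append(self.next_seq)
            self.next_seq += 1
        return sent

    def on_ack(self, seq):
        """An ack frees one slot in the window."""
        self.unacked.discard(seq)


sender = WindowedSender(window_size=3)
first_burst = sender.send_available(total_packets=10)   # window fills with 3 packets
sender.on_ack(0)                                         # one ack arrives
second_burst = sender.send_available(total_packets=10)  # exactly one more packet may go
```

The sender never waits for an individual ack before sending the next packet; it stalls only when the number of unacknowledged packets reaches the window size.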
An important aspect of TCP congestion control mechanisms is that they do not assume any support from the network for explicit signaling of the congestion state. TCP infers the congestion state of the network from the arrival of acks, timeouts, and receipt of duplicate acks.
A number of compensation schemes can be used to reduce the window size upon detection of congestion and to gradually increase the window size back to the edge of congestion-free operation. Such compensation schemes include the slow-start algorithm, fast-retransmit, and fast recovery. These schemes are described in detail in TCP/IP Illustrated, Vol. I, by Richard Stevens, Addison Wesley, 1994, which is incorporated by reference herein.
For example, under the slow-start algorithm, if the window size was one hundred packets when congestion was detected, the TCP protocol reduces the window size to one; the lost packet or packets are then retransmitted and the window size is expanded by one upon receipt of each subsequent acknowledgment, i.e., the indication of successful receipt of the transmitted packet at the destination.
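The window evolution in this example can be traced with a short sketch. It follows only the rule stated above (reset to one, then add one per ack); the function name `slow_start_after_loss` and the `max_window` cap are assumptions introduced for illustration, and a real TCP stack tracks additional state such as a slow-start threshold.

```python
def slow_start_after_loss(acks_received, max_window):
    """Return the window size after each ack, starting from the post-loss reset to 1."""
    window = 1                      # window collapses to one packet on congestion
    history = [window]
    for _ in range(acks_received):
        # Each acknowledgment expands the window by one packet,
        # up to the assumed receiver-imposed maximum.
        window = min(window + 1, max_window)
        history.append(window)
    return history


trace = slow_start_after_loss(acks_received=4, max_window=100)
print(trace)  # [1, 2, 3, 4, 5]
```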
The fast-retransmit mechanism improves upon the slow-start algorithm described above. The fast-retransmit mechanism retransmits a packet when three duplicate acks are received at the source, without waiting for the timeout period to expire. The reception of three duplicate acks is interpreted as a congestion indication. When three duplicate acks arrive, the fast-recovery algorithm reduces the window size by half rather than the more severe reduction to a window size of one as in the slow-start algorithm.
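As a rough illustration of the duplicate-ack rule, the sketch below counts repeated ack numbers and, on the third duplicate, halves the window and reports which packet to retransmit. The class name `DupAckDetector` is hypothetical, and the sketch assumes a cumulative ack number that identifies the next packet the receiver expects; timers and the window inflation performed by full fast recovery are omitted.

```python
DUP_ACK_THRESHOLD = 3  # three duplicate acks are read as a congestion indication


class DupAckDetector:
    """Illustrative fast-retransmit trigger with a halved window on loss."""

    def __init__(self, window):
        self.window = window
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no):
        """Return the packet number to retransmit, or None if no action is needed."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                # Halve the window instead of collapsing it to one packet.
                self.window = max(1, self.window // 2)
                return ack_no  # the receiver is still waiting for this packet
            return None
        # A new ack number resets the duplicate counter.
        self.last_ack = ack_no
        self.dup_count = 0
        return None


detector = DupAckDetector(window=100)
actions = [detector.on_ack(n) for n in (5, 5, 5, 5)]  # one original ack, then three duplicates
```

After the fourth ack in this run, `actions` ends with a retransmission request for packet 5 and `detector.window` is 50, without any timeout having expired.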
These known congestion mechanisms, however, are problematic when used in an internetwork containing a combination of rate-controlled and non-rate-controlled segments. Such an internetwork can include, for example, a set of TCP-based end systems attached to non-rate-controlled network segments running the Internet Protocol (IP), with these segments, in turn, interconnected by a rate-controlled Asynchronous Transfer Mode (ATM) network segment. In the absence of congestion-related losses, the TCP congestion window grows up to the maximum window size permitted by the destination. Rate control in the rate-controlled segment enables the maintenance of small queue lengths in the ATM switches within the ATM segment. The rate of the TCP traffic, however, may not be matched precisely to the rate provided by the ATM segment. Consequently, most of the packets transmitted within a given TCP window are buffered in the routers coupling the rate-controlled segment to the non-rate-controlled segments. Severe congestion and degraded throughput can result due to packet losses at the router.
One known system, Packet Shaper by Packeteer, Inc., provides rate control to alleviate congestion resulting from bursty traffic within a network using TCP. This system intercepts TCP acks and delays the forwarding of these acks while also substituting a different window size, so that what would otherwise be a bursty traffic flow becomes a smooth traffic flow. The specific manner in which this system updates the TCP window size is not disclosed. This known system attempts to avoid the excessive buffering at the router which would otherwise cause packet loss at the router.
It would be beneficial to avoid excessive buffering at the router, which would otherwise cause packet loss, without delaying the forwarding of acks and without modifying the TCP protocol semantics.
The present invention can control congestion in an internetwork having at least one rate-controlled network segment and at least one non-rate-controlled network segment coupled by a router, so as to prevent large queues of packets from accumulating in the router and potentially causing congestion and buffer overflows in the router. The window sizes of connections passing through the routers are controlled based on the congestion level in the routers, so as to control the flow of packets into the internetwork, thereby controlling congestion. This results in improved throughput and fairness in the internetwork while minimizing losses due to buffer overflows in the routers.
The window size of a connection passing through a router can be explicitly controlled by the router by modifying the acknowledgments returned by the destination of the connection to its source. Connection-level state information need not be maintained in the router; knowledge of the bandwidth available to the connection or the delay incurred by its packets in the internetwork is not necessary. This is particularly applicable to connections using the Transmission Control Protocol (TCP), and is compatible with existing TCP implementations.
When a router receives an acknowledgment belonging to a connection passing through it, the router adaptively determines a second window size for the connection based on the buffer occupancy of the router. If the window size that is specified within the acknowledgment exceeds the second window size that is adaptively determined by the router, the window-size field in the acknowledgment is overwritten with the second window size. The router then forwards the acknowledgment to the source of the connection. The source of the connection, in turn, can adjust the flow of packets into the internetwork based on the window size specified in the acknowledgments, thus adapting to changes in the congestion level at the router.
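The acknowledgment-rewriting step described above might be sketched as follows. The text here does not give the rule by which the router derives the second window size from buffer occupancy, so `router_window` uses a purely hypothetical rule (the free buffer space, with a floor of one packet) as a stand-in. The dictionary representation of an ack and the function names are likewise assumptions for illustration.

```python
def router_window(buffer_capacity, buffer_occupancy, min_window=1):
    """Hypothetical rule: allow as many packets as there are free buffer slots."""
    free_slots = max(0, buffer_capacity - buffer_occupancy)
    return max(min_window, free_slots)


def rewrite_ack(ack, buffer_capacity, buffer_occupancy):
    """Clamp the window field of an ack to the router's window; never enlarge it."""
    limit = router_window(buffer_capacity, buffer_occupancy)
    if ack["window"] > limit:
        # Overwrite only when the destination advertises more than the router can absorb.
        return {**ack, "window": limit}
    return ack  # forwarded unchanged


# Router buffer holds 100 packets and is currently 80 percent full.
congested = rewrite_ack({"ack_no": 7, "window": 64}, buffer_capacity=100, buffer_occupancy=80)
untouched = rewrite_ack({"ack_no": 7, "window": 10}, buffer_capacity=100, buffer_occupancy=80)
```

In this run the first ack leaves the router advertising a window of 20 instead of 64, while the second passes through with its window of 10 intact. No per-connection state is kept: the decision uses only the ack itself and the current buffer occupancy. In an actual TCP header, changing the window field would also require updating the TCP checksum, which this sketch does not model.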