1. Field
The disclosed methods and systems relate to congestion control in data networks. They have particular but not exclusive application to networks on which data is communicated using a transport layer protocol that is a modification of, and is compatible with, the standard transmission control protocol (TCP).
2. Description of Relevant Art
A problem in the design of networks is the development of congestion control algorithms. Congestion control algorithms are deployed for two principal reasons: to ensure avoidance of network congestion collapse, and to ensure a degree of network fairness. Put simply, network fairness refers to the situation whereby a data source receives a fair share of available bandwidth, whereas congestion collapse refers to the situation whereby an increase in network load results in a decrease of useful work done by the network (usually due to retransmission of data).
Note that in this context “fairness” of access to a network does not necessarily mean equality of access. Instead, it means a level of access appropriate to the device in question. Therefore, it may be deemed fair to provide a high-speed device with a greater level of access to a channel than a slow device because this will make better use of the capacity of the channel.
Past attempts to deal with network congestion resulted in the widely applied transmission control protocol. While the current TCP congestion control algorithm has proved remarkably durable, it is likely to be less effective on forthcoming networks that will feature gigabit-speed links and small buffer sizes. It may also be less effective where data is transmitted over long distances and comprises heterogeneous traffic originating from heterogeneous sources. These considerations have led to widespread acceptance that new congestion control algorithms must be developed to accompany the development of networking systems.
The task of developing such algorithms is not straightforward. In addition to the requirements discussed above, fundamental requirements of congestion control algorithms include efficient use of bandwidth, fair allocation of bandwidth among sources, and rapid reallocation of bandwidth by the network as required. These requirements must be met while respecting key constraints including decentralized design (TCP sources have restricted information available to them), scalability (the qualitative properties of networks employing congestion control algorithms should be independent of the size of the network and of a wide variety of network conditions) and suitable backward compatibility with conventional TCP sources.
To place the disclosed methods and systems in context, the existing TCP network model will now be described. The TCP standard defines a variable cwnd that is called the “congestion window”. This variable determines the number of unacknowledged packets that can be in transit at any time; that is, the number of packets in the “pipe” formed by the links and buffers in a transmission path. When the window size is exhausted, the source must wait for an acknowledgement (ACK) before sending a new packet. Congestion control is achieved by dynamically varying cwnd according to an additive-increase multiplicative-decrease (AIMD) law. The aim is for a source to probe the network gently for spare capacity and to back off its send rate rapidly when congestion is detected. A cycle that involves an increase and a subsequent back-off is termed a “congestion epoch”. The second part is referred to as the “recovery phase”.
In the congestion-avoidance phase, when a source i receives an ACK packet, it increments its window size $\mathrm{cwnd}_i$ according to the additive increase law:
$$\mathrm{cwnd}_i \rightarrow \mathrm{cwnd}_i + \alpha_i / \mathrm{cwnd}_i \qquad (1)$$
where $\alpha_i = 1$ for standard TCP. Consequently, the source gradually ramps up its congestion window as the number of packets successfully transmitted grows. By keeping track of the ACK packets received, the source can infer when packets have been lost en route to the destination. Upon detecting such a loss, the source enters the fast recovery phase. The lost packets are retransmitted and the window size $\mathrm{cwnd}_i$ of source i is reduced according to:
$$\mathrm{cwnd}_i \rightarrow \beta_i \, \mathrm{cwnd}_i \qquad (2)$$
where $\beta_i = 0.5$ for standard TCP. It is assumed that multiple drops within a single round-trip time lead to a single back-off action. When receipt of the retransmitted lost packets is eventually confirmed by the destination, the source re-enters the congestion avoidance phase, adjusting its window size according to equation (1). In summary, on detecting a dropped packet (which the algorithm assumes is an indication of congestion on the network), the TCP source reduces its send rate. It then begins to gradually increase the send rate again, probing for available bandwidth. A typical window evolution is depicted in FIG. 1 ($\mathrm{cwnd}_i$ at the time of detecting congestion is denoted by $w_i$ in FIG. 1).
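As an illustration only, the two update rules of equations (1) and (2) can be sketched in Python. The function names `on_ack`, `on_loss` and `one_rtt_of_acks` are hypothetical and are not taken from any TCP implementation; the sketch also assumes that roughly one window of ACKs arrives per round-trip time.

```python
def on_ack(cwnd: float, alpha: float = 1.0) -> float:
    """Additive increase, equation (1): applied once per ACK received."""
    return cwnd + alpha / cwnd


def on_loss(cwnd: float, beta: float = 0.5) -> float:
    """Multiplicative decrease, equation (2): applied once per congestion event."""
    return beta * cwnd


def one_rtt_of_acks(cwnd: float, alpha: float = 1.0) -> float:
    """Apply equation (1) once for each ACK of one window.

    Assumption: int(cwnd) ACKs arrive during one round-trip time.
    """
    for _ in range(int(cwnd)):
        cwnd = on_ack(cwnd, alpha)
    return cwnd


if __name__ == "__main__":
    w = 100.0
    print(one_rtt_of_acks(w))  # close to w + alpha, i.e. just under 101
    print(on_loss(w))          # half the window for beta = 0.5
```

Under these assumptions, a full window of ACKs raises the window by slightly less than αi packets, while a single loss event halves it.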
Over the kth congestion epoch three important events can be discerned from FIG. 1. (A congestion epoch is defined here as a sequence of additive increases ending with one multiplicative decrease of cwnd.) These are indicated by $t_a(k)$, $t_b(k)$ and $t_c(k)$ in FIG. 1. The time $t_a(k)$ is the time at which the number of unacknowledged packets in the pipe equals $\beta_i w_i(k)$. $t_b(k)$ is the time at which the pipe is full so that any packets subsequently added will be dropped at the congested queue. $t_c(k)$ is the time at which packet drop is detected by the sources. Time is measured in units of round-trip time (RTT). RTT is the time taken between a source sending a packet and receiving the corresponding acknowledgement, assuming no packet drop. Equation (1) corresponds to an increase in $\mathrm{cwnd}_i$ of $\alpha_i$ packets per RTT.
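The timing of a single congestion epoch can be sketched, by way of example only, as a window trace sampled once per RTT. The function `epoch_trace` and its parameters `pipe_size` and `detect_delay` are hypothetical. The sketch assumes a single source, a fixed pipe capacity, and that the drop is detected one RTT after the pipe fills; none of these assumptions is stated in the description above.

```python
def epoch_trace(w_start: float, pipe_size: float,
                alpha: float = 1.0, detect_delay: int = 1):
    """Return (trace, t_b, t_c) for one epoch, with t_a taken as time 0.

    trace[t] is the window, in packets, t RTTs after t_a.
    t_b is the first RTT at which the window reaches pipe_size.
    t_c is t_b plus detect_delay (assumed drop-detection delay in RTTs).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    trace = [w_start]
    # Additive increase of alpha packets per RTT until the pipe is full.
    while trace[-1] < pipe_size:
        trace.append(trace[-1] + alpha)
    t_b = len(trace) - 1
    # The source keeps increasing until the drop is detected.
    for _ in range(detect_delay):
        trace.append(trace[-1] + alpha)
    t_c = len(trace) - 1
    return trace, t_b, t_c


if __name__ == "__main__":
    trace, t_b, t_c = epoch_trace(w_start=50.0, pipe_size=100.0)
    print(t_b, t_c, trace[-1])  # window at t_c is the peak w(k+1)
    print(0.5 * trace[-1])      # next epoch starts from beta * w(k+1)
```

With a starting window of 50 packets, a 100-packet pipe and αi = 1, the pipe fills 50 RTTs after the start of the epoch and the back-off occurs one RTT later.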
The foregoing discussion relates to AIMD sources where the increase and decrease parameters $\alpha_i$ and $\beta_i$ are constant (although they may differ for each source). A number of recent proposals for high-speed networks vary the rate of increase and decrease as functions of window size or other values. The approach is readily extended to these protocols by extending the model to include time-varying parameters $\alpha_i(k)$ and $\beta_i(k)$ and defining the model increase parameter to be an effective value, for example such that
$$\alpha_i(k) = \frac{w_i(k+1) - \beta_i w_i(k)}{t_c(k) - t_a(k)}$$
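By way of a worked example, the effective increase parameter can be computed directly from the quantities in the expression above. The function name `effective_alpha` and the sample values are hypothetical and are chosen only to make the arithmetic easy to follow.

```python
def effective_alpha(w_next: float, w_k: float, beta: float,
                    t_a: float, t_c: float) -> float:
    """Effective increase rate over epoch k, in packets per RTT.

    w_next : window at the end of epoch k, i.e. w(k+1)
    w_k    : window at the previous congestion event, w(k)
    beta   : back-off factor applied at the start of the epoch
    t_a    : time (in RTTs) at which the pipe holds beta * w(k) packets
    t_c    : time (in RTTs) at which the drop is detected
    """
    if t_c <= t_a:
        raise ValueError("t_c must be later than t_a")
    return (w_next - beta * w_k) / (t_c - t_a)


if __name__ == "__main__":
    # Window backs off from 100 to 50, then climbs to 110 over 30 RTTs:
    # (110 - 50) / 30 = 2 packets per RTT.
    print(effective_alpha(w_next=110.0, w_k=100.0, beta=0.5, t_a=0.0, t_c=30.0))
```

In this example a source whose window grows non-linearly within the epoch is modelled as if it had increased by a constant 2 packets per RTT.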
The current TCP congestion control algorithm described above may be inefficient on modern high-speed and long-distance networks. On such links, the window sizes can be very large (perhaps tens of thousands of packets). Following a congestion event, the window size is halved and subsequently only increased by one packet per round-trip time. Thus, it can take a substantial time for the window size to recover, during which time the send rate is well below the capacity of the link. One possible solution is to simply make the TCP increase parameter αi larger, thereby decreasing the time to recover following a congestion event and improving the responsiveness of the TCP flow. Unfortunately, this direct solution is inadmissible because of the requirement on lower speed networks for backward compatibility and fairness with existing TCP traffic. The requirement is thus for αi to be large in high-speed networks but unity in low-speed ones, naturally leading to consideration of some form of mode switch. However, mode switching creates the potential for introducing undesirable dynamic behaviors in otherwise well-behaved systems, and any re-design of TCP therefore needs to be carried out with due regard to such issues.
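The scale of the recovery problem can be illustrated with a short calculation. The function `recovery_time` is hypothetical, and the window size of 20,000 packets and RTT of 100 ms are assumed values chosen for illustration rather than figures from any measured network.

```python
def recovery_time(w_peak: float, rtt_s: float,
                  alpha: float = 1.0, beta: float = 0.5):
    """Return (rtts, seconds) needed to climb from beta * w_peak back to w_peak.

    Assumes a constant additive increase of alpha packets per RTT.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rtts = (1.0 - beta) * w_peak / alpha
    return rtts, rtts * rtt_s


if __name__ == "__main__":
    # Standard TCP (alpha = 1, beta = 0.5) on an assumed long, fast path.
    rtts, seconds = recovery_time(w_peak=20_000, rtt_s=0.1)
    print(rtts, seconds)  # about 10,000 RTTs, on the order of 1,000 seconds

    # A larger increase parameter shortens recovery proportionally.
    print(recovery_time(w_peak=20_000, rtt_s=0.1, alpha=10.0))
```

Under these assumed values, standard TCP needs roughly a quarter of an hour to regain its pre-loss window, which is why a larger αi is attractive on high-speed paths.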