Of the many protocols available for transferring data across networks such as the Internet, TCP/IP (which stands for Transmission Control Protocol/Internet Protocol) is the one most widely accepted. TCP ensures reliable transfers by transmitting data in separate packets. More precisely, TCP segments the data, and a TCP segment with an IP header forms an IP packet. The size of a TCP segment is bounded by the Maximum Segment Size (MSS), which is connection-dependent and defaults to 536 bytes. A “sliding window” or “handshake” protocol partitions the transmission into three distinct phases. The first phase represents data ready to be sent. The second phase represents data that is either in transit or has arrived but has not yet been acknowledged. The third phase represents data that arrived successfully and has been acknowledged. Thus a 256 Kb file is broken into a number of packets, each of which passes sequentially through all three phases. A thorough discussion of TCP/IP appears in Comer, Internetworking with TCP/IP, Volume I: Principles, Protocols, and Architecture, Third Edition.
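The segmentation and the three phases described above can be sketched as follows. This is only an illustration: the function names are hypothetical, the file size is read as 256 × 1024 bytes (an assumption about the "256 Kb" figure), and header overhead is ignored.

```python
def segment_sizes(total_bytes: int, mss: int = 536) -> list:
    """Return the payload size of each segment needed to carry total_bytes."""
    if total_bytes < 0 or mss <= 0:
        raise ValueError("total_bytes must be >= 0 and mss must be > 0")
    full, rest = divmod(total_bytes, mss)
    # Every segment is MSS-sized except possibly a shorter final one.
    return [mss] * full + ([rest] if rest else [])


def window_phases(n_segments: int, acked: int, sent: int):
    """Split segment indices into the three phases of the sliding window.

    acked: number of segments already acknowledged (third phase)
    sent:  number of segments transmitted so far, acknowledged or not
    """
    if not 0 <= acked <= sent <= n_segments:
        raise ValueError("expected 0 <= acked <= sent <= n_segments")
    acknowledged = range(0, acked)
    unacknowledged = range(acked, sent)       # in transit or awaiting an ACK
    ready_to_send = range(sent, n_segments)   # not yet transmitted
    return acknowledged, unacknowledged, ready_to_send


sizes = segment_sizes(256 * 1024)  # assumed reading of the file size
print(len(sizes), sizes[-1])       # 490 segments, the last carrying 40 bytes
```

Each segment index moves from the ready range, through the unacknowledged range, into the acknowledged range as `sent` and `acked` advance.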
Not surprisingly, data traffic over networks can be susceptible to congestion. Congestion is a condition of severe delay caused by an overload of packets (or datagrams) at one or more switching points (e.g. at routers). When congestion occurs, delays increase and the router begins to enqueue packets until it can route them. Each router has a finite storage capacity and packets or datagrams have to compete for that storage (i.e. in a datagram-based Internet, there is no preallocation of resources to individual TCP connections). In the worst case, the total number of packets or datagrams arriving at the congested router grows until the router reaches capacity and starts to drop packets.
End points of a communication connection do not usually know the details of where congestion has occurred or why. To them, congestion simply means an increased delay or lost packets. Unfortunately, most transport protocols use timeout and retransmission, so they respond to increased delay or loss by retransmitting packets or datagrams. Retransmissions aggravate congestion instead of alleviating it. If unchecked, the increased traffic will produce increased delay, leading to increased traffic, and so on, until the network becomes useless. This condition is known as “congestion collapse”.
To avoid congestion collapse, TCP must reduce transmission rates when congestion occurs. Routers watch queue lengths and use certain techniques to inform hosts that congestion has occurred, but transport protocols like TCP can help avoid congestion by reducing transmission rates automatically whenever delays occur. Of course, algorithms to avoid congestion must be constructed carefully because even under normal operating conditions networks can exhibit wide variation in round trip delays.
To avoid congestion, the TCP standard now recommends using two techniques: slow-start and multiplicative decrease. These techniques are related and can be implemented easily. For each TCP connection, TCP must remember the size of the receiver's window (the “window” being the buffer size advertised in acknowledgements). To control congestion, TCP maintains a second limit, called the “congestion window limit” or simply the “congestion window”. At any time, TCP acts as if the window size is the smaller of (1) the receiver's window, and (2) the congestion window.
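The rule that TCP uses the smaller of the two windows can be written in a few lines. The sketch below is illustrative only; the function names and the notion of an "unacknowledged" count passed in by the caller are assumptions made for the example.

```python
def effective_window(receiver_window: int, congestion_window: int) -> int:
    """Window TCP actually uses: the smaller of the two limits."""
    return min(receiver_window, congestion_window)


def sendable(unacknowledged: int, receiver_window: int, congestion_window: int) -> int:
    """How much more data may be sent right now (same units as the windows)."""
    allowed = effective_window(receiver_window, congestion_window)
    # Data already sent but not yet acknowledged counts against the window.
    return max(0, allowed - unacknowledged)


print(effective_window(receiver_window=32, congestion_window=8))            # 8
print(sendable(unacknowledged=5, receiver_window=32, congestion_window=8))  # 3
```

On a non-congested connection the congestion window is at least as large as the receiver's window, so the receiver's advertisement is the limit that applies.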
In the steady state on a non-congested connection, the congestion window is the same size as the receiver's window. Reducing the congestion window size reduces the traffic TCP will inject into the connection. To estimate congestion window size, TCP assumes that most packet or datagram loss comes from congestion and uses the following strategy:
Multiplicative Decrease Congestion Avoidance: Upon loss of a segment, reduce the congestion window by half (down to a minimum of at least one segment). For those segments that remain in the allowed window, back off the retransmission timer exponentially.
Because TCP reduces the congestion window by half for every loss, it decreases the window exponentially if loss continues. In other words, if congestion continues, TCP reduces the volume of traffic exponentially and the rate of retransmission exponentially. If loss continues, TCP eventually limits transmission to a single packet or datagram and continues to double timeout values before retransmitting. The idea is to provide a quick and significant traffic reduction to allow routers enough time to clear the packets or datagrams already in their queues.
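The effect of repeated losses can be illustrated with a short sketch. It is not taken from any particular TCP implementation: the function name is hypothetical, the window is counted in whole segments, and the upper bound on the timer (`max_timeout`) is an assumption added so the example stays bounded.

```python
def on_loss(cwnd: int, timeout: float, max_timeout: float = 64.0):
    """Apply multiplicative decrease and exponential timer backoff after a loss.

    cwnd is the congestion window in segments, timeout is in seconds.
    max_timeout is an assumed cap, not something stated in the text.
    """
    new_cwnd = max(1, cwnd // 2)                  # halve, never below one segment
    new_timeout = min(timeout * 2, max_timeout)   # double the retransmission timer
    return new_cwnd, new_timeout


cwnd, timeout = 16, 1.0
for _ in range(5):
    cwnd, timeout = on_loss(cwnd, timeout)
    print(cwnd, timeout)
# Prints 8 2.0, 4 4.0, 2 8.0, 1 16.0, 1 32.0: the window bottoms out at one
# segment while the timer keeps doubling.
```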
To recover from congestion, one might think that this process is simply reversed and that the congestion window is doubled when traffic begins to flow again. However, doing so produces an unstable system that oscillates wildly between no traffic and congestion. Instead, TCP uses a technique called slow-start to scale up transmission:
Slow-Start (Additive) Recovery: Whenever starting traffic on a new connection or increasing traffic after a period of congestion, start the congestion window at the size of a single segment and increase the congestion window by one segment each time an acknowledgement arrives.
Slow-start avoids swamping the Internet with additional traffic immediately after congestion clears or when new connections suddenly start.
The term “slow start” may be a misnomer because under ideal conditions, the start is not very slow. TCP initializes the congestion window to 1, sends an initial segment, and waits. When the acknowledgement arrives, it increases the congestion window to 2, sends two segments and waits. When the two acknowledgements arrive they each increase the congestion window by 1, so TCP can send 4 segments. Acknowledgements for those will increase the congestion window to 8. Within four round-trip times, TCP can send 16 segments, often enough to reach the receiver's window limit. Even for extremely large windows, it takes only log2(N) round trips before TCP can send N segments.
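The doubling per round trip can be checked with a small sketch. It assumes ideal conditions (no loss, every segment in the window acknowledged within one round trip), and the function name is hypothetical.

```python
def slow_start_round_trips(target_segments: int) -> int:
    """Round trips needed before the congestion window reaches target_segments."""
    cwnd = 1
    round_trips = 0
    while cwnd < target_segments:
        # One acknowledgement arrives per segment sent, and each adds one
        # segment to the window, so the window doubles every round trip.
        cwnd += cwnd
        round_trips += 1
    return round_trips


print(slow_start_round_trips(16))    # 4
print(slow_start_round_trips(1000))  # 10
```

For targets that are not a power of two, the count is log2(N) rounded up.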
To avoid increasing the window size too quickly and causing additional congestion, TCP adds one additional restriction. Once the congestion window reaches one half of its original size before congestion, TCP enters a congestion avoidance phase and slows down the rate of the increment. During congestion avoidance, it increases the congestion window by 1 only if all segments in the window have been acknowledged. This is known as a linear increase phase.
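How the two growth phases fit together after a congestion episode can be sketched as a per-round-trip trace. This is a simplified model rather than a faithful TCP implementation: it assumes no further loss, counts the window in whole segments, stops the doubling exactly at the threshold, and uses a hypothetical function name.

```python
def recovery_trace(window_before_loss: int, receiver_window: int, round_trips: int) -> list:
    """Congestion window after each round trip, starting from one segment."""
    threshold = max(1, window_before_loss // 2)  # half the pre-congestion window
    cwnd = 1
    trace = [cwnd]
    for _ in range(round_trips):
        if cwnd < threshold:
            # Slow-start: one segment per acknowledgement, i.e. doubling per
            # round trip, but never past the threshold in this model.
            cwnd = min(cwnd * 2, threshold)
        else:
            # Congestion avoidance: one segment per fully acknowledged window.
            cwnd += 1
        cwnd = min(cwnd, receiver_window)  # the receiver's window still applies
        trace.append(cwnd)
    return trace


print(recovery_trace(window_before_loss=16, receiver_window=12, round_trips=8))
# [1, 2, 4, 8, 9, 10, 11, 12, 12]
```

The trace shows fast growth up to 8 segments (half of the original 16), then growth of one segment per round trip until the receiver's window of 12 is reached.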
Taken together, the slow-start increase, the linear increase and multiplicative decrease behaviour of congestion avoidance, and exponential timer backoff improve the performance of TCP without adding any significant computational overhead to the protocol software.
It has been recognized for some time, however, that TCP traffic sources tend to synchronize their behavior, producing oscillations in buffer occupancy levels at the bottleneck links of the networks. These oscillations are not desirable, as they are likely to imply greater queuing delays, or more packet loss, for a given level of bandwidth utilization.
Random Early Detection (“RED”) has been proposed to desynchronize TCP sources and hence reduce the impact of these oscillations. RED is discussed in detail in Floyd and Jacobson, Random early detection gateways for congestion avoidance, IEEE/ACM Trans. on Networking, 1(4), 1993. RED attempts to reduce the impact of oscillations by smoothing the binary feedback signals sent by the congested buffer resource. Such binary feedback signals can take the form of packet loss events or Explicit Congestion Notification (ECN) marks. See, for example, Floyd, TCP and explicit congestion notification, ACM Computer Communication Review, 24, pp. 10-23, 1994. Many variations of the initial proposal have been suggested, and some recent work has addressed the delicate issue of how to tune RED parameters to obtain maximal efficiency. See, for example, Firoiu and Borden, A study of active queue management for congestion control, Infocom 2000. A complementary proposal advocates marking packets with ECN rather than dropping them whenever an active buffer management decision is made, which informs the sources to back off while avoiding unnecessary packet transmissions.
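As a rough illustration of what smoothing the feedback means, the sketch below follows the basic shape of RED as described in the cited Floyd and Jacobson paper: the router tracks a moving average of the queue length and marks or drops packets with a probability that rises between two thresholds. It is a simplified sketch, omitting the paper's correction for the number of packets since the last mark, and all parameter values and function names are assumptions chosen for the example.

```python
def update_average(avg: float, queue_len: float, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of the instantaneous queue length."""
    return (1.0 - weight) * avg + weight * queue_len


def mark_probability(avg: float, min_th: float = 5.0, max_th: float = 15.0,
                     max_p: float = 0.1) -> float:
    """Probability of marking or dropping an arriving packet.

    min_th, max_th and max_p are illustrative values, not recommendations.
    """
    if avg < min_th:
        return 0.0   # short average queue: no feedback
    if avg >= max_th:
        return 1.0   # long average queue: every packet is marked or dropped
    # In between, the probability rises linearly from 0 to max_p.
    return max_p * (avg - min_th) / (max_th - min_th)


print(mark_probability(4.0))   # 0.0
print(mark_probability(20.0))  # 1.0
```

Because different sources see marks at different, randomly chosen times, they do not all reduce their windows in the same instant, which is the desynchronizing effect the text refers to.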
From ongoing work on implementations of RED and its variants, it appears that the achieved oscillation reduction in buffer occupancy is not as dramatic as was initially expected. One could reasonably argue that the rules proposed for deciding when to mark or drop packets can be further improved.
Accordingly, this invention arose out of concerns associated with improving the methods and systems that are used to address network congestion issues, particularly in the environment of the Internet.