The present invention relates to data processing by digital computer, and more particularly to TCP congestion control.
A typical TCP (Transmission Control Protocol) connection links two endpoints in a computer network so that applications running on the two endpoints can communicate. For convenience, communication over a TCP connection will sometimes be described as one-way communication between a transmitting host and a receiving host. It should be understood, however, that a TCP connection between two endpoints supports two-way communication, and that the communication is between applications running on the two endpoints.
A transmitting host sends data to the receiving host using the TCP connection, and the receiving host returns acknowledgements (ACKs) to the transmitting host, acknowledging receipt of the data. The data sent from the transmitting host to the receiving host over the TCP connection is buffered in a send buffer associated with the TCP connection. The data in the send buffer is then packaged into TCP segments, and the TCP segments are sent to the receiving host. A variety of mechanisms exist to trigger a transmission of a TCP segment from the transmitting host to the receiving host. For an overview of these mechanisms, see RFC 793, Transmission Control Protocol. For convenience, a time when the transmitting host sends a TCP segment to the receiving host will be referred to as a “transmit time.”
If the transmitting host detects that a TCP segment sent to the receiving host has been corrupted or lost, the transmitting host resends that TCP segment to the receiving host. The transmitting host can use a variety of mechanisms for detecting loss or corruption of a sent TCP segment. In one implementation, the transmitting host determines that a TCP segment sent to the receiving host has been corrupted or lost if the receiving host does not acknowledge receipt of that TCP segment within a timeout interval (referred to as a “retransmission timeout”). The retransmission timeout is sometimes defined as a function of the average time it takes the transmitting host to send a TCP segment to the receiving host and receive an acknowledgement for that TCP segment from the receiving host (referred to as “round-trip time”). For an overview of other mechanisms for detecting loss or corruption of a TCP segment, see RFC 2581, TCP Congestion Control. 
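By way of illustration only, the following sketch shows one way a retransmission timeout could be derived from measured round-trip times. The class name `RtoEstimator`, its methods, and the choice of seconds as the unit are hypothetical. The smoothing gains, the factor of 4, and the one-second floor are assumptions borrowed from RFC 6298, which is not cited in this description; other implementations may use a different function.

```python
class RtoEstimator:
    """Tracks a smoothed round-trip time and derives a retransmission timeout.

    Constants follow RFC 6298 (an assumption; the text above only says the
    timeout is a function of the average round-trip time).
    """

    ALPHA = 1 / 8   # gain applied to the smoothed round-trip time
    BETA = 1 / 4    # gain applied to the round-trip time variation
    MIN_RTO = 1.0   # lower bound on the timeout, in seconds

    def __init__(self):
        self.srtt = None     # smoothed round-trip time, in seconds
        self.rttvar = None   # round-trip time variation, in seconds

    def update(self, sample):
        """Fold one round-trip time sample (seconds) into the estimate."""
        if self.srtt is None:
            # First measurement: seed both state variables from the sample.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated before the smoothed value.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.rto()

    def rto(self):
        """Current retransmission timeout in seconds."""
        if self.srtt is None:
            return self.MIN_RTO  # no measurement yet
        return max(self.MIN_RTO, self.srtt + 4 * self.rttvar)


estimator = RtoEstimator()
for rtt in (0.5, 0.5, 0.7):
    print(round(estimator.update(rtt), 4))
```

In this sketch, a segment that remains unacknowledged for longer than `estimator.rto()` seconds would be treated as lost or corrupted and resent.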
When sending data to the receiving host on a TCP connection, the transmitting host estimates the maximum amount of data that can be sent at a time on that TCP connection without exceeding the capacity of the receiving host. For that purpose, a mechanism referred to as “flow control” is sometimes used. In the flow control mechanism, the receiving host regularly informs the transmitting host of how much new data the receiving host is capable of receiving on a particular TCP connection. In response, the transmitting host adjusts the rate at which it sends data to the receiving host on that TCP connection so as not to overrun the capacity of the receiving host.
In addition, when sending data on a TCP connection, the transmitting host estimates the maximum amount of data that can be sent at a time on that TCP connection without creating excessive congestion (and consequently delays) in the network. For that purpose, a mechanism referred to as “congestion control” is sometimes used. In the congestion control mechanism, the transmitting host uses a “congestion window” for a TCP connection to control the rate at which it sends data to the receiving host on that TCP connection. A congestion window for a TCP connection is defined by a state variable, which is used to limit the number of bytes of in-transit data (i.e., data sent by the transmitting host but not acknowledged by the receiving host) on that TCP connection. This state variable is often referred to as the size of the congestion window. The maximum number of bytes that a transmitting host can send on a TCP connection at any given time is equal to the difference between the size of the transmitting host's congestion window for that TCP connection and the number of bytes already in transit on that TCP connection at the given time.
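The relationship between the congestion window and the in-transit data can be sketched as follows. The function name `sendable_bytes` and its parameters are hypothetical, and clamping the result at zero (for the case where the window has shrunk below the amount already in transit) is an assumption of this sketch.

```python
def sendable_bytes(cwnd_bytes: int, in_transit_bytes: int) -> int:
    """Return how many more bytes may be sent right now.

    cwnd_bytes:       size of the congestion window for the connection
    in_transit_bytes: bytes sent but not yet acknowledged
    """
    # The difference is clamped at zero (an assumption) so that a window
    # smaller than the in-transit data never yields a negative allowance.
    return max(0, cwnd_bytes - in_transit_bytes)


print(sendable_bytes(8000, 5000))  # 3000
print(sendable_bytes(4000, 5000))  # 0
```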
The transmitting host generally increases the size of its congestion window for a particular TCP connection if the transmitting host receives an ACK for an in-transit TCP segment on that TCP connection in a timely manner (e.g., within the retransmission timeout interval). However, if the transmitting host detects loss of data on a TCP connection (e.g., if an ACK for an in-transit TCP segment is not received within the retransmission timeout interval), the transmitting host interprets the loss as a sign of network congestion and generally decreases the size of its congestion window for that TCP connection.
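A minimal sketch of this increase-and-decrease behavior is given below. The class `CongestionState` and its method names are hypothetical. The amounts are assumptions chosen for illustration, since the description above does not fix them: the window grows by one maximum segment size per timely ACK and is halved, but never reduced below one maximum segment size, on a timeout.

```python
from dataclasses import dataclass


@dataclass
class CongestionState:
    mss: int    # maximum segment size, in bytes
    cwnd: int   # congestion window size, in bytes

    def on_timely_ack(self) -> None:
        # Assumed policy: grow the window by one segment per timely ACK.
        self.cwnd += self.mss

    def on_timeout(self) -> None:
        # Assumed policy: treat the timeout as congestion and halve the
        # window, keeping it large enough to send at least one segment.
        self.cwnd = max(self.mss, self.cwnd // 2)


state = CongestionState(mss=1000, cwnd=4000)
state.on_timely_ack()
print(state.cwnd)  # 5000
state.on_timeout()
print(state.cwnd)  # 2500
```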
When the transmitting host starts sending data on a new (or previously idle) TCP connection, the transmitting host typically uses a mechanism referred to as “slow-start” to control the size of its congestion window for that TCP connection. In the slow-start mechanism, the transmitting host initially sets the size of the congestion window for the TCP connection to one “maximum segment size” (MSS), i.e., the maximum number of data bytes carried in a TCP segment. The transmitting host then increases the size of the congestion window by MSS bytes each time an ACK is received for an in-transit TCP segment on that TCP connection within the retransmission timeout interval. Once the transmitting host detects congestion, however, the transmitting host decreases the size of the congestion window and repeats the process.
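The growth of the congestion window during slow-start can be illustrated with the following sketch. It assumes an idealized connection in which the transmitting host sends a full window of MSS-sized segments in each round trip and every segment is acknowledged in time. The function names and the example values (an MSS of 1460 bytes and a target window of 65535 bytes) are hypothetical.

```python
def slow_start_windows(mss: int, rounds: int) -> list:
    """Congestion window size at the start of each round trip."""
    cwnd = mss  # the window starts at one maximum segment size
    sizes = []
    for _ in range(rounds):
        sizes.append(cwnd)
        segments_acked = cwnd // mss   # one ACK per segment sent this round
        cwnd += segments_acked * mss   # MSS bytes added per ACK
    return sizes


def round_trips_to_reach(target_bytes: int, mss: int) -> int:
    """Round trips needed before the window reaches target_bytes."""
    cwnd = mss
    rounds = 0
    while cwnd < target_bytes:
        cwnd += (cwnd // mss) * mss
        rounds += 1
    return rounds


print(slow_start_windows(1460, 4))        # [1460, 2920, 5840, 11680]
print(round_trips_to_reach(65535, 1460))  # 6
```

Under these assumptions the window doubles every round trip, so a connection whose path could carry 65535 bytes at a time would spend six round trips sending less than that amount.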
As a result, the slow-start mechanism prevents the transmitting host from swamping the network with too much additional traffic on a new (or previously idle) TCP connection. However, the slow-start mechanism often forces the transmitting host to operate below the capacity of the network when the transmitting host begins sending data on such a connection.