The Internet has become an important conduit for the transmission and distribution of data (text, code, image, video, audio, or mixed) and software. Users connect to the backbone with broadly divergent levels of performance, ranging from 14.4 Kb/s to more than 45 Mb/s. Moreover, Transmission Control Protocol/Internet Protocol (TCP/IP) has become a widely implemented standard communication protocol in Internet and Intranet technology, enabling broad heterogeneity between clients, servers, and the communications systems coupling them. Internet Protocol (IP) is the network layer protocol and Transmission Control Protocol (TCP) is the transport layer protocol. At the network level, IP provides a “datagram” delivery service. By contrast, TCP builds a connection-oriented transport-level service over the datagram service to provide guaranteed, sequential delivery of a byte stream between two IP hosts. Application data sent to TCP is broken into segments before being sent to IP. Segments are ordered by sequence numbers.
Reliability in data transmission can be compromised by three events: data corruption, data loss, and reordering of data. Data loss is managed by a time-out mechanism. TCP maintains a timer (the retransmission timer) to measure the delay in receiving an acknowledgment (ACK) from the receiver for a transmitted segment. When an ACK does not arrive within an estimated time interval, the corresponding segment is assumed to be lost and is retransmitted. TCP manages reordering of data, that is, out-of-order arrival of its segments transmitted as IP datagrams, by maintaining a reassembly queue that holds incoming segments until they can be arranged in sequence. Only when the data in this queue is in sequence is it moved to the user's receive buffer, where it can be seen by the user. TCP manages data corruption by verifying a checksum on segments as they arrive at the receiver. The TCP sender computes the checksum over the segment and places this 2-byte value in the TCP header. The checksum is the 16-bit one's complement of the one's complement sum of all 16-bit words in the TCP header and data. The receiver computes the checksum on the received data (excluding the 2-byte checksum field in the TCP header) and verifies that it matches the checksum value in the header. The checksum computation also covers a 12-byte pseudo header that contains information from the IP header (a 4-byte “src ip” address, a 4-byte “dest ip” address, a 2-byte payload length, a 1-byte protocol field, and a 1-byte zero pad).
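The checksum procedure described above can be illustrated with a minimal Python sketch. It is not taken from any particular TCP implementation. The function names (`ones_complement_sum`, `tcp_checksum`, `build_segment`, `verify_segment`), the fixed 20-byte option-free header, and the flag and window values are assumptions made for illustration only. The pseudo header layout follows the conventional IPv4 ordering (source address, destination address, zero byte, protocol number 6, length).

```python
import socket
import struct

TCP_PROTOCOL = 6          # IP protocol number for TCP
CHECKSUM_OFFSET = 16      # byte offset of the checksum field in the TCP header


def ones_complement_sum(data: bytes) -> int:
    """One's complement sum of all 16-bit big-endian words in data."""
    if len(data) % 2:
        data += b"\x00"   # pad odd-length data with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the 12-byte pseudo header plus TCP header and data.

    The checksum field inside `segment` must be zero when this is called.
    """
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,                 # zero pad
        TCP_PROTOCOL,
        len(segment),      # TCP header + data length
    )
    return ~ones_complement_sum(pseudo_header + segment) & 0xFFFF


def build_segment(src_ip, dst_ip, src_port, dst_port, seq, payload: bytes) -> bytes:
    """Hypothetical sender side: build a 20-byte header and fill in the checksum."""
    header = struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port, seq, 0,   # ports, sequence number, ack number
        5 << 4,                       # data offset: 5 words, no options
        0x18,                         # PSH + ACK flags (illustrative)
        65535, 0, 0,                  # window, checksum placeholder, urgent pointer
    )
    checksum = tcp_checksum(src_ip, dst_ip, header + payload)
    return (header[:CHECKSUM_OFFSET] + struct.pack("!H", checksum)
            + header[CHECKSUM_OFFSET + 2:] + payload)


def verify_segment(src_ip, dst_ip, segment: bytes) -> bool:
    """Receiver side: recompute with the checksum field excluded and compare."""
    (received,) = struct.unpack("!H", segment[CHECKSUM_OFFSET:CHECKSUM_OFFSET + 2])
    zeroed = segment[:CHECKSUM_OFFSET] + b"\x00\x00" + segment[CHECKSUM_OFFSET + 2:]
    return tcp_checksum(src_ip, dst_ip, zeroed) == received
```

For example, a segment built with `build_segment("10.0.0.1", "10.0.0.2", 1234, 80, 1, b"hello")` passes `verify_segment` with the same addresses, while the same segment with a single flipped payload bit does not.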
TCP was initially designed to work in networks with low link error rates, i.e., segment losses were assumed to be due mostly to network congestion. As a result, the sender decreases its transmission rate each time a segment loss is detected. However, when a reordered packet is erroneously determined to be a packet lost to network congestion, the associated decrease in transmission rate from congestion control causes unnecessary throughput degradation.
Packet reordering is a common occurrence in TCP networks, given the prevalence of parallel links and other causes of reordering. For instance, on Ether-channel, where a number of real adapters are aggregated to form a logical adapter, packet reordering is commonly caused when packets are sent in parallel over these multiple adapters. In TCP, any data packets following one that has been lost or reordered are queued at the receiver until the missing packet arrives. The receiver then acknowledges all the packets together. While the receiver is waiting for the lost packet to be retransmitted, no more data is sent. When packets are merely reordered, however, TCP sessions will still invoke Fast Retransmit and Recovery, because TCP wrongly infers that network congestion has caused a packet loss. When reordering inadvertently triggers Fast Retransmit and Recovery, or notification through the Selective Acknowledgment Option (SACK), the congestion window is cut in half, and when a time-out occurs, the congestion window is set to one segment size, forcing slow start. Because these mechanisms automatically reduce the congestion window, such packet reordering inadvertently causes drastic degradation in network performance. It would be desirable to avoid such unnecessary throughput degradation due to packet reordering.
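The effect described above can be illustrated with a small Python simulation. It is a simplified sketch, not a model of any real TCP stack. It assumes whole-segment numbering instead of byte sequence numbers, a congestion window counted in segments, and the conventional threshold of three duplicate ACKs for triggering Fast Retransmit. That threshold is not stated in the text above, and the names `cumulative_acks`, `sender_cwnd_after`, and `DUP_ACK_THRESHOLD` are hypothetical.

```python
DUP_ACK_THRESHOLD = 3   # assumption: conventional duplicate-ACK threshold


def cumulative_acks(arrival_order):
    """Receiver: for each arriving segment, ACK the next segment expected in order.

    Out-of-order segments are held (as in a reassembly queue) and produce
    duplicate ACKs until the missing segment arrives.
    """
    received = set()
    next_expected = 0
    acks = []
    for seg in arrival_order:
        received.add(seg)
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def sender_cwnd_after(acks, cwnd):
    """Sender: halve cwnd whenever the duplicate-ACK threshold is reached.

    Returns (final cwnd in segments, number of fast retransmits triggered).
    Simplification: the first ACK seen is always treated as a new ACK.
    """
    last_ack = None
    duplicates = 0
    fast_retransmits = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == DUP_ACK_THRESHOLD:
                cwnd = max(cwnd // 2, 1)   # window cut in half
                fast_retransmits += 1
        else:
            last_ack = ack
            duplicates = 0
    return cwnd, fast_retransmits


if __name__ == "__main__":
    in_order = [0, 1, 2, 3, 4, 5]
    reordered = [1, 2, 3, 4, 0, 5]   # segment 0 is delayed, not lost
    print(sender_cwnd_after(cumulative_acks(in_order), cwnd=10))
    print(sender_cwnd_after(cumulative_acks(reordered), cwnd=10))
```

In this sketch no segment is lost in either run, yet the reordered arrival produces enough duplicate ACKs to halve the window from 10 segments to 5, while the in-order arrival leaves it unchanged.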