Generally, in the field of communication one distinguishes between circuit-switched connections and data unit switched connections. In a data unit switched connection, an amount of data to be sent is divided into data units, and these data units are sent in accordance with a protocol governing the communication. It may be noted that the data units receive different names in the context of different protocols, such as packets, frames, segments, etc. In the present application the term “data unit” is used generically to relate to any such subdivision of data.
In order to ensure the reliable transmission of data, many protocols provide the feature of data unit retransmission. More specifically, data unit retransmission means that a data unit receiver implements a feedback mechanism according to which the receiver sends feedback messages to the data unit sender, where each feedback message contains information on the receipt of data units sent by the data unit sender. The information in a feedback message can be of various kinds, e.g. it can acknowledge the correct receipt of a data unit and/or indicate an error in a received data unit. Examples of such feedback messages are the acknowledgement messages or ACKs known from TCP and other protocols. The data unit sender reacts to these feedback messages by retransmitting one or more of the sent data units, depending on the information in the feedback messages.
A feature that is typically provided in conjunction with a retransmission mechanism is a so-called retransmission time-out. A time-out feature means that the data unit sender retransmits a given data unit if the data unit sender does not receive within a given time-out period a feedback message indicating the correct receipt of said given data unit. This feature ensures that if a data unit is lost, then the lost data unit will automatically be retransmitted after the above mentioned time-out period.
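The time-out check described above can be illustrated by a minimal sketch. The names `OutstandingUnit` and `units_to_retransmit`, the sequence numbers and the time values are hypothetical and chosen only for illustration; they are not taken from any particular protocol.

```python
from dataclasses import dataclass


@dataclass
class OutstandingUnit:
    seq: int             # hypothetical sequence number of the data unit
    sent_at: float       # time of the most recent transmission, in seconds
    acked: bool = False  # True once feedback confirmed correct receipt


def units_to_retransmit(units, now, timeout):
    """Return the sequence numbers of data units whose time-out period
    has expired without feedback indicating correct receipt."""
    return [u.seq for u in units
            if not u.acked and now - u.sent_at >= timeout]


units = [
    OutstandingUnit(seq=1, sent_at=0.0, acked=True),  # already acknowledged
    OutstandingUnit(seq=2, sent_at=0.5),              # waiting for 2.5 s
    OutstandingUnit(seq=3, sent_at=2.5),              # waiting for 0.5 s
]
expired = units_to_retransmit(units, now=3.0, timeout=2.0)
```

In this sketch only the data unit with sequence number 2 is selected for retransmission, because it is the only unacknowledged data unit that has been waiting for at least the time-out period.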
An example of a protocol that provides a retransmission mechanism accompanied by a time-out feature is the so-called Transmission Control Protocol (TCP), which is a part of the well-known TCP/IP protocol suite.
In the communication between a given sender and a given receiver, it is clear that the time-out period should in some way depend on the response time. The response time is indicative of the time that passes between the sending of a data unit and the receipt of a feedback message relating to that data unit. In TCP and some other protocols, this response time is also called the Round Trip Time (RTT).
If the receiver is “distant” (i.e. long response time), then the time-out period should be set longer than for a “close” receiver (i.e. short response time). It is equally understandable that the time-out period should be set as long as necessary and as short as possible. Namely, if the time-out period is too short, then the data unit sender will not wait long enough for the receipt of a feedback message, and will thereby unnecessarily retransmit a given data unit. Such an unnecessary retransmission is also called a spurious retransmission. In other words, a spurious retransmission means that if the sender had waited somewhat longer, it would have received a feedback message and thereby not retransmitted the data unit. On the other hand, if the time-out period is set too long, then this leads to unnecessary delays in the transmission, as the retransmission of lost data units does not occur soon enough.
Methods for properly calculating a retransmission time-out period on the basis of response time measurements have been in discussion for quite some time, e.g. in RfC 889 dating from 1983. In connection with TCP, the presently used way of updating the retransmission time-out period RTO is defined in RfC 2988. According to this RfC, the updating of RTO on the basis of measured values of the response time or roundtrip time RTT is:

$$\Delta = \mathrm{RTT} - \mathrm{SRTT}$$

$$\mathrm{SRTT} \leftarrow \mathrm{SRTT} + \tfrac{1}{8}\cdot\Delta$$

$$\mathrm{RTTVAR} \leftarrow \mathrm{RTTVAR} + \tfrac{1}{4}\cdot\left(|\Delta| - \mathrm{RTTVAR}\right)$$

$$\mathrm{RTO} = \max\left(\mathrm{SRTT} + 4\cdot\mathrm{RTTVAR},\ 1\ \mathrm{sec}\right)$$
SRTT represents a smoothed average of the roundtrip time RTT, and RTTVAR represents an indication of the variance of RTT. As a consequence, the concept of RfC 2988 consists in updating or adapting the retransmission time-out period as a weighted sum of a smoothed average of the response time and the variance of the response time, where a minimum value of 1 second is maintained.
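The update rule described above can be written as a short sketch. The class name `RtoEstimator` and its attribute names are hypothetical. The initialisation on the first measurement (SRTT set to the first RTT, RTTVAR set to half of it) follows RfC 2988 and is not part of the equations quoted above; the clock granularity term of RfC 2988 is omitted, as it is in those equations.

```python
class RtoEstimator:
    """Sketch of the RTO update quoted above (times in seconds)."""

    def __init__(self, first_rtt, min_rto=1.0):
        # Initialisation on the first RTT measurement, per RfC 2988.
        self.srtt = first_rtt
        self.rttvar = first_rtt / 2
        self.min_rto = min_rto
        self.rto = max(self.srtt + 4 * self.rttvar, self.min_rto)

    def update(self, rtt):
        # Delta is computed against the old SRTT, before SRTT is updated.
        delta = rtt - self.srtt
        self.srtt += delta / 8                          # weighting factor 1/8
        self.rttvar += (abs(delta) - self.rttvar) / 4   # weighting factor 1/4
        self.rto = max(self.srtt + 4 * self.rttvar, self.min_rto)
        return self.rto


est = RtoEstimator(first_rtt=2.0)        # SRTT = 2.0, RTTVAR = 1.0, RTO = 6.0
rto_after_second_sample = est.update(2.0)
```

With a second measurement equal to the first, Δ is zero, SRTT stays at 2.0 s, RTTVAR decays from 1.0 s to 0.75 s, and RTO becomes 5.0 s.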
The concept defined in RfC 2988 has a number of flaws. It is understandable that the value of the retransmission time-out period should follow the measured values of the response time. In other words, if RTT increases, then RTO should increase, and if RTT decreases, then RTO should decrease. However, this is not always the case with the above-described method of updating RTO. If a sudden drop in RTT occurs, then the fact that the absolute value of Δ is used in calculating RTTVAR leads to an increase in RTTVAR, which then eventually also leads to a sharp increase in RTO. Therefore, although RTT has decreased, RTO has increased, which leads to unnecessary delay in the sending of data units.
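The effect of a sudden drop in RTT can be checked with a small numerical sketch of the RfC 2988 update. The function name `rto_step` and the starting values (SRTT of 4 s, RTTVAR of 0.25 s, a drop of RTT to 1 s) are assumptions chosen only to make the arithmetic easy to follow.

```python
def rto_step(srtt, rttvar, rtt, min_rto=1.0):
    """Apply one RfC 2988 style update and return (srtt, rttvar, rto)."""
    delta = rtt - srtt
    srtt = srtt + delta / 8
    rttvar = rttvar + (abs(delta) - rttvar) / 4  # |delta| ignores the sign
    rto = max(srtt + 4 * rttvar, min_rto)
    return srtt, rttvar, rto


# Assumed steady state before the drop.
srtt, rttvar = 4.0, 0.25
rto_before = max(srtt + 4 * rttvar, 1.0)          # 5.0 s

# RTT suddenly drops from about 4 s to 1 s.
srtt, rttvar, rto_after = rto_step(srtt, rttvar, rtt=1.0)
```

Under these assumptions Δ is −3 s, SRTT only falls to 3.625 s, while RTTVAR rises to 0.9375 s, so that RTO rises from 5.0 s to 7.375 s although the measured RTT has decreased.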
Another problem with the above-described concept of RfC 2988 lies in the so-called “magic numbers” ⅛ and ¼ used as weighting factors. These factors have been chosen and are optimised to the case when only one measurement RTT is performed per RTT period (i.e. only one RTT is measured at once). However, these factors do not lead to satisfactory results in the updating of RTO when RTT measurements are made for every sent data unit, e.g. by using time stamps, when the number of outstanding data units is large, or if there is no significant variation between consecutive RTT samples. Such a situation may occur when a large queue is maintained in front of a link with limited bandwidth, e.g. a wireless link or a modem link.
It is noted that the term “outstanding data unit” refers to a data unit that was sent, but for which no feedback has yet been received, e.g. no acknowledgment message.
In the above-described situation, the equations proposed by RfC 2988 lead to a situation where SRTT converges to RTT. Δ converges to zero, such that RTTVAR also converges to zero. As a consequence, the value of RTO converges to RTT. This is undesirable, because RTT can be seen as the absolute minimum value for RTO, as one cannot expect to receive a feedback message in a time shorter than RTT. As a consequence, the above-described phenomenon of RTO “collapsing” into RTT leads to a highly increased probability of spurious retransmissions.
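The collapse of RTO into RTT can be reproduced with a short sketch that feeds a constant RTT into the RfC 2988 update. The values used (a constant RTT of 2 s, 50 samples, initial RTTVAR of half the RTT) are assumptions for illustration; an RTT above 1 second is chosen so that the 1 second minimum does not mask the effect.

```python
rtt = 2.0                      # assumed constant measured RTT, in seconds
srtt, rttvar = rtt, rtt / 2    # state after the first measurement
rto_history = []

for _ in range(50):            # 50 further samples with no variation
    delta = rtt - srtt                         # stays at zero
    srtt += delta / 8
    rttvar += (abs(delta) - rttvar) / 4        # shrinks by a factor 3/4
    rto_history.append(max(srtt + 4 * rttvar, 1.0))

final_rto = rto_history[-1]
margin = final_rto - rtt       # remaining safety margin above RTT
```

In this sketch RTTVAR is multiplied by 3/4 with every sample, so the margin between RTO and RTT shrinks geometrically and becomes negligible after a few dozen samples. For an RTT below 1 second, the same equations would instead hold RTO at the 1 second minimum.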
WO 01/13587 A2 proposes methods for an improved updating of RTO. These methods consist, for example, in making the calculation of RTTVAR dependent on a threshold condition for RTT, in making the weighting factors adaptive, and in additionally making the updating of RTO dependent on the number of spurious retransmissions.