In the transmission of speech data via packet-based data networks, the speech data are divided into individual data packets and transmitted over the data network. The runtime of the data packets can assume different values on transmission through the data network, i.e. although the data packets are emitted by a transmitter at regular time intervals, they are not received by a receiver at the same regular time intervals. It is even possible that the data packets do not arrive in the same order in which they were emitted by the transmitter. Thus runtime fluctuations occur in the transmission of data packets through the data network. These runtime fluctuations are known as jitter. The greater the runtime fluctuations, i.e. the greater the jitter, the greater the temporal interval can be between the ideal arrival time of a data packet at the receiver, i.e. the time at which the data packet would arrive in a data network with a fixed runtime, and its actual arrival time.
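By way of illustration, the interval between ideal and actual arrival time described above can be computed as in the following minimal sketch. The function name, the fixed runtime of 50 ms and the sample times are assumptions chosen only for this example.

```python
def arrival_deviation_ms(send_times_ms, arrival_times_ms, fixed_runtime_ms):
    """Return, per packet, actual arrival time minus ideal arrival time.

    The ideal arrival time is the send time plus a fixed network runtime
    (an assumed constant for this illustration).
    """
    return [
        arrival - (sent + fixed_runtime_ms)
        for sent, arrival in zip(send_times_ms, arrival_times_ms)
    ]


# Packets emitted every 10 ms, received at irregular intervals.
send_times = [0, 10, 20, 30]
arrival_times = [50, 63, 70, 95]
deviations = arrival_deviation_ms(send_times, arrival_times, fixed_runtime_ms=50)
print(deviations)  # [0, 3, 0, 15]
```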
To compensate for these runtime fluctuations, a temporary memory in the form of a jitter buffer is usually used, which imposes an additional delay on the data packets. If necessary, the original order of the data packets is also restored.
The procedure in principle is to store received data packets temporarily in the buffer and retransmit the stored data packets from the buffer at regular intervals after an additional retransmission delay. The retransmission delay is intended to guarantee that the data packet to be retransmitted is as far as possible always available in the buffer.
The value of the additional retransmission delay for a data packet depends on when the data packet has actually arrived. The greater the time interval between the arrival of the data packet in the buffer and its scheduled retransmission time, the greater the additional retransmission delay for that data packet. The retransmission delay should correlate with the size of the runtime fluctuations of the data network, i.e. the greater the runtime fluctuations, the greater the retransmission delay must be set to avoid data packets having to be rejected because they arrive after their scheduled retransmission time. The scheduled retransmission time is defined by the fact that the unit to which the data packets are retransmitted, for example a decoder that converts the data packets into speech data, expects the next data packet within a prespecified time interval after processing a data packet. If this data packet is not available in the buffer, the missing data must be replaced, for example by interpolation, and the transmission quality of the speech data is reduced.
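The relationship between the retransmission delay and the number of rejected data packets can be sketched as follows. This is a simplified model under stated assumptions: the function name is hypothetical, the first data packet is assumed to start the retransmission schedule, and the arrival times are invented for the example.

```python
def late_packets(arrivals_ms, packet_interval_ms, retransmission_delay_ms):
    """Return sequence numbers that arrive after their scheduled retransmission time.

    arrivals_ms maps sequence number -> arrival time in ms. The schedule
    starts retransmission_delay_ms after the first packet arrives and then
    advances by packet_interval_ms per sequence number.
    """
    first_seq = min(arrivals_ms)
    start_ms = arrivals_ms[first_seq] + retransmission_delay_ms
    late = []
    for seq in sorted(arrivals_ms):
        scheduled_ms = start_ms + (seq - first_seq) * packet_interval_ms
        if arrivals_ms[seq] > scheduled_ms:
            late.append(seq)
    return late


# Packet 3 is delayed and even arrives after packet 4.
arrivals = {0: 50, 1: 63, 2: 70, 3: 95, 4: 91}
late_short_delay = late_packets(arrivals, packet_interval_ms=10, retransmission_delay_ms=5)
late_long_delay = late_packets(arrivals, packet_interval_ms=10, retransmission_delay_ms=20)
print(late_short_delay)  # [3]
print(late_long_delay)   # []
```

With the short delay, packet 3 misses its retransmission time and would have to be replaced by interpolation; with the longer delay all packets are available, at the cost of 15 ms more delay for every packet.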
In the transmission of speech data via packet-based data networks, the sum of all delays in the data network considerably influences the quality of the speech data transmitted. This is particularly important for data network-based telephony applications, e.g. IP telephony. It has been shown that the quality of two-way conversation over the data network diminishes significantly if the sum of all delays in the data network exceeds 150 ms.
For this reason it is important to keep the retransmission delay that the buffer introduces to compensate for runtime fluctuations as short as possible. On the one hand the retransmission delay of the data packets must be as small as possible; on the other hand it must be large enough that the number of data packets arriving after their scheduled retransmission time, and consequently having to be rejected, does not rise disproportionately.
The runtime fluctuations which typically occur in packet-based data networks normally vary in intensity over time. To avoid having to match the retransmission delay through the buffer to the poorest network conditions, i.e. the greatest runtime fluctuations of the data packets, the retransmission delay can be adapted to the current network conditions. This avoids the quality of the speech connection deteriorating unnecessarily because the retransmission delay selected is too high for most cases. Buffers which can adapt their retransmission delay to the network conditions are also called adaptive jitter buffers.
For example WO 01/37468 A2 discloses a method for adapting the retransmission delay through a buffer, in which the occupation or fill level of the buffer is detected, i.e. the number of data packets stored therein. This takes place via a peak value detector, the value of which reflects the maximum occupation occurring in a specific time period. The read rate from the buffer and hence the occupation of the buffer are set as a function of the value of the peak value detector.
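The principle of a peak value detector for the buffer occupation can be sketched as follows. This is an illustration only and not the method of WO 01/37468 A2 itself: the sliding-window form, the class name `PeakOccupancyDetector` and the window length are assumptions, and how the read rate is derived from the peak value is deliberately left out because the text above does not specify it.

```python
from collections import deque


class PeakOccupancyDetector:
    """Track the maximum buffer occupation over the last `window` samples."""

    def __init__(self, window):
        # A bounded deque silently drops the oldest sample when full.
        self.samples = deque(maxlen=window)

    def update(self, occupancy):
        """Record a new occupation sample and return the current peak value."""
        self.samples.append(occupancy)
        return max(self.samples)


detector = PeakOccupancyDetector(window=3)
peaks = [detector.update(occupancy) for occupancy in [1, 4, 2, 1, 0]]
print(peaks)  # [1, 4, 4, 4, 2]
```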
This procedure, however, offers only limited means of keeping the retransmission delay to a minimum. If for example the data packets each comprise 10 ms of speech data and the buffer is empty, so that a data packet which has just arrived is retransmitted after just one millisecond, the actual hold time of the data packet in the buffer is just one millisecond. The occupation of the buffer in time units is zero before arrival of the data packet, 10 ms immediately after arrival of the data packet, and zero again immediately after retransmission of the data packet. The actual hold time of the data packet in the buffer, which is one millisecond, is thus captured only inadequately by the buffer occupation.
To minimize the additional retransmission delay through the buffer effectively, the hold time must be determined as precisely as possible. The discrepancy between the hold time of the data packet in the buffer and the momentary occupation of the buffer becomes greater if the speech data are compressed, for example using the standardized speech compression G.723, since in this case as much as 30 ms of speech data is transmitted per data packet. For the example outlined above, the occupation of the buffer in time units would fluctuate between 0 and 30 ms, although the actual hold time of the data packet in the buffer is only one millisecond.
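The discrepancy between buffer occupation and hold time in the examples above can be reproduced with the following minimal sketch. The function name and the sample times are assumptions made for illustration; the packet lengths of 10 ms and 30 ms and the hold time of one millisecond are taken from the examples.

```python
def occupancy_trace_ms(arrival_ms, retransmit_ms, payload_ms, sample_times_ms):
    """Buffer occupation in time units for a single packet at given sample times.

    The packet contributes its full payload length while it is in the buffer
    and nothing before arrival or after retransmission.
    """
    return [
        payload_ms if arrival_ms <= t < retransmit_ms else 0
        for t in sample_times_ms
    ]


arrival_ms, retransmit_ms = 100.0, 101.0
hold_time_ms = retransmit_ms - arrival_ms  # actual hold time: 1 ms
samples = [99.5, 100.5, 101.5]  # before arrival, in buffer, after retransmission

trace_10ms = occupancy_trace_ms(arrival_ms, retransmit_ms, 10, samples)
trace_30ms = occupancy_trace_ms(arrival_ms, retransmit_ms, 30, samples)
print(hold_time_ms, trace_10ms, trace_30ms)  # 1.0 [0, 10, 0] [0, 30, 0]
```

In both cases the occupation jumps by the full packet length, while the hold time that actually matters for the retransmission delay remains one millisecond.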
For data networks with very small delays (for example, AAL-based networks), the hold time of data packets in the buffer can, as in the above example, be less than the length of the speech data in a data packet. Here the buffer occupation does not offer the necessary resolution to be able to set the retransmission delay through the buffer optimally. The retransmission delay through the buffer is therefore usually substantially greater than necessary. Furthermore, problems occur on data networks in which it is not ensured that the data packets arrive in the same order as sent by the transmitter. In this case in particular, it cannot be determined how many of the data packets arrive in time, i.e. before their scheduled retransmission time. It can for example occur that the buffer occupation, i.e. the number of data packets in the buffer, is large but the data packet to be retransmitted next has not yet arrived in the buffer. If the retransmission time of this data packet is then reached, the missing data packet must be interpolated, which reduces the transmission quality.
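The situation in which a well-filled buffer nevertheless lacks the data packet due next can be illustrated by the following short sketch. The sequence numbers and the function name are assumptions chosen only for this example.

```python
def next_packet_available(buffered_seqs, next_seq):
    """Return True if the packet due for retransmission is in the buffer."""
    return next_seq in buffered_seqs


# Packets 6 to 9 have overtaken packet 5 in the network.
buffered = {6, 7, 8, 9}
occupancy = len(buffered)
available = next_packet_available(buffered, next_seq=5)
print(occupancy, available)  # 4 False
```

The occupation of four data packets suggests ample reserve, yet packet 5 would have to be interpolated if its retransmission time were reached now.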
U.S. Pat. No. 5,640,388, by contrast, discloses using not the buffer occupation but the hold time of the data packets in the buffer as a basis for setting the retransmission delay. The hold time of the data packets in the buffer is calculated from the difference between the arrival time in the buffer and the retransmission time from the buffer. This however means that the hold time can only be calculated when the data packet concerned has left the buffer again. In this case the analysis of the hold time of the data packet in the buffer is delayed by the actual hold time. The data packets in the buffer contribute to the analysis of the hold time only once they have already been retransmitted. As a result, a reaction to changes in network conditions, e.g. an increase in runtime fluctuations, can only be made after a delay. In IP-based data networks the hold time of data packets is often 200 ms or more. This however means that a multiplicity of data packets is present in the buffer, the runtime fluctuations of which are not taken into account in setting the retransmission delay. It is therefore possible that, because changing network conditions are taken into account too late, the buffer overflows or is fully emptied, i.e. an overrun or underrun of the buffer occurs.
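The delay inherent in measuring the hold time only on retransmission can be made concrete with the following minimal sketch. It assumes, purely for illustration, data packets arriving every 10 ms and a constant hold time of 200 ms; the function name is hypothetical.

```python
def pending_and_newest_measured(arrivals_ms, hold_ms, now_ms):
    """Count packets still in the buffer and find the newest measured packet.

    A packet's hold time is only known once it has been retransmitted,
    i.e. once arrival + hold_ms <= now_ms.
    """
    pending = [a for a in arrivals_ms if a <= now_ms < a + hold_ms]
    measured = [a for a in arrivals_ms if a + hold_ms <= now_ms]
    newest = max(measured) if measured else None
    return len(pending), newest


arrivals = list(range(0, 1000, 10))  # one packet every 10 ms
pending_count, newest_measured_arrival = pending_and_newest_measured(
    arrivals, hold_ms=200, now_ms=500
)
print(pending_count, newest_measured_arrival)  # 20 300
```

At time 500 ms, the most recent hold time available for analysis belongs to a data packet that arrived 200 ms earlier, while 20 data packets already in the buffer do not yet contribute to the analysis.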