When speech data is transmitted over packet-switched data networks, it is divided into individual data packets which are communicated over the data network. The delay to which the data packets are subject on the data network, in other words their transit time, may vary from packet to packet: although the data packets are emitted by a transmitter at regular intervals of time, they are not received by a receiver at these same regular intervals. The data packets may even arrive in a different sequence from that in which they were emitted by the transmitter. Hence, jitter occurs in the delay of the data packets when they are transmitted by the network. This jitter can also be described as fluctuations in transit time. The greater the jitter, i.e. the greater the fluctuations in transit time, the greater may be the difference in time between the actual time of arrival of a data packet at the receiver and its ideal time of arrival, i.e. the time at which the data packet would arrive in a data network whose delay was fixed.
To compensate for such delay jitter, a buffer store termed a jitter buffer is normally used. It applies an additional delay to the data packets to compensate for the delay jitter to which they are subject. Where required, the data packets are also restored to their original sequence.
In principle, received data packets are buffer-stored in the jitter buffer and are played out of the jitter buffer again at regular intervals after an additional play-out delay. The play-out delay is intended to ensure that the data packet which has to be played out at any given time is, as far as possible, always available in the jitter buffer.
The size of the additional play-out delay for a data packet depends on when the data packet actually arrived. The greater the difference in time between the arrival of the data packet at the jitter buffer and the intended play-out time, the greater the additional play-out delay that is needed for the data packet. The play-out delay should correlate with the severity of the delay jitter on the data network: the greater the delay jitter, the greater the play-out delay that must be selected to prevent data packets from having to be discarded because they arrived later than their intended play-out time. The intended play-out time is defined by the unit to which the data packets are played out, for example a decoder for converting the data packets into speech data: having processed one data packet, this unit expects the next data packet for processing at a preset interval of time. If the data packet in question is not available in the jitter buffer, the missing data has to be replaced, for example by interpolation, and the quality with which the speech data is transmitted drops.
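The relationship between play-out delay and discarded packets can be illustrated with a minimal sketch. It assumes a fixed play-out delay, packets of equal duration, and that the first packet has arrived; the function name `late_packets` and the sample arrival times are hypothetical and serve only as an illustration.

```python
def late_packets(arrival_times, packet_duration, playout_delay):
    """Return the sequence numbers of packets that miss their play-out time.

    arrival_times[n] is the arrival time of packet n at the jitter buffer,
    measured by the receiver's local clock. All values share one unit (e.g. ms).
    Assumption: packet 0 is played out `playout_delay` after it arrives, and
    every later packet is due one `packet_duration` after its predecessor.
    """
    first_playout = arrival_times[0] + playout_delay
    late = []
    for n, arrival in enumerate(arrival_times):
        due = first_playout + n * packet_duration  # intended play-out time
        if arrival > due:
            late.append(n)  # arrived too late and would have to be discarded
    return late


# Packets of 20 ms each; packet 3 is badly delayed and overtaken by packet 4.
arrivals = [0, 25, 38, 95, 81, 100]
print(late_packets(arrivals, packet_duration=20, playout_delay=10))  # [3]
print(late_packets(arrivals, packet_duration=20, playout_delay=40))  # []
```

With the shorter play-out delay one packet is lost; the longer delay saves it at the cost of 30 ms of additional end-to-end delay.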
When speech data is transmitted over packet-switched data networks, the sum of all the delays on the data network has a considerable effect on the quality of the speech data that is transmitted. This is particularly important for telephone applications based on data networks, e.g. IP telephony. It has been found that the quality of two-way speech over the data network decreases significantly if the sum of all the delays on the data network exceeds 150 ms.
For this reason, it is important for the play-out delay applied by the jitter buffer to compensate for the delay jitter to be kept as short as possible. The play-out delay applied to the data packets must on the one hand be as short as possible, but on the other hand be long enough not to cause an excessive increase in the number of data packets which arrive later than their intended play-out time and consequently have to be discarded.
The severity of the delay jitter which typically occurs on packet-switched data networks generally varies over time. So that the play-out delay applied by the jitter buffer does not have to be set for the worst-case network conditions, i.e. for the most severe delay jitter affecting the data packets, the play-out delay may be adjusted adaptively to the network conditions. This prevents the quality of the speech communication from being degraded unnecessarily by a play-out delay which is longer than necessary in the majority of cases. Jitter buffers which are able to adjust their play-out delay adaptively to network conditions are called adaptive jitter buffers.
To adjust the play-out delay applied by the jitter buffer, it is known for the jitter in the delay of the data packets to be determined from the variations in the delay of the data packets. The variations in the delay of the data packets are also referred to as packet delay variation.
It is typical of data networks which are subject to delay jitter that the delay suffered by individual data packets across the data network varies. The total delay which a data packet experiences when travelling through the data network, i.e. its total transit time, is generally not known or can be determined only with considerable effort. To establish the delay jitter, it is therefore normally not the total delay of the data packets on the data network that is determined, but the relative difference in delay between two successive data packets. This relative difference in delay is often termed packet-to-packet delay variation. It can be found with comparatively little effort by reference to a local clock at the receiver of the data packets.
In what follows, an example will be described of the calculation of the relative difference in delay between two successive data packets. The relative difference in delay between data packets which belong to a sequence of data packets which are identified by the subscripts $n = 0, 1, 2, \ldots$ will be referred to here as the PPDV (packet-to-packet delay variation). The time of arrival of the data packet having the subscript $n$ will be called $t_n$. This is the time, given by the local clock at the receiver, at which the data packet of subscript $n$ arrived at the jitter buffer. In the same way, the time at which the next data packet arrived is called $t_{n+1}$. If the period of time which corresponds to the data contained in the data packet is called $d_n$, then the formula for the difference in delay between the data packet of subscript $n$ and the data packet of subscript $n+1$ is:
$$\mathrm{PPDV} = (t_n + d_n) - t_{n+1}$$
In the formula, $t_n + d_n$ is the time at which the data packet of subscript $n+1$ ought to arrive if the said data packet of subscript $n+1$ were to encounter exactly the same delay on the data network as was encountered by the data packet of subscript $n$. The mean of the differences in delay which are calculated in this way is referred to as the mean packet-to-packet delay variation and is an approximation of the delay jitter experienced by data packets when travelling through the data network.
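The calculation described above can be sketched as follows for packets that arrive in sequence. The function names and sample values are hypothetical. The text does not state whether signed differences or their magnitudes are averaged; the sketch averages magnitudes, as an assumption, because signed differences tend to cancel each other out.

```python
def ppdv_in_sequence(arrival_times, durations):
    """PPDV between each packet n and its successor n+1.

    arrival_times[n] is the arrival time of packet n (receiver's local clock),
    durations[n] is the period of time covered by the data in packet n.
    """
    return [
        (arrival_times[n] + durations[n]) - arrival_times[n + 1]
        for n in range(len(arrival_times) - 1)
    ]


def mean_abs_ppdv(ppdv_values):
    """Mean of the PPDV magnitudes (averaging magnitudes is an assumption)."""
    return sum(abs(v) for v in ppdv_values) / len(ppdv_values)


# Four 20 ms packets; packet 2 arrives 5 ms late, packet 3 is back on schedule.
t = [0, 20, 45, 60]
d = [20, 20, 20, 20]
values = ppdv_in_sequence(t, d)
print(values)  # [0, -5, 5]
print(mean_abs_ppdv(values))
```

A negative value means the following packet arrived later than expected, a positive value that it arrived earlier.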
The data packets do not have to arrive in this case in the same sequence as that in which they were emitted by the transmitter. Something that is frequently found on data networks is that the data packets arrive out of order. For the above-described method of calculating the difference in delay, it is not essential for the times of arrival $t_n$ of the data packets to be sorted to enable the differences in delay to be calculated. Instead, to save on calculating work, the difference in delay can generally be calculated from:
$$\mathrm{PPDV} = (t_n + d_n) - t_{n+m}$$
In this formula, $m$ can be both positive and negative and is determined in such a way that $t_{n+m}$ is the time of arrival of the data packet which arrives immediately after the data packet of subscript $n$. If the data packets arrive in the same sequence as that in which they are emitted by the transmitter, then $m = 1$.
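A minimal sketch of this arrival-order variant is given below. It processes the packets in the order in which they were received, without sorting them first. The tuple layout and the function name are assumptions made for illustration only.

```python
def ppdv_in_arrival_order(packets):
    """PPDV between each packet and the packet that arrives immediately after it.

    packets is a list of (sequence_number, arrival_time, duration) tuples in
    the order in which the packets were received; no sorting is performed.
    Returns (n, m, ppdv) tuples, where n+m is the sequence number of the
    packet that arrived directly after packet n.
    """
    result = []
    for (n, t_n, d_n), (k, t_k, _) in zip(packets, packets[1:]):
        m = k - n  # positive or negative; 1 if the packets are in order
        result.append((n, m, (t_n + d_n) - t_k))
    return result


# Packet 2 overtakes packet 1 on the network.
received = [(0, 0, 20), (2, 41, 20), (1, 44, 20), (3, 62, 20)]
for n, m, ppdv in ppdv_in_arrival_order(received):
    print(f"n={n} m={m:+d} PPDV={ppdv}")
```

For the sample data the pairs are (n=0, m=+2), (n=2, m=-1) and (n=1, m=+2), with PPDV values of -21, 17 and 2.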
Other methods of calculating the differences in delay between two successive data packets are generally predicated on the presence of additional items of information in the data packets, for example a time marker in the form of a so-called timestamp, which represents the time of transmission or generation of the data packet as given by a clock at the transmitter. The calculation of the differences in delay between two successive data packets on the basis of the timestamps of the data packets and the times of arrival of the data packets at the receiver is defined, for example, in specification RFC 1889. For the calculation of the instantaneous delay jitter, this specification also provides for a mean to be formed of the differences in delay calculated for each pair of successive data packets, the mean being formed by a first-order low-pass filter. The time constant for the low-pass filter is also defined in the specification. By means of this known method, it is possible to obtain a value which represents the mean size of the delay jitter which is occurring instantaneously on the data network.
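The estimator defined in RFC 1889 can be sketched as follows. For two successive packets the difference D is the change in (arrival time minus timestamp), and the running jitter J is updated as J = J + (|D| - J)/16, i.e. a first-order low-pass filter with a gain of 1/16. The sketch assumes that timestamps and arrival times are expressed in the same units; the function name and the sample values are hypothetical.

```python
def rfc1889_jitter(packets):
    """Interarrival jitter estimate in the style of RFC 1889.

    packets is an iterable of (send_timestamp, arrival_time) pairs in order
    of arrival, both in the same units (an assumption of this sketch).
    """
    jitter = 0.0
    prev_transit = None
    for sent, arrived in packets:
        # Relative transit time; the unknown offset between the transmitter's
        # and the receiver's clocks cancels out in the difference below.
        transit = arrived - sent
        if prev_transit is not None:
            d = transit - prev_transit
            jitter += (abs(d) - jitter) / 16.0  # first-order low-pass filter
        prev_transit = transit
    return jitter


# The third packet takes 16 units longer than the first two.
print(rfc1889_jitter([(0, 100), (20, 120), (40, 156)]))  # 1.0
```

Because of the 1/16 gain, a single deviation of 16 units raises the estimate by only one unit, which is why the result represents a mean rather than a peak value.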
However, if the value for the delay jitter on the data network that is determined in this way is used to determine the additional play-out delay applied by the jitter buffer, a problem arises in that the delay jitter is typically not evenly distributed. Instead, spells of increased delay jitter, which are termed jitter bursts, occur at intervals of time of greater or lesser regularity. Allowance has to be made for these jitter bursts when calculating the additional play-out delay applied by the jitter buffer. This may for example be done by multiplying the mean value for the delay jitter which is determined by the above method by a factor which is intended to make allowance for the actual distribution of the delay jitter. However, because this factor is preset at a fixed value, it has to be designed to cater for the worst-case distribution of the delay jitter, which means that what is generally obtained is an additional play-out delay which is appreciably longer than is actually necessary.
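The effect of a fixed factor can be illustrated with a small, purely hypothetical example. It assumes that the play-out delay should cover the largest delay deviation in a trace and that the factor is the ratio of that largest deviation to the mean; both the criterion and the sample values are assumptions made for illustration, not part of any known method.

```python
def mean_jitter(deviations):
    """Mean magnitude of the delay deviations in a trace."""
    return sum(deviations) / len(deviations)


def required_factor(deviations):
    """Factor by which the mean must be multiplied to cover the largest deviation."""
    return max(deviations) / mean_jitter(deviations)


calm = [2, 3, 2, 3, 2, 3]       # evenly distributed jitter (ms)
bursty = [1, 1, 1, 1, 1, 13]    # one jitter burst (ms)

# A preset factor has to cater for the worst-case distribution.
fixed_factor = max(required_factor(calm), required_factor(bursty))

# Applied to the calm trace, the same factor gives far more delay than needed.
calm_delay = fixed_factor * mean_jitter(calm)
print(f"factor needed: calm {required_factor(calm):.2f}, bursty {required_factor(bursty):.2f}")
print(f"delay for calm trace: {calm_delay:.1f} ms, largest deviation: {max(calm)} ms")
```

In this example the calm trace would need a factor of only 1.2, yet the preset factor of about 4.3 gives it a play-out delay of roughly 10.8 ms against a largest deviation of 3 ms.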
A different approach is known from, for example, U.S. Pat. No. 6,259,677 B1, in which the minimum delay of a data packet through the data network is determined. The total delays of the data packets through the data network can be determined on the basis of this minimum delay. From the total delays which have been determined in this way, a metric can be determined in turn for the delay jitter when the data packets are transmitted through the data network. This approach has two problems. First, the minimum delay through the network may change during the transmission of the data packets between the transmitter and receiver. If this happens, the calculation of the total delays, or rather of the differences in the total delays, is affected by an error, because under certain circumstances the minimum delay previously determined can no longer be achieved. Second, even with this method there are certain conditions that the distribution of the delay jitter has to meet, which means that in this case too a safety factor has to be applied to the metric determined for the delay jitter to make allowance for the distribution of the network delay jitter.
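The first problem can be made concrete with a loose sketch of the general idea. It does not reproduce the procedure of the cited patent; it simply assumes that each packet carries a send timestamp and that the delay of a packet is measured relative to the smallest (arrival time minus timestamp) value seen so far. The function name and the sample values are hypothetical.

```python
def delays_above_minimum(packets):
    """Delay of each packet relative to the smallest delay observed so far.

    packets is a list of (send_timestamp, arrival_time) pairs in the same
    units. Using a running minimum is an assumption of this sketch.
    """
    minimum = None
    relative = []
    for sent, arrived in packets:
        delay = arrived - sent  # includes an unknown but constant clock offset
        minimum = delay if minimum is None else min(minimum, delay)
        relative.append(delay - minimum)
    return relative


# After the third packet the base delay of the network rises by 30 units,
# for example because of a route change; the jitter itself stays at 5 units.
trace = [(0, 50), (20, 75), (40, 90), (60, 140), (80, 165), (100, 180)]
print(delays_above_minimum(trace))  # [0, 5, 0, 30, 35, 30]
```

Once the old minimum can no longer be reached, every later value is inflated by the shift in base delay, so a jitter metric derived from these values overstates the actual delay jitter.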