When a data stream such as an audio stream is transmitted over a packet-based network such as the Internet, it is liable to experience some amount of delay due to factors such as packet queuing, routing, and/or loss (which may require retransmission). Furthermore, this network delay is not constant but rather varies over time. This effect is known as jitter. In real-time applications such as audio calls or on-demand streaming, jitter can have an adverse effect on the objective and perceived quality of the media as played out at the receive side. For example, this may manifest as a certain “jerkiness” in the play-out.
To counter this effect, many receiving devices are equipped with a jitter buffer. A jitter buffer works by buffering the incoming stream and introducing an extra, deliberate delay (the de-jittering delay) between the receipt of data into the buffer from the network and the output of that data from the buffer to be played out. The maximum jitter (i.e. maximum variation in network delay) that the jitter buffer can accommodate is equal to the de-jittering delay. As long as the peak deviation in the network delay does not rise above the length of the de-jittering delay, the decoder will always have a supply of data in the jitter buffer to continue decoding and playing out through the receiving device. However, if the deviation in network delay does exceed the length of the de-jittering delay, the decoder will run out of data to decode and a concealment algorithm will have to be invoked until more data is received, which will typically generate unnatural-sounding artefacts. Hence there is an advantage in introducing a deliberate delay in the form of the jitter buffer.
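The relationship between the de-jittering delay and the tolerable delay variation can be illustrated with a minimal sketch. It assumes one packet is sent per fixed frame interval and, purely for illustration, measures delay variation relative to the smallest network delay in the trace, which a real receiver would not know in advance. The function name `late_packets` and its parameters are hypothetical.

```python
def late_packets(network_delays_ms, dejitter_delay_ms, frame_ms=20):
    """Return the indices of packets that arrive after their play-out deadline.

    Assumption: delay variation is measured against the minimum network
    delay in the trace, so a packet is late when its delay exceeds that
    minimum by more than the de-jittering delay.
    """
    base_delay_ms = min(network_delays_ms)
    late = []
    for i, delay_ms in enumerate(network_delays_ms):
        send_time = i * frame_ms                 # packet i is sent at a fixed interval
        arrival_time = send_time + delay_ms      # when it reaches the jitter buffer
        deadline = send_time + base_delay_ms + dejitter_delay_ms  # scheduled play-out
        if arrival_time > deadline:
            late.append(i)                       # decoder would need concealment here
    return late


delays = [50, 60, 55, 95, 50, 70]    # illustrative network delays in milliseconds
print(late_packets(delays, 45))      # peak deviation is 45 ms, so nothing is late
print(late_packets(delays, 40))      # packet 3 deviates by 45 ms and misses its deadline
```

In this sketch a larger de-jittering delay removes late packets at the cost of a proportionally longer play-out delay.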
However, in real-time applications, absolute play-out delay can also have a significant effect on the objective and perceptual quality. For example, in the case of a call, a delay in the audio may leave the receiving user with a sense of unresponsiveness, and the two users may find themselves talking over one another. The delay of the jitter buffer may therefore be designed to strike a balance between audio play-out delay and audio play-out jitter (delay variations). The jitter buffer may also be configured to dynamically adapt the de-jittering delay in dependence on channel conditions experienced by the stream over the network. Hence jitter-buffer design is usually concerned with two main problems: (i) characterization of the impact of play-out delay and play-out jitter on perceptual audio quality and (ii) dynamic estimation and prediction of audio-data transmission jitter and loss in the transmission medium.
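One possible form of dynamic adaptation is sketched below. The text does not specify an estimator; as an assumption, the sketch uses the running interarrival jitter estimate defined for RTP in RFC 3550, which smooths the absolute change in transit time with a gain of 1/16. The mapping from estimated jitter to a target de-jittering delay (`target_delay_ms`, with its multiplier and clamping bounds) is hypothetical and chosen only to show the trade-off between delay and robustness.

```python
def update_jitter(jitter_ms, prev_transit_ms, transit_ms):
    """One step of the RFC 3550 interarrival jitter estimate.

    transit = arrival time minus send timestamp for a packet; only the
    change in transit between consecutive packets matters, so the two
    clocks need not be synchronised.
    """
    d = abs(transit_ms - prev_transit_ms)
    return jitter_ms + (d - jitter_ms) / 16.0


def target_delay_ms(jitter_ms, multiplier=3.0, min_ms=20.0, max_ms=200.0):
    """Hypothetical mapping from estimated jitter to a de-jittering delay.

    A larger multiplier tolerates more delay variation but increases
    play-out delay; the clamp keeps the delay within chosen bounds.
    """
    return max(min_ms, min(max_ms, multiplier * jitter_ms))


# Illustrative transit times in milliseconds for consecutive packets.
transits = [50, 66, 66, 120, 55]
jitter = 0.0
for prev, cur in zip(transits, transits[1:]):
    jitter = update_jitter(jitter, prev, cur)
    print(round(jitter, 3), target_delay_ms(jitter))
```

The estimate rises quickly when transit times fluctuate and decays slowly when they settle, so the target delay in this sketch follows channel conditions rather than remaining fixed.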