1. Field of the Invention
The present invention is related to the field of communications through networks, and more specifically to devices, software and methods for selectively discarding indicated ones of voice data packets received in a jitter buffer.
2. Description of the Related Art
Networks, such as the internet, are increasingly used for communications. The Internet Protocol (IP) has been developed for communications through the internet.
More recently, networks have also been used for transporting video data and voice data. The latter takes place using Voice over Internet Protocol (VoIP). Voice data packets are generated at a steady rate, and then transmitted through the network. If any are lost, they are not retransmitted, and will not be received by the intended network appliance, which is at the network endpoint. Packets that are not received, or that arrive too late, are not incorporated in the playout by the network appliance.
The voice data packets arrive at the network appliance, and are then stored in a specially allocated portion of its memory, which is called the jitter buffer. They are then played out of the jitter buffer as sound. For playout, the voice data packets are taken in their proper order and at a steady rate.
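The buffering and playout described above can be illustrated with a minimal sketch. It assumes that each packet carries an integer sequence number and that playout is driven by a periodic tick; the class name `JitterBuffer` and its methods are hypothetical and are used only for illustration.

```python
class JitterBuffer:
    """Toy jitter buffer: stores packets and releases them in sequence order."""

    def __init__(self):
        self._packets = {}   # sequence number -> payload
        self._next_seq = 0   # sequence number due at the next playout tick

    def receive(self, seq, payload):
        """Store an arriving packet; return False if it arrived too late."""
        if seq < self._next_seq:
            return False     # its playout slot has already passed
        self._packets[seq] = payload
        return True

    def playout_tick(self):
        """Called once per playout interval; returns a payload, or None for a gap."""
        seq = self._next_seq
        self._next_seq += 1
        return self._packets.pop(seq, None)


buf = JitterBuffer()
for seq in (1, 0, 3):                      # packets arrive out of order, 2 is missing
    buf.receive(seq, f"frame{seq}")
played = [buf.playout_tick() for _ in range(4)]
late_accepted = buf.receive(2, "frame2")   # arrives after its slot was played
```

In this sketch a missing packet simply yields a gap at its playout slot, and a packet that arrives after its slot is rejected, which matches the behavior described above for lost and late packets.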
When the network is congested, there are longer delays between transmission and reception. In addition, the voice data packets tend to arrive in bursts (concentrated groups followed by gaps) instead of at a steady rate. Since playout must happen at a steady rate, the jitter buffer size must be increased when network congestion is detected. When it is increased, there is a longer overall delay in receiving sound from the source, which reduces the quality of service (QoS).
Adaptive dejitter algorithms are being developed for dynamically optimizing the QoS. When these algorithms detect that the network is becoming less congested, they reduce the size of the jitter buffer. This reduces the overall delay, thus improving the QoS.
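One way such an adaptation could work is sketched below. This is not a description of any particular dejitter algorithm. The jitter estimate uses the smoothing formula from the RTP specification (RFC 3550); the mapping from jitter to buffer depth, the function `target_depth`, and its parameters `frame_ms`, `safety`, `min_depth` and `max_depth` are assumptions made for illustration.

```python
import math


def update_jitter(jitter_ms, prev_transit_ms, transit_ms):
    """One step of the RFC 3550 interarrival jitter estimate.

    transit = arrival time minus sender timestamp, for consecutive packets.
    """
    d = abs(transit_ms - prev_transit_ms)
    return jitter_ms + (d - jitter_ms) / 16.0


def target_depth(jitter_ms, frame_ms=20.0, safety=2.0, min_depth=1, max_depth=50):
    """Hypothetical rule: buffer enough packets to cover `safety` times the jitter."""
    depth = math.ceil(safety * jitter_ms / frame_ms)
    return max(min_depth, min(max_depth, depth))


# Congested network: large, varying transit times push the estimate up.
jitter = 0.0
transits = [50, 120, 60, 140, 55, 150]
for prev, cur in zip(transits, transits[1:]):
    jitter = update_jitter(jitter, prev, cur)
depth_congested = target_depth(jitter)

# Congestion eases: transit times become steady, the estimate decays,
# and the target depth shrinks with it.
for _ in range(100):
    jitter = update_jitter(jitter, 50, 50)
depth_relaxed = target_depth(jitter)
```

Under these assumptions the target depth grows while transit times vary widely and shrinks again once they settle, which is the increase and decrease in jitter buffer size described above.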
Reducing the size of the jitter buffer entails discarding voice packets from the jitter buffer. In addition, there is a period of adjustment to the shorter delay. During that short period, the time axis of playout is compressed, meaning that fewer packets are played out than were received over the corresponding interval. This results in noticeable degradation of the voice quality (and thus also of the QoS) during the delay adjustment period.
The degradation takes place because the time axis is compressed. It is made worse because the choice of which voice packets to discard is random. For reconstructing speech, some packets are perceptually more important than others, but their relative importance is not accounted for in the discard decisions of the network appliance. Accordingly, the important packets have the same chance of being discarded as the less important packets. Thus the playout during the adjustment period can have poor quality, even if only a few packets are being discarded.
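The random discard behavior described above can be illustrated with a short sketch. It assumes that the buffered packets are held in a list of `(sequence number, is_important)` pairs; the function `shrink_randomly` and the `is_important` flag are hypothetical and serve only to show that the flag plays no part in the discard decision.

```python
import random


def shrink_randomly(packets, new_depth, rng):
    """Discard randomly chosen packets until only new_depth remain.

    The order of the surviving packets is preserved. The importance flag
    carried by each packet is never consulted.
    """
    excess = max(0, len(packets) - new_depth)
    doomed = set(rng.sample(range(len(packets)), excess))
    kept = [p for i, p in enumerate(packets) if i not in doomed]
    dropped = [p for i, p in enumerate(packets) if i in doomed]
    return kept, dropped


# Ten buffered packets; every third one is marked perceptually important.
buffered = [(seq, seq % 3 == 0) for seq in range(10)]
kept, dropped = shrink_randomly(buffered, new_depth=6, rng=random.Random(0))
important_lost = [seq for seq, important in dropped if important]
```

Because the indices to discard are drawn uniformly, each packet is equally likely to be dropped regardless of its flag, so `important_lost` may well be non-empty even when only a few packets are removed.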