1. Field of the Invention
The invention relates to a system and method for providing multimedia dynamic jitter buffer adjustment over packet-switched networks.
2. Brief Description of the Related Art
Networks carry three types of information: voice, video, and data. Historically, these different forms of information have been transported over different networks. Specifically, the telephone network delivered voice information; private corporate networks delivered data information; and broadcast networks delivered video information. Each service was provided by a specific form of infrastructure—the telephone network used copper wires to reach subscribers, broadcast television used the airwaves, cable television used coaxial cable, and so forth.
With advances in technology, the different forms of information can now be carried by any delivery platform. For example, telephony services (i.e., voice and facsimile) and video services can both be transported over data networks, such as the Internet.
“Internet telephony” refers to the transfer of voice information using the Internet Protocol (IP) of the TCP/IP or UDP/IP protocol suite. Internet telephony uses the Internet to simulate a telephone connection between two Internet users and to bypass the telephone networks of the local exchange carriers and inter-exchange carriers. Internet telephony works by converting voice into data, which can be compressed and split into packets. These data packets are sent over the Internet like any other packets and reassembled as audio output at the receiving end. The ubiquitous nature of the Internet allows a user to complete such Internet telephone connections to many countries around the world. Accordingly, by using the Internet to provide telephony services, the user can avoid paying per-minute toll charges assessed by the user's local exchange carrier and/or inter-exchange carrier. Rather, the user is subject only to his or her local Internet connection fees. The result may be considerable savings when compared to international telephone rates.
In addition, the Internet utilizes “dynamic routing,” wherein data packets are routed using the best routing available for a packet at a particular moment in time, given the current network traffic patterns. This system allows many different communications to be routed simultaneously over the same transmission facilities. In contrast, a circuit-switched telephone network, such as the public switched telephone network (PSTN), establishes dedicated, end-to-end transmission paths. Consequently, the Internet allows network resources to be used more efficiently than circuit-switched networks.
However, the reduced cost and bandwidth savings of voice-over-packet networks come with quality-of-service (QoS) problems, such as latency, packet loss, and jitter. On the Internet, data packets from the same voice conversation can take very different routes from second to second and are likely to arrive at their destination in a different order than originally transmitted, late, or not at all. Further, the data stream is not uniform: data packets carrying the voice conversation can arrive at the destination at irregular intervals, as shown in FIG. 1. If a packet arrives slightly late, the audio device, which is ready to play the next frame of audio, has nothing to play. This causes a short silent period that makes the voice sound choppy or garbled. Such discontinuity degrades the audio quality and decreases the desirability of using the Internet to conduct voice communications.
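The effect of late arrival described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that playback starts when the first frame arrives, that one frame is due every 20 msec, and that no buffering is applied; the function name `late_frames`, the frame interval, and the arrival times are hypothetical and are not taken from FIG. 1.

```python
def late_frames(arrival_ms, frame_ms=20.0):
    """Return the indices of frames that arrive after their playout deadline.

    Assumption: playback starts the moment frame 0 arrives and frame i is
    due exactly i * frame_ms later, with no jitter buffer in between.
    """
    start = arrival_ms[0]
    return [i for i, t in enumerate(arrival_ms) if t > start + i * frame_ms]


# Hypothetical arrival times (msec) for six consecutive 20 msec frames.
arrivals = [0.0, 20.0, 47.0, 60.0, 95.0, 100.0]
print(late_frames(arrivals))  # [2, 4]
```

Frames 2 and 4 miss their deadlines (40 msec and 80 msec), so the audio device would have nothing to play at those instants.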
To compensate for the irregular arrival intervals of data packets (or jitter) inherent in packet-switched networks, a jitter buffer is provided in the receiving device. A conventional jitter buffer has a fixed size, or depth. The jitter buffer stores, or buffers, a number of incoming data packets for a specified amount of time before sending them on in a more constant stream, thereby ensuring that there will be data to play and producing a more even flow of data. However, every frame buffered adds latency, or delay, which is especially relevant to voice calls. For example, if the jitter buffer setting is 50 milliseconds (msec) of data, 50 msec of delay is introduced between the time the words are spoken and the time they are heard. When added to the latency of the Internet itself, this can rapidly become unacceptable. Typically, people can tolerate delays not exceeding 200 msec to 250 msec before the conversation becomes annoying. FIG. 2 shows a table of examples of the one-way delay budget with typical values. Given that many Internet locations are 100 msec or more (one way) apart, adding 50 msec of jitter buffer latency accounts for a significant fraction of the acceptable delay.
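A fixed-depth jitter buffer and its cost in latency can be sketched as follows. This is an illustrative model only, not the implementation of any particular device: it assumes playback is simply postponed by the buffer depth after the first frame arrives, and the function names, the 20 msec frame interval, and the arrival times are hypothetical. The 100 msec network delay, 50 msec buffer setting, and 200 msec tolerance are the example values from the text.

```python
def late_with_fixed_buffer(arrival_ms, depth_ms, frame_ms=20.0):
    """Indices of frames still late when playback is delayed by depth_ms.

    Assumption: frame i is played at (first arrival + depth_ms + i * frame_ms).
    """
    start = arrival_ms[0] + depth_ms
    return [i for i, t in enumerate(arrival_ms) if t > start + i * frame_ms]


def one_way_delay_ms(network_ms, buffer_ms):
    """One-way delay from the network and the jitter buffer only."""
    return network_ms + buffer_ms


# Hypothetical arrival times (msec) for six consecutive 20 msec frames.
arrivals = [0.0, 20.0, 47.0, 60.0, 95.0, 100.0]
print(late_with_fixed_buffer(arrivals, depth_ms=0.0))   # [2, 4]
print(late_with_fixed_buffer(arrivals, depth_ms=10.0))  # [4]
print(late_with_fixed_buffer(arrivals, depth_ms=20.0))  # []

# Example values from the text: 100 msec network delay, 50 msec buffer.
total = one_way_delay_ms(network_ms=100.0, buffer_ms=50.0)
print(total)          # 150.0
print(200.0 - total)  # 50.0 msec left under a 200 msec tolerance
```

In this model a deeper buffer removes the late frames, but each added millisecond of depth is paid for directly in one-way delay, leaving less of the tolerable delay for every other stage.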
A delicate balance must be struck between the need to eliminate jitter and the need to reduce latency. Further, network traffic conditions vary continuously. Accordingly, when network traffic is low, the jitter buffer may be too large, thereby introducing unnecessary latency. However, when network load is high, the jitter buffer may be too small, such that network perturbations, for example, packet loss and jitter, will cause audible distortion in the voice conversation. In addition, for Internet devices having a fixed jitter buffer depth, when the jitter buffer is too large, the unused memory resources in the system are not available to perform other functions.
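The mismatch between a single fixed depth and varying traffic can be made concrete with a small sketch. It computes, for a given trace, the smallest playout delay that would let every frame arrive on time; the two traces, the 20 msec frame interval, and the function name `min_depth_ms` are hypothetical values chosen for illustration, and the calculation is only a way of quantifying the trade-off, not a description of any adjustment method.

```python
def min_depth_ms(arrival_ms, frame_ms=20.0):
    """Smallest fixed buffer depth (msec) for which no frame is late.

    Assumption: frame i is due i * frame_ms after the first arrival plus
    the buffer depth, so the required depth is the worst-case lateness.
    """
    start = arrival_ms[0]
    worst = max(t - (start + i * frame_ms) for i, t in enumerate(arrival_ms))
    return max(0.0, worst)


# Hypothetical traces (msec): lightly loaded versus heavily loaded network.
light = [0.0, 21.0, 40.0, 62.0, 80.0]
heavy = [0.0, 35.0, 48.0, 95.0, 110.0]

print(min_depth_ms(light))  # 2.0
print(min_depth_ms(heavy))  # 35.0

fixed = 50.0  # the example buffer setting from the text
print(fixed - min_depth_ms(light))  # 48.0 msec of avoidable latency
```

Under these assumed traces, a depth sized for the heavy trace adds tens of milliseconds of unneeded delay on the light one, while a depth sized for the light trace would leave frames late on the heavy one.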
Accordingly, it would be desirable to provide a system having jitter buffer adjustment for multimedia applications that addresses the drawbacks of known systems.