The present invention relates to communication of digital audio data. More particularly, the present invention relates to modification of digital audio playback to compensate for timing differences.
Technology currently exists that allows two or more computers to exchange real time audio and video data over a network. This technology can be used, for example, to provide video conferencing between two or more locations connected by the Internet. However, because participants in the conference use different computer systems, the sampling rates for audio input and output may differ.
For example, two computer systems having sampling rates labeled xe2x80x9c8 kHzxe2x80x9d may have slightly different actual sampling rates. Assuming that a first computer has an actual audio input sampling rate of 8.1 kHz and a second computer has an actual audio output rate of 7.9 kHz, the computer system outputting the audio data is falling behind the input computer system at a rate of 200 samples per second. The result can be unnatural gaps in audio output or loss of audio data. Over an extended period of time, audio output may fall behind video output such that the video output has little relation to the audio output.
Another shortcoming of real time network audio is known as xe2x80x9cjitter.xe2x80x9d As network routing paths or packet traffic volume change, as is common with the Internet, a short interruption may be experienced as a result of the time difference required to traverse a first route as compared to a second route. The resulting jitter can be annoying or distracting to a listener of the digital audio received over the network.
What is needed is an audio compensation scheme that compensates for audio timing differences between input and output.
A method and apparatus for digital audio compensation is described. A timing relationship between an audio input and an audio output is determined. A period of silence within an audio segment is identified and the length of the period of silence is adjusted based, at least in part, on the timing relationship between the audio input and the audio output.
In one embodiment, the timing relationship is determined based on a difference between time stamps for a first data packet and a second data packet, and a period of time required to play the first data packet. In one embodiment, audio samples from the period of silence are removed or replicated to shorten or lengthen, respectively, the period of silence to compensate for differences between the audio input and the audio output. Modification of the period of silence can be used to compensate for both differences between input and output rates and for jitter caused by network routing.