The present application is related to U.S. application entitled “Low-latency Buffering for Packet Telephony,” which is filed on even date herewith. These two applications are co-pending and commonly assigned.
This invention relates to packet telephony in general and, more particularly, provides a way of reducing latency in packet telephony communications.
Packet telephony involves the use of a packet network, such as the Internet or an “intranet” (modeled in functionality on the Internet and used by companies locally or internally), for telecommunicating voice, pictures, moving images and multimedia (e.g., voice and pictures) content. Instead of a pair of telephones connected by switched telephone lines, however, packet telephony typically involves the use of a “packet phone” or “Internet phone” at one or both ends of the telephony link, with the information transferred over a packet network using packet switching techniques. A “packet phone” or “Internet phone” typically includes a personal computer (PC) running application software for implementing packetized transmission of audio signals over a packet network (such as the Internet); in addition, the PC-based configuration of a packet or Internet phone typically includes additional hardware devices, such as a microphone, speakers and a sound card, which are plugged or incorporated into the PC.
The amount of time it takes for a communication to travel through a communications network is referred to as latency. The amount of latency can impact the quality of the communication; the higher the latency, the lower the quality of the communication. Latency of about 150 milliseconds (ms) or more produces a noticeable effect upon conversations that, for some people, can render a conversation next to impossible. The Plain Old Telephone Service (POTS) network controls latency to an acceptable degree, which is one of the reasons the POTS network is deemed a reliable, high-quality communications service.
However, latency is a significant problem in packet telephony. Latency problems may be caused by factors such as traffic congestion or bottlenecks in the packet network, which can delay delivery of packets to the destination.
One source of latency comes from the data buffers typically used with sound cards employed in PC-based packet telephone applications. These buffers, which are used to accompany the process of converting analog audio signals into digital audio data (and vice-versa), are set to a size determined by the operating system for the PC. Until recently, these buffers were of a fixed size; a recent revision in the Microsoft Windows™ operating system now permits variable-size buffers. However, by virtue of the fact that audio data is not normally clocked out of the buffers until the buffer fills, there is a latency introduced between the time the data enters the buffer and the time at which the data exits the buffer; that is, the audio data will reside in the buffer for a period of time equivalent to the “length” of the buffer. Accordingly, perceptible latency is introduced as a result of this buffering, often making interactive conversations difficult or unnatural (particularly where the buffer size is poorly “tuned” for packet telephony).
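The relationship between buffer length and latency described above can be illustrated with a short calculation. The following is a minimal sketch only: the function name `buffer_latency_ms` and the example buffer size and sampling rates are assumptions chosen for illustration and do not come from the application.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time, in milliseconds, needed to fill a buffer of `buffer_samples`
    samples when samples arrive at `sample_rate_hz` samples per second.

    Assumes (as described in the text) that data is not clocked out of
    the buffer until the buffer is full.
    """
    if buffer_samples < 0 or sample_rate_hz <= 0:
        raise ValueError("buffer size must be >= 0 and sample rate must be > 0")
    return buffer_samples * 1000 / sample_rate_hz


if __name__ == "__main__":
    # Hypothetical 4096-sample buffer at two illustrative sampling rates.
    for rate in (8000, 44100):
        print(rate, "Hz:", round(buffer_latency_ms(4096, rate), 1), "ms")
```

Under these assumed figures, a 4096-sample buffer filled at 8000 samples per second holds data for 512 ms, well above the roughly 150 ms threshold noted earlier, whereas the same buffer filled at 44100 samples per second holds data for about 92.9 ms.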
What is desired is a way of reducing the latency in packet telephony communications caused by buffering accompanying the analog-digital conversion process in sound cards.
The present invention is directed to a method for reducing latency in packet telephony introduced by data buffering in the analog-digital conversion process. In handling speech to be output to the packet network, the analog signal from the microphone is sampled at a sampling rate far exceeding the rate necessary for transmitting telephony-grade voice signals. The increased sampling rate allows the audio data to pass much more rapidly through the data conversion buffer. After passing through the buffer, the data is downsampled to a rate normally used for telephony. To handle audio data input from the packet network for playing over the PC speaker, the data is upsampled to a rate far in excess of the rate necessary for processing telephony-grade voice signals. The increased sampling rate allows the audio data to pass much more rapidly through the data conversion buffer. After passing through the buffer, the data is converted into an analog audio signal for sending to the speaker. In this way, latency due to the buffering that accompanies the process of converting audio signals to digital data, or vice versa, is minimized.
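The rate conversions on either side of the buffer can be sketched as follows. This is an illustration under stated assumptions, not the method of the application: the text does not specify a conversion technique, so the sketch assumes an integer ratio between the high rate and the telephony rate, downsamples by averaging each group of samples, and upsamples by repeating each sample. The function names `downsample` and `upsample` are hypothetical.

```python
from typing import List, Sequence


def downsample(samples: Sequence[float], factor: int) -> List[float]:
    """Reduce the sampling rate by an integer `factor` by averaging each
    group of `factor` consecutive samples (a simple assumed filter).
    Trailing samples that do not fill a complete group are dropped."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


def upsample(samples: Sequence[float], factor: int) -> List[float]:
    """Raise the sampling rate by an integer `factor` by repeating each
    sample `factor` times (sample-and-hold, an assumed choice)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [s for s in samples for _ in range(factor)]


if __name__ == "__main__":
    # Hypothetical rates: 48000 Hz through the buffer, 8000 Hz for telephony.
    factor = 48000 // 8000
    telephony = [10, -20, 30]
    fast = upsample(telephony, factor)   # playback path, before the buffer
    print(len(fast))
    print(downsample(fast, factor))      # capture path, after the buffer
```

A practical implementation would likely use a proper low-pass (anti-aliasing) filter and interpolation rather than plain averaging and repetition; the sketch is only meant to show where each conversion sits relative to the buffer.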