I. Field of the Invention
The present invention pertains generally to the field of wireless communications, and more specifically to providing an efficient method and apparatus for reducing voice latency associated with a voice-over-data wireless communication system.
II. Background
The field of wireless communications has many applications including cordless telephones, paging, wireless local loops, and satellite communication systems. A particularly important application is cellular telephone systems for mobile subscribers. (As used herein, the term "cellular" systems encompasses both cellular and PCS frequencies.) Various over-the-air interfaces have been developed for such cellular telephone systems including frequency division multiple access (FDMA), time division multiple access (TDMA), and code division multiple access (CDMA). In connection therewith, various domestic and international standards have been established including Advanced Mobile Phone Service (AMPS), Global System for Mobile Communications (GSM), and Interim Standard 95 (IS-95). In particular, IS-95 and its derivatives, such as IS-95A, IS-95B (often referred to collectively as IS-95), ANSI J-STD-008, IS-99, IS-657, IS-707, and others, are promulgated by the Telecommunications Industry Association (TIA) and other well known standards bodies.
Cellular telephone systems configured in accordance with the use of the IS-95 standard employ CDMA signal processing techniques to provide highly efficient and robust cellular telephone service. An exemplary cellular telephone system configured substantially in accordance with the use of the IS-95 standard is described in U.S. Pat. No. 5,103,459 entitled "System and Method for Generating Signal Waveforms in a CDMA Cellular Telephone System", which is assigned to the assignee of the present invention and incorporated herein by reference. The aforesaid patent illustrates transmit, or forward-link, signal processing in a CDMA base station. Exemplary receive, or reverse-link, signal processing in a CDMA base station is described in U.S. application Ser. No. 08/987,172, filed Dec. 9, 1997, entitled MULTICHANNEL DEMODULATOR, which is assigned to the assignee of the present invention and incorporated herein by reference. In CDMA systems, over-the-air power control is a vital issue. An exemplary method of power control in a CDMA system is described in U.S. Pat. No. 5,056,109 entitled "Method and Apparatus for Controlling Transmission Power in A CDMA Cellular Mobile Telephone System", which is assigned to the assignee of the present invention and incorporated herein by reference.
A primary benefit of using a CDMA over-the-air interface is that communications are conducted simultaneously over the same RF band. For example, each mobile subscriber unit (typically a cellular telephone) in a given cellular telephone system can communicate with the same base station by transmitting a reverse-link signal over the same 1.25 MHz of RF spectrum. Similarly, each base station in such a system can communicate with mobile units by transmitting a forward-link signal over another 1.25 MHz of RF spectrum.
Transmitting signals over the same RF spectrum provides various benefits including an increase in the frequency reuse of a cellular telephone system and the ability to conduct soft handoff between two or more base stations. Increased frequency reuse allows a greater number of calls to be conducted over a given amount of spectrum. Soft handoff is a robust method of transitioning a mobile unit between the coverage areas of two or more base stations that involves simultaneously interfacing with two or more base stations. (In contrast, hard handoff involves terminating the interface with a first base station before establishing the interface with a second base station.) An exemplary method of performing soft handoff is described in U.S. Pat. No. 5,267,261 entitled "Mobile Station Assisted Soft Handoff in a CDMA Cellular Communications System", which is assigned to the assignee of the present invention and incorporated herein by reference.
Under Interim Standards IS-99 and IS-657 (referred to hereinafter collectively as IS-707), an IS-95-compliant communications system can provide both voice and data communications services. Data communications services allow digital data to be exchanged between a transmitter and one or more receivers over a wireless interface. Examples of the type of digital data typically transmitted using the IS-707 standard include computer files and electronic mail.
In accordance with both the IS-95 and IS-707 standards, the data exchanged between a transmitter and a receiver is processed in discrete packets, otherwise known as data packets or data frames, or simply frames. To increase the likelihood that a frame will be successfully transmitted during a data transmission, IS-707 employs a radio link protocol (RLP) to track the frames transmitted successfully and to perform frame retransmission when a frame is not transmitted successfully. Retransmission is performed up to three times in IS-707, and it is the responsibility of higher layer protocols to take additional steps to ensure that frames are successfully received.
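The bounded retransmission behavior described above can be illustrated with a short sketch. It is a simplification and not the actual RLP message flow: the `channel` callable, the function name, and the return convention are hypothetical, and only the limit of three retransmissions is taken from the text.

```python
from typing import Callable, Tuple

# Per the text, IS-707 retransmits a frame up to three times.
MAX_RETRANSMISSIONS = 3


def send_with_retransmission(
    frame: bytes, channel: Callable[[bytes], bool]
) -> Tuple[bool, int]:
    """Send one frame, retrying until success or the retry limit is reached.

    `channel` is a hypothetical stand-in for the air link; it returns True
    when the frame is delivered. Returns (delivered, total_attempts).
    """
    attempts = 0
    # One original transmission plus up to MAX_RETRANSMISSIONS retries.
    for _ in range(1 + MAX_RETRANSMISSIONS):
        attempts += 1
        if channel(frame):
            return True, attempts
    # Retry budget exhausted; recovery is left to higher layer protocols.
    return False, attempts
```

Each additional attempt adds delay to the frame in question and to every frame behind it, which is the source of the latency discussed in the following paragraphs.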
Recently, a need has arisen for transmitting audio information, such as voice, using the data protocols of IS-707. For example, in a wireless communications system employing cryptographic techniques, audio information may be more easily manipulated and distributed among data networks using a data protocol. In such applications, it is desirable to maintain the use of existing data protocols so that no changes to existing infrastructure are necessary. However, problems occur when transmitting voice using a data protocol, due to the nature of voice characteristics.
One of the primary problems of transmitting audio information using a data protocol is the delay associated with frame retransmissions using an over-the-air data protocol such as RLP. Delays of more than a few hundred milliseconds in speech can result in unacceptable voice quality. When transmitting data, such as computer files, time delays are easily tolerated due to the non-real-time nature of data. As a consequence, the protocols of IS-707 can afford to use the frame retransmission scheme described above, which may result in transmission delays, or a latency period, of more than a few seconds. Such a latency period is unacceptable for transmitting voice information.
What is needed is a method and apparatus for minimizing the problems caused by the time delays associated with frame retransmission requests from a receiver. Furthermore, the method and apparatus should be backwards-compatible with existing infrastructure to avoid expensive upgrades to those systems.
The present invention is a method and apparatus for reducing voice latency, otherwise known as communication channel latency, associated with a voice-over-data wireless communication system. Generally, this is achieved by dropping data frames at a transmitter, a receiver, or both, without degrading perceptible voice quality.
In a first embodiment of the present invention, in a voice-over-data communication system, data frames are dropped in a transmitter at a fixed, predetermined rate prior to storage in a queue. Audio information, such as voice, is transformed into data frames by a voice-encoder, or vocoder, at a fixed rate, in the exemplary embodiment every 20 milliseconds. The data frames are stored in a queue for use by further processing elements. A processor located within the transmitter prevents data frames from being stored in the queue at a fixed, predetermined rate. This is known as frame dropping. As a result of fewer data frames being stored in the queue, fewer data frames representing the audio information are transmitted to the receiver, thereby alleviating the problem of communication channel latency between transmitter and receiver due to poor communication channel quality.
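By way of illustration only, fixed-rate frame dropping ahead of the transmit queue can be sketched as follows. The function name and the `drop_every` parameter are hypothetical; the text specifies only that the rate is fixed and predetermined, not its value.

```python
from collections import deque
from typing import Deque, Iterable, TypeVar

Frame = TypeVar("Frame")


def enqueue_with_fixed_drop(
    frames: Iterable[Frame], drop_every: int
) -> Deque[Frame]:
    """Store vocoder frames in a queue, dropping every `drop_every`-th frame.

    `drop_every` is an assumed way of expressing the fixed, predetermined
    drop rate (for example, 5 means one frame in five is never queued).
    """
    if drop_every < 2:
        raise ValueError("drop_every must be at least 2")
    queue: Deque[Frame] = deque()
    for index, frame in enumerate(frames, start=1):
        if index % drop_every == 0:
            continue  # frame dropped before it reaches the queue
        queue.append(frame)
    return queue
```

With ten frames numbered 0 through 9 and `drop_every=5`, the fifth and tenth frames (numbers 4 and 9) are withheld and eight frames are queued for transmission.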
At the receiver, data frames are received, demodulated, and placed into a queue for use by a voice decoder. Data frames are withdrawn from the queue by the voice decoder at the same fixed rate as they were generated at the transmitter, i.e., every 20 milliseconds in the exemplary embodiment. Occasionally, the size of the queue will vary dramatically due to poor communication channel quality. Under such circumstances, frame retransmissions from the transmitter to the receiver occur, causing an overall increase in the number of data frames ultimately used by the voice decoder. The increased size of the queue causes subsequent frames added to the queue to be delayed from reaching the voice decoder, resulting in increased communication channel latency. The present invention reduces this latency by transmitting fewer data frames to represent the audio information. Thus, during periods of poor communication channel quality, the size of the receive queue is held to a reasonable size, preventing an unreasonable amount of communication channel latency.
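The relationship between receive queue size and latency can be made concrete with a small simulation. This is a sketch under simplifying assumptions: time advances in 20 millisecond ticks, the decoder withdraws at most one frame per tick, and the arrival pattern is invented for illustration.

```python
from typing import List, Sequence

FRAME_MS = 20  # decoder withdraws one frame every 20 ms (exemplary embodiment)


def receive_queue_depths(arrivals_per_tick: Sequence[int]) -> List[int]:
    """Return the queue depth after each 20 ms tick.

    On each tick the frames that arrived are added to the queue, then the
    voice decoder withdraws one frame if any is available.
    """
    depth = 0
    depths = []
    for arrivals in arrivals_per_tick:
        depth += arrivals
        if depth > 0:
            depth -= 1  # decoder consumes one frame
        depths.append(depth)
    return depths


# Two ticks with no arrivals (frames lost on the channel), followed by a tick
# in which the two retransmitted frames arrive together with the current one.
depths = receive_queue_depths([1, 1, 0, 0, 3, 1, 1])
added_latency_ms = depths[-1] * FRAME_MS
```

In this example the backlog of two frames created by the retransmissions never drains, because frames arrive and are withdrawn at the same rate afterward. Each later frame therefore waits an extra 40 milliseconds unless some frames are dropped.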
In a second embodiment of the present invention, data frames are dropped at a transmitter at either one of two rates, depending on the communication channel latency which relates to the quality of the communication channel. A first rate is used if the communication channel latency is within reasonable limits, i.e., little or no perceptible voice latency. A second, higher rate is used when it is determined that the communication channel latency is sufficiently noticeable. In this embodiment, as in the first embodiment, audio information is transformed into data frames by a voice-encoder, or vocoder, at a fixed rate, in the exemplary embodiment every 20 milliseconds. Under normal channel conditions, where the communication channel latency is within an acceptable range, data frames are dropped at a first, fixed rate. Data frames are dropped at a second, higher rate if a processor determines that the communication channel latency has increased significantly. This embodiment reduces the communication channel latency quickly during bursty channel error conditions where latency can increase rapidly.
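The selection between the two drop rates can be sketched as below. The threshold of 200 milliseconds and the two intervals (one frame in 100, one frame in 10) are hypothetical values chosen for illustration; the text does not specify them.

```python
def drop_interval(
    latency_ms: float,
    threshold_ms: float = 200.0,   # assumed latency threshold
    normal_interval: int = 100,    # assumed first rate: drop 1 frame in 100
    elevated_interval: int = 10,   # assumed second, higher rate: 1 in 10
) -> int:
    """Return N such that one frame in every N is dropped."""
    if latency_ms > threshold_ms:
        return elevated_interval
    return normal_interval


def should_drop(frame_index: int, latency_ms: float) -> bool:
    """Decide whether the frame with this 1-based index is dropped."""
    return frame_index % drop_interval(latency_ms) == 0
```

A smaller interval corresponds to a higher drop rate, so the elevated interval is applied only while the measured latency exceeds the threshold.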
In a third embodiment of the present invention, communication channel latency is reduced by dropping data frames at the transmitter at a variable rate, depending on the communication channel latency. In this embodiment, a processor located within the transmitter determines the communication channel latency using one of several possible techniques. If the processor determines that the communication channel latency has changed, frames are dropped at a rate proportional to the level of communication channel latency. As latency increases, the frame dropping rate increases. As latency decreases, the frame dropping rate decreases. As in the first two embodiments, communication channel latency increases when the communication channel quality decreases. This is due primarily to increased frame re-transmissions which occur as the communication channel quality decreases.
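A drop rate proportional to latency can be sketched as follows. The gain, the cap on the drop fraction, and the credit-accumulator used to spread drops evenly are all assumptions made for this example; the text states only that the rate rises and falls with latency.

```python
def drop_fraction(
    latency_ms: float,
    gain_per_ms: float = 0.0005,  # assumed proportionality constant
    max_fraction: float = 0.5,    # assumed cap so speech is never fully dropped
) -> float:
    """Fraction of frames to drop, proportional to latency and capped."""
    return max(0.0, min(max_fraction, gain_per_ms * latency_ms))


class ProportionalDropper:
    """Drops frames at the fraction returned by drop_fraction()."""

    def __init__(self) -> None:
        self._credit = 0.0

    def should_drop(self, latency_ms: float) -> bool:
        # Accumulate fractional "drop credit" per frame; drop one frame each
        # time a whole unit of credit has built up.
        self._credit += drop_fraction(latency_ms)
        if self._credit >= 1.0:
            self._credit -= 1.0
            return True
        return False
```

The processor would call `should_drop` once per vocoder frame with its current latency estimate, so the drop rate follows latency up and down without any discrete rate switch.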
In a fourth embodiment, data frames are dropped in accordance with the rate at which the data frames were encoded by a voice-encoder. In this embodiment, a variable-rate vocoder is used to encode audio information into data frames at varying data rates, in the exemplary embodiment, four rates: full rate, half rate, quarter rate, and eighth rate. A processor located within the transmitter determines the communication channel latency using one of several possible techniques. If the processor determines that the communication channel latency has increased beyond a predetermined threshold, eighth-rate frames are dropped as they are produced by the vocoder. If the processor determines that the communication channel latency has increased beyond a second predetermined threshold, both eighth-rate and quarter-rate frames are dropped as they are produced by the vocoder. Similarly, half-rate and full-rate frames are dropped as the communication channel latency continues to increase.
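The threshold scheme of this embodiment can be sketched as follows. The four threshold values are hypothetical, as are the names; only the ordering (eighth rate first, then quarter, half, and full rate as latency grows) comes from the text.

```python
from enum import IntEnum


class Rate(IntEnum):
    """Vocoder frame rates, ordered from first dropped to last dropped."""
    EIGHTH = 1
    QUARTER = 2
    HALF = 3
    FULL = 4


# Assumed latency thresholds in milliseconds, one per rate.
THRESHOLDS_MS = (200, 400, 600, 800)


def highest_dropped_rate(latency_ms: float) -> int:
    """Return the highest Rate value being dropped, or 0 if none."""
    return sum(1 for threshold in THRESHOLDS_MS if latency_ms > threshold)


def should_drop(rate: Rate, latency_ms: float) -> bool:
    """A frame is dropped if its rate is at or below the current drop level."""
    return rate <= highest_dropped_rate(latency_ms)
```

Dropping the lowest-rate frames first is a natural ordering because, in a variable-rate vocoder, those frames typically carry background noise or silence rather than active speech.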
In a fifth embodiment of the present invention, data frames are dropped at the receiver either alone, or in combination with frame dropping at a transmitter. The fifth embodiment can be implemented using any of the above embodiments. For example, data frames can be dropped using a single, fixed rate, two fixed rates, or a variable rate, and can further incorporate the fourth embodiment, where frames are dropped in accordance with the rate at which the data frames have been encoded by the vocoder residing at the transmitter.
In a sixth embodiment, frame dropping is performed at the receiver. Receiver frame dropping is usually performed based on a queue length compared to a queue threshold. In the sixth embodiment, the queue threshold is dynamically adjusted to maintain a constant level of voice quality.
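Receiver dropping against a dynamically adjusted queue threshold can be sketched as below. The text does not state how the threshold is adjusted, so the rule used here is purely an assumption: the threshold is raised when the recent drop fraction exceeds a target and lowered when it falls below it. All names and default values are hypothetical.

```python
from collections import deque


class ReceiverDropper:
    """Drops incoming frames when the receive queue reaches a threshold."""

    def __init__(self, threshold=5, target_drop_fraction=0.05,
                 min_threshold=2, max_threshold=20):
        self.queue = deque()
        self.threshold = threshold
        self.target_drop_fraction = target_drop_fraction  # assumed quality proxy
        self.min_threshold = min_threshold
        self.max_threshold = max_threshold
        self._received = 0
        self._dropped = 0

    def on_frame(self, frame) -> bool:
        """Queue the frame, or drop it if the queue is at the threshold."""
        self._received += 1
        if len(self.queue) >= self.threshold:
            self._dropped += 1
            return False
        self.queue.append(frame)
        return True

    def adjust_threshold(self) -> None:
        """Assumed rule: steer the recent drop fraction toward the target."""
        if self._received == 0:
            return
        fraction = self._dropped / self._received
        if fraction > self.target_drop_fraction:
            # Too many drops: tolerate a longer queue (more latency).
            self.threshold = min(self.max_threshold, self.threshold + 1)
        elif fraction < self.target_drop_fraction:
            # Few drops: shorten the queue to reduce latency.
            self.threshold = max(self.min_threshold, self.threshold - 1)
        self._received = 0
        self._dropped = 0
```

In use, `on_frame` would be called for each demodulated frame and `adjust_threshold` periodically, for example once per second, trading a small amount of latency against the number of dropped frames.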