A cellular telephone system comprises three essential elements: a cellular switching system that serves as the gateway to the landline (wired) telephone network; a number of base stations, under the switching system's control, containing equipment that translates between the signals used in the wired telephone network and the radio signals used for wireless communications; and a number of mobile telephone units that translate between the radio signals used to communicate with the base stations and the audible acoustic signals used to communicate with human users (e.g., speech and music).
Communication between a base station and a mobile telephone is possible only if both the base station and the mobile telephone use identical radio modulation schemes, data-encoding conventions, and control strategies, i.e., both units must conform to an air-interface specification. A number of standards have been established for air interfaces in the United States. Until recently, all cellular telephony in the United States operated according to the Advanced Mobile Phone Service (AMPS) standard. This standard specifies analog signal encoding using frequency modulation in the 800 MHz region of the radio spectrum. Under this scheme, each cellular telephone conversation is assigned a communications channel consisting of two 30 kHz segments of this region for the duration of the call. In order to avoid interference between conversations, no two conversations may occupy the same channel simultaneously within the same geographic area. Since the portion of the radio spectrum allocated to cellular telephony is finite, this restriction places a limit on the number of simultaneous users of a cellular telephone system.
In order to increase the capacity of the system, a number of alternatives to the AMPS standard have been introduced. One of these is the Interim Standard-54 (IS-54), issued by the Electronic Industries Association and the Telecommunications Industry Association. This standard makes use of digital signal encoding and modulation using a time division multiple access (TDMA) scheme. Under the TDMA scheme, each 30 kHz segment is shared by three simultaneous conversations, and each conversation is permitted to use the channel one-third of the time. Time is divided into 20 ms frames, and each frame is further subdivided into three time slots. Each conversation is allotted one time slot per frame.
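The slot arithmetic described above can be illustrated with a minimal sketch. The frame length and slot count come from the description; the function name `slot_window` and the zero-based numbering of conversations and frames are assumptions made only for illustration and are not taken from IS-54.

```python
from fractions import Fraction

FRAME_MS = Fraction(20)   # frame duration stated in the description
SLOTS_PER_FRAME = 3       # three conversations share each 30 kHz segment


def slot_window(conversation: int, frame_index: int) -> tuple[Fraction, Fraction]:
    """Return (start_ms, end_ms) of the slot a conversation owns in a frame.

    Hypothetical helper: conversations and frames are numbered from zero.
    Exact fractions are used because 20 ms / 3 is not a finite decimal.
    """
    if not 0 <= conversation < SLOTS_PER_FRAME:
        raise ValueError("conversation must be 0, 1, or 2")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    slot_ms = FRAME_MS / SLOTS_PER_FRAME
    start = frame_index * FRAME_MS + conversation * slot_ms
    return start, start + slot_ms


if __name__ == "__main__":
    for conv in range(SLOTS_PER_FRAME):
        start, end = slot_window(conv, frame_index=0)
        print(f"conversation {conv}: {float(start):.2f} ms to {float(end):.2f} ms")
```

Under these assumptions each conversation transmits for one-third of every 20 ms frame, so a 30 kHz segment that carries one AMPS conversation carries three TDMA conversations.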
To permit all of the information describing 20 ms of conversation to be conveyed in a single time slot, speech and other audio signals are processed using a digital speech compression method known as Vector Sum Excited Linear Prediction (VSELP). Each IS-54 compliant base station and mobile telephone unit contains a VSELP encoder and decoder. Instead of transmitting a digital representation of the audio waveform over the channel, the VSELP encoder makes use of a model of human speech production to reduce the digitized audio signal to a set of parameters that represent the state of the speech production mechanism during the frame (e.g., the pitch and the vocal tract configuration). These parameters are encoded as a digital bit-stream, and are then transmitted over the channel to the receiver at approximately 8 kilobits per second (kb/s). This is a much lower bit rate than would be required to encode the actual audio waveform. The VSELP decoder at the receiver then uses these parameters to re-create an estimate of the digitized audio waveform. The transmitted digital speech data is organized into digital information frames of 20 ms, each representing 160 audio samples. Each speech frame is encoded in 159 bits. The VSELP method is described in detail in the document, TR45 Full-Rate Speech Codec Compatibility Standard PN-2972, 1990, published by the Electronic Industries Association, which is fully incorporated herein by reference (hereinafter referred to as "VSELP Standard").
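The bit-rate figures above can be checked with a short calculation. The frame length, samples per frame, and bits per frame come from the description. The waveform-coding baseline of 8 bits per sample (as in companded telephone PCM) is an assumption introduced here only for comparison; the description does not state what waveform encoding it has in mind.

```python
FRAME_MS = 20             # frame duration, from the description
SAMPLES_PER_FRAME = 160   # audio samples represented by one frame
BITS_PER_FRAME = 159      # encoded speech bits per frame

# Assumption for comparison only: 8-bit companded PCM waveform coding.
ASSUMED_PCM_BITS_PER_SAMPLE = 8


def per_second(count_per_frame: int) -> int:
    """Scale a per-frame count to a per-second rate."""
    return count_per_frame * 1000 // FRAME_MS


sample_rate_hz = per_second(SAMPLES_PER_FRAME)           # 8000 samples/s
vselp_bps = per_second(BITS_PER_FRAME)                   # 7950 bit/s
pcm_bps = sample_rate_hz * ASSUMED_PCM_BITS_PER_SAMPLE   # 64000 bit/s

if __name__ == "__main__":
    print(f"sampling rate:   {sample_rate_hz} Hz")
    print(f"VSELP bit rate:  {vselp_bps} bit/s")
    print(f"assumed PCM:     {pcm_bps} bit/s")
    print(f"reduction ratio: {pcm_bps / vselp_bps:.2f}x")
```

With these numbers, 159 bits every 20 ms is 7950 bit/s, which is the roughly 8 kb/s rate quoted above, and the assumed waveform baseline is about eight times higher.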
VSELP significantly reduces the number of bits required to transmit audio information over the communications channel. However, it achieves this reduction by relying heavily on a model of speech production. Consequently, it renders non-speech sounds poorly. For example, the interior of a moving automobile is an inherently noisy environment. The automobile's own sounds combine with external noises to create an acoustic background noise level much higher than is typically encountered in non-mobile environments. This situation forces VSELP to attempt to encode non-speech information much of the time, as well as combinations of speech and background noise.
Two problems arise when VSELP is used to encode speech in the presence of background noise. First, the background noise sounds unnatural whether or not there is speech present, and second, the speech is distorted in a characteristic way. Individually and collectively these problems are commonly referred to as "swirl".
While it would be possible to eliminate these artifacts introduced by the encoding/decoding process by replacing the VSELP algorithm with another speech compression algorithm that does not suffer from the same deficiencies, this strategy would require changing the IS-54 Air Interface Specification. Such a change is undesirable because of the considerable investment in existing equipment on the part of cellular telephone service providers, manufacturers, and subscribers. For example, in one prior art technique, the speech encoder detects when no speech is present and encodes a special frame to be transmitted to the receiver. This special frame contains comfort noise parameters, which instruct the speech decoder to generate comfort noise similar to the background noise on the transmit side. These special frames are transmitted periodically by the transmitter during periods of non-speech. This proposed solution to the swirl problem requires a change to the current VSELP speech algorithm because it introduces special encoded frames to indicate when comfort noise is to be generated. Because it must be implemented at both the transmit and receive sides of the communication channel, it also requires a change to the current air interface specification. It is therefore an undesirable solution.
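The prior art technique described above can be pictured with a rough sketch of the transmit side. Everything specific in it is hypothetical: the energy threshold used to detect speech, the interval between special frames, the `Frame` type, and the choice to send nothing between special frames are illustrative assumptions, not details taken from the prior art or from the VSELP Standard.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

SPEECH_ENERGY_THRESHOLD = 1000.0  # hypothetical speech-detection threshold
COMFORT_NOISE_INTERVAL = 4        # hypothetical: one special frame per 4 non-speech frames


@dataclass
class Frame:
    kind: str       # "speech" or "comfort_noise"
    payload: float  # stand-in for the encoded parameters (here, mean energy)


def mean_energy(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def encode_stream(frames: Sequence[Sequence[float]]) -> List[Optional[Frame]]:
    """Emit a speech frame when speech is detected; otherwise emit a special
    comfort-noise frame periodically and nothing (None) in between."""
    out: List[Optional[Frame]] = []
    non_speech_run = 0
    for samples in frames:
        energy = mean_energy(samples)
        if energy >= SPEECH_ENERGY_THRESHOLD:
            non_speech_run = 0
            out.append(Frame("speech", energy))
            continue
        if non_speech_run % COMFORT_NOISE_INTERVAL == 0:
            out.append(Frame("comfort_noise", energy))
        else:
            out.append(None)
        non_speech_run += 1
    return out


if __name__ == "__main__":
    loud, quiet = [100.0] * 160, [1.0] * 160
    for frame in encode_stream([loud] + [quiet] * 5):
        print(frame)
```

The sketch makes the objection concrete: the `comfort_noise` frame kind has no counterpart in the existing frame format, so a receiver would also have to be modified to recognize it.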