The cellular telephone industry has made phenomenal strides in commercial operations in the United States as well as the rest of the world. Demand for cellular services in major metropolitan areas is outstripping current system capacity. Assuming this trend continues, cellular telecommunications will reach even the smallest rural markets. Consequently, cellular capacity must be increased while maintaining high-quality service at a reasonable cost. One important step towards increasing capacity is the conversion of cellular systems from analog to digital transmission. This conversion is also important because the first generation of personal communication networks (PCNs), employing low-cost, pocket-size, cordless telephones that can be easily carried and used to make or receive calls in the home, office, street, car, etc., will likely be provided by cellular carriers using the next-generation digital cellular infrastructure.
Digital communication systems take advantage of powerful digital signal processing (DSP) techniques. Digital signal processing refers generally to mathematical and other manipulation of digitized signals. For example, after an analog signal is converted (digitized) into digital form, the digital signal may be filtered, amplified, or attenuated using simple mathematical routines in a digital signal processor (also abbreviated DSP). Typically, DSPs are manufactured as high-speed integrated circuits so that data processing operations can be performed essentially in real time. DSPs may also be used to reduce the bit transmission rate of digitized speech, which translates into reduced spectral occupancy of the transmitted radio signals and increased system capacity. For example, if speech signals are digitized using 14-bit linear Pulse Code Modulation (PCM) and sampled at an 8 kHz rate, a serial bit rate of 112 kbits/sec is produced. Moreover, by taking mathematical advantage of redundancies and other predictable characteristics of human speech, voice coding techniques can compress the serial bit rate from 112 kbits/sec to 7.95 kbits/sec, a reduction of approximately 14:1 in bit transmission rate. Reduced transmission rates translate into more available bandwidth.
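By way of illustration only, the bit rate arithmetic in the preceding paragraph can be checked with a few lines of Python. The three numeric values are taken from the text; the function and variable names are hypothetical and chosen only for this sketch.

```python
# Bit-rate arithmetic for the PCM example given in the text.
BITS_PER_SAMPLE = 14      # 14-bit linear PCM (from the text)
SAMPLE_RATE_HZ = 8_000    # 8 kHz sampling rate (from the text)
CODED_RATE_BPS = 7_950    # 7.95 kbits/sec after voice coding (from the text)


def pcm_bit_rate(bits_per_sample: int, sample_rate_hz: int) -> int:
    """Serial bit rate of uncompressed linear PCM, in bits per second."""
    return bits_per_sample * sample_rate_hz


pcm_rate = pcm_bit_rate(BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
compression_ratio = pcm_rate / CODED_RATE_BPS

print(f"PCM rate: {pcm_rate / 1000:.0f} kbits/sec")
print(f"Compression ratio: {compression_ratio:.1f}:1")
```

The ratio works out to slightly more than 14, which is consistent with the 14:1 figure quoted above.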
One popular speech compression technique, adopted in the United States by the Telecommunications Industry Association (TIA) as the digital standard for the second generation of cellular telephone systems (i.e., IS-54), is vector sum excited linear predictive coding (VSELP). Unfortunately, when audio signals including speech mixed with high levels of ambient noise (particularly "colored noise") are coded/compressed using VSELP, undesirable audio signal characteristics result. For example, if a digital mobile telephone is used in a noisy environment (e.g., inside a moving automobile), both ambient noise and desired speech are compressed using the VSELP encoding algorithm and transmitted to a base station, where the compressed signal is decoded and reconstituted into audible speech. When the background noise is reconstituted into an analog format, an undesirable, audible "swirling" is produced, which sounds to the listener like a strong wind blowing in the background of the speaker. These "swirling" sounds, more technically termed modulated interference, are particularly irritating to the average listener.
In theory, various signal processing algorithms could be implemented using digital signal processors to filter the VSELP encoded background noise. This solution, however, requires significant digital signal processing overhead, measured in terms of millions of instructions executed per second (MIPS), which consumes valuable processing time, memory space, and power. Each of these resources is limited in portable radiotelephones. Hence, simply increasing the processing burden of the DSP is not an optimal solution for minimizing VSELP encoded background noise. What is needed is an adaptive noise reduction system that reduces the undesirable contributions of encoded background ambient noise while minimizing any increased drain on digital signal processor resources.
The present invention provides a method and system for adaptively reducing noise in audio signals which does not significantly increase signal processing overhead and therefore has particularly advantageous application to digital portable radiotelephones. Frames of digitized audio signals including both speech and background noise are processed in a digital signal processor to determine what attenuation (if any) should be applied to a current frame of digitized audio signals. Initially, it is determined whether the current frame of digitized audio signals includes speech information, this determination being based upon an estimate of noise and on a speech threshold value. An attenuation value determined for the previous audio frame is modified based on this determination and applied to the current frame in order to minimize the background noise, which improves the quality of received speech. The attenuation applied to the audio frames is modified gradually on a frame-by-frame basis, and each sample in a specific frame is attenuated using the attenuation value calculated for that frame.
The energy of the current frame is determined by summing the square of the amplitude of each sample in that frame. When the frame energy exceeds the sum of a noise estimate (the running average of the frame energy over the last several frames) and the speech threshold value, it is determined that speech is present in the current frame. Regardless of whether speech is detected, a variable attenuation is applied to each sample in the current frame based on the current noise estimate. Particularly desirable results are obtained when the variable attenuation factor is determined based upon a logarithmic ratio of the noise estimate to a minimum noise threshold below which no attenuation is applied.
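By way of illustration only, the frame energy calculation, speech decision, and variable attenuation described above might be sketched in Python as follows. The text does not give numeric values or an exact attenuation formula, so every constant below, the window length, the decibel scaling of the logarithmic ratio, and all function and class names are assumptions made for this sketch.

```python
import math
from collections import deque
from typing import List, Sequence

# Hypothetical constants; the text does not specify any of these values.
SPEECH_THRESHOLD = 1.0e6      # energy margin above the noise estimate
MIN_NOISE_THRESHOLD = 1.0e4   # at or below this noise level, no attenuation
DB_PER_DECADE = 3.0           # attenuation per tenfold rise in noise (assumed)
MAX_ATTENUATION_DB = 6.0      # assumed cap on the variable attenuation


def frame_energy(frame: Sequence[float]) -> float:
    """Sum of the squared amplitude of each sample in the frame."""
    return float(sum(s * s for s in frame))


class NoiseEstimator:
    """Running average of frame energy over the last `window` frames."""

    def __init__(self, window: int = 8) -> None:
        self._energies = deque(maxlen=window)

    def update(self, energy: float) -> float:
        self._energies.append(energy)
        return self.value

    @property
    def value(self) -> float:
        if not self._energies:
            return 0.0
        return sum(self._energies) / len(self._energies)


def speech_present(energy: float, noise_estimate: float) -> bool:
    """Speech is declared when energy exceeds noise estimate plus threshold."""
    return energy > noise_estimate + SPEECH_THRESHOLD


def variable_attenuation_db(noise_estimate: float) -> float:
    """Attenuation from the log ratio of noise estimate to minimum threshold."""
    if noise_estimate <= MIN_NOISE_THRESHOLD:
        return 0.0
    att = DB_PER_DECADE * math.log10(noise_estimate / MIN_NOISE_THRESHOLD)
    return min(att, MAX_ATTENUATION_DB)


def attenuate(frame: Sequence[float], attenuation_db: float) -> List[float]:
    """Apply the same attenuation to every sample in the frame."""
    gain = 10.0 ** (-attenuation_db / 20.0)
    return [s * gain for s in frame]
```

In this sketch a caller would compute `frame_energy` for each frame, test `speech_present` against the estimator's current value, update the estimator, and then call `attenuate` with the result of `variable_attenuation_db`. Whether the noise estimate should continue to be updated while speech is present is not stated in the text and is left to the caller here.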
In addition to the variable attenuation determined for and applied to each frame, a second, no-speech attenuation value is calculated and further gradually applied to each frame in which speech is not detected. Like the variable attenuation value, the no-speech attenuation value may also be determined based on a logarithmic function. This ensures that the background noise detected between speech samples is maximally attenuated.
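By way of illustration only, the gradual, frame-by-frame application of the no-speech attenuation might be sketched in Python as shown below. The text does not specify the logarithmic function, the step size, or the ceiling. Stepping the attenuation in decibels (a logarithmic scale) and ramping it back down when speech returns are assumptions of this sketch, as are all names and constants.

```python
from typing import List

# Hypothetical constants; the text does not specify any of these values.
NO_SPEECH_STEP_DB = 1.5   # change in no-speech attenuation per frame
NO_SPEECH_MAX_DB = 12.0   # ceiling on the no-speech attenuation


def next_no_speech_db(previous_db: float, speech_detected: bool) -> float:
    """Move the no-speech attenuation one step, based on the previous frame.

    The value ramps up while no speech is detected and ramps back toward
    zero when speech is detected, so it never jumps between frames.
    """
    if speech_detected:
        return max(previous_db - NO_SPEECH_STEP_DB, 0.0)
    return min(previous_db + NO_SPEECH_STEP_DB, NO_SPEECH_MAX_DB)


def total_gain(variable_db: float, no_speech_db: float) -> float:
    """Linear gain applied to each sample when both attenuations are combined."""
    return 10.0 ** (-(variable_db + no_speech_db) / 20.0)


# Three frames without speech followed by two frames with speech.
speech_flags = [False, False, False, True, True]
history: List[float] = []
level_db = 0.0
for flag in speech_flags:
    level_db = next_no_speech_db(level_db, flag)
    history.append(level_db)

print(history)
```

Because each frame starts from the attenuation of the frame before it, the listener hears the background fade in and out rather than switch abruptly at speech boundaries.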
The adaptive noise reduction system according to the present invention may be advantageously applied to telecommunication systems in which portable/mobile radio transceivers communicate over RF channels with each other and with fixed telephone line subscribers. Each transceiver includes an antenna, a receiver for converting radio signals received over an RF channel via the antenna into analog audio signals, and a transmitter. The transmitter includes a coder-decoder (codec) for digitizing analog audio signals to be transmitted into frames of digitized speech information, the speech information including both speech and background noise. A digital signal processor processes a current frame based on an estimate of the background noise and the detection of speech in the current frame to minimize background noise. A modulator modulates an RF carrier with the processed frame of digitized speech information for subsequent transmission via the antenna.