This invention relates to communication system noise cancellation techniques, and more particularly relates to weighting calculations used in such techniques.
The need for speech quality enhancement in single-channel speech communication systems has grown in importance, especially because of the tremendous growth in cellular telephony. Cellular telephones are often operated in the presence of high levels of environmental background noise, such as in moving vehicles. Such high levels of noise cause significant degradation of the speech quality at the far end receiver. In such circumstances, speech enhancement techniques may be employed to improve the quality of the received speech so as to increase customer satisfaction and encourage longer talk times.
Most noise suppression systems utilize some variation of spectral subtraction. FIG. 1A shows an example of a typical prior noise suppression system that uses spectral subtraction. A spectral decomposition of the input noisy speech-containing signal is first performed using the Filter Bank. The Filter Bank may be a bank of bandpass filters (such as in reference [1], which is identified at the end of the description of the preferred embodiments). The Filter Bank decomposes the signal into separate frequency bands. For each band, power measurements are performed and continuously updated over time in the Noisy Signal Power and Noise Power Estimation block. These power measures are used to determine the signal-to-noise ratio (SNR) in each band. The Voice Activity Detector is used to distinguish periods of speech activity from periods of silence. The noise power in each band is updated primarily during silence while the noisy signal power is tracked at all times. For each frequency band, a gain (attenuation) factor is computed based on the SNR of the band and is used to attenuate the signal in the band. Thus, each frequency band of the noisy input speech signal is attenuated based on its SNR.
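The power tracking and per-band gain computation described above can be illustrated with a minimal Python sketch. The description does not give the smoothing rule or the gain formula, so the exponential smoothing constant `alpha`, the power-subtraction gain rule, and the `floor` value are assumptions chosen for illustration only. For simplicity the noise estimate is frozen whenever speech is detected, whereas the description says only that it is updated primarily during silence.

```python
import math

def track_powers(noisy_power, noise_power, frame_power, speech_active, alpha=0.9):
    """Update per-band power estimates with exponential smoothing (assumed rule)."""
    # The noisy signal power is tracked at all times.
    new_noisy = [alpha * p + (1.0 - alpha) * f
                 for p, f in zip(noisy_power, frame_power)]
    if speech_active:
        # Simplification: freeze the noise estimate while speech is present.
        new_noise = list(noise_power)
    else:
        new_noise = [alpha * p + (1.0 - alpha) * f
                     for p, f in zip(noise_power, frame_power)]
    return new_noisy, new_noise

def band_gains(noisy_power, noise_power, floor=0.1):
    """Compute one attenuation factor per band from that band's SNR."""
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        if p_noise <= 0.0:
            gains.append(1.0)  # no measured noise: pass the band unchanged
            continue
        # Estimated speech-to-noise ratio, clamped at zero.
        snr = max(p_noisy / p_noise - 1.0, 0.0)
        # Assumed power-subtraction gain; low-SNR bands get smaller gains.
        gain = math.sqrt(snr / (1.0 + snr))
        gains.append(max(gain, floor))
    return gains
```

Each band signal would then be multiplied by its gain before the bands are recombined. The gain floor limits the maximum attenuation in any band, which is a common design choice rather than something stated in the description.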
FIG. 1B illustrates another, more sophisticated prior approach that uses an overall SNR level in addition to the individual SNR values to compute the gain factors for each band. (See also reference [2].) The overall SNR is estimated in the Overall SNR Estimation block. The gain factor computations for each band are performed in the Gain Computation block. The attenuation of the signals in different bands is accomplished by multiplying the signal in each band by the corresponding gain factor in the Gain Multiplication block. Low SNR bands are attenuated more than high SNR bands. The amount of attenuation is also greater if the overall SNR is low. After the attenuation process, the signals in the different bands are recombined into a single, clean output signal. The resulting output signal will have an improved overall perceived quality.
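One way the overall SNR might enter the gain computation is sketched below in Python. The description does not state the actual formula of reference [2]; the Wiener-style per-band ratio, the exponent that depends on the overall SNR, and the `floor` parameter are all hypothetical. The sketch preserves only the two stated properties: low SNR bands are attenuated more, and attenuation deepens when the overall SNR is low.

```python
def overall_snr(noisy_power, noise_power):
    """Estimate one SNR across all bands from the summed band powers."""
    total_noise = sum(noise_power)
    total_speech = max(sum(noisy_power) - total_noise, 0.0)
    if total_noise <= 0.0:
        return float("inf")
    return total_speech / total_noise

def gains_with_overall_snr(noisy_power, noise_power, floor=0.05):
    """Per-band gains that also depend on the overall SNR (assumed rule)."""
    snr_all = overall_snr(noisy_power, noise_power)
    # Exponent runs from 1 (very clean input) toward 2 (very noisy input),
    # so the same band is attenuated more when the overall SNR is low.
    exponent = 1.0 + 1.0 / (1.0 + snr_all)
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        speech = max(p_noisy - p_noise, 0.0)
        ratio = speech / p_noisy if p_noisy > 0.0 else 0.0  # value in [0, 1]
        gains.append(max(ratio ** exponent, floor))
    return gains
```

Because the per-band ratio lies between 0 and 1, raising it to a larger exponent always lowers the gain, which is what ties the depth of attenuation to the overall SNR in this sketch.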
The decomposition of the input noisy speech-containing signal can also be performed using Fourier transform techniques or wavelet transform techniques. FIG. 2 shows the use of discrete Fourier transform techniques (shown as the Windowing and FFT block). Here a block of input samples is transformed to the frequency domain. The magnitudes of the complex frequency domain elements are attenuated based on the spectral subtraction principles described earlier. The phases of the complex frequency domain elements are left unchanged. The complex frequency domain elements are then transformed back to the time domain via an inverse discrete Fourier transform in the IFFT block, producing the output signal.
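The frequency-domain variant can be sketched in Python as follows. This is an illustrative sketch only: it uses a direct (slow) discrete Fourier transform in place of an FFT, omits the windowing and any overlap between blocks, and takes the per-bin gains as an input rather than deriving them. The function names and the `half_gains` parameter are hypothetical.

```python
import cmath

def dft(x):
    """Direct discrete Fourier transform (O(n^2); stands in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def attenuate_block(block, half_gains):
    """Scale each bin's magnitude by a gain while leaving its phase unchanged.

    half_gains holds one gain for each of bins 0 .. len(block) // 2; the gains
    are mirrored onto the remaining bins so the output stays real-valued.
    """
    n = len(block)
    scaled = []
    for k, value in enumerate(dft(block)):
        gain = half_gains[min(k, n - k)]
        magnitude, phase = cmath.polar(value)
        scaled.append(cmath.rect(gain * magnitude, phase))
    # Discard the negligible imaginary parts left by rounding error.
    return [v.real for v in idft(scaled)]
```

With all gains equal to 1 the block is reproduced to within rounding error, which is a convenient sanity check on the transform pair.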
A Voice Activity Detector is part of many noise suppression systems. Generally, the power of the input signal is compared to a variable threshold level. Whenever the threshold is exceeded, speech is assumed to be present. Otherwise, the signal is assumed to contain only background noise. Such two-state voice activity detectors do not perform robustly under adverse conditions such as in cellular telephony environments. An example of a voice activity detector is described in reference [5].
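A two-state detector of the kind described above can be sketched in Python as follows. The description states only that the input power is compared to a variable threshold; the way the threshold is derived here (a fixed `margin` times a running noise-floor estimate), the smoothing constant `alpha`, and the `initial_noise` value are assumptions made for illustration.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def detect_voice(frames, margin=3.0, alpha=0.95, initial_noise=1e-3):
    """Return one True (speech) or False (noise only) decision per frame."""
    noise_floor = initial_noise
    decisions = []
    for frame in frames:
        power = frame_power(frame)
        # Variable threshold: follows the running noise-floor estimate.
        threshold = margin * noise_floor
        speech = power > threshold
        if not speech:
            # Adapt the noise floor only on frames judged to be noise.
            noise_floor = alpha * noise_floor + (1.0 - alpha) * power
        decisions.append(speech)
    return decisions
```

A detector this simple makes a hard decision from a single power measurement per frame, which is consistent with the observation that two-state detectors do not perform robustly when the background noise level is high or changes quickly.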
Various implementations of noise suppression systems utilizing spectral subtraction differ mainly in the methods used for power estimation, gain factor determination, spectral decomposition of the input signal and voice activity detection. A broad overview of spectral subtraction techniques can be found in reference [3]. Several other approaches to speech enhancement, as well as spectral subtraction, are overviewed in reference [4].
Spectral weighting functions can improve the performance of some adaptive noise cancellation systems. In the past, deficiencies in such weighting functions have limited the effectiveness of known noise cancellation systems. For example, U.S. Pat. No. 4,630,305 (Borth et al., issued Dec. 16, 1986) describes an automatic gain selector for a noise suppression system based on an overall average background noise level of an input signal (see the Abstract). This differs markedly from the present invention, which uses the normalized power of the noise signal component in one of the frequency bands into which the input signal is divided. This invention provides a solution not suggested by Borth et al.
The preferred embodiment is useful in a communication system for processing a communication signal comprising a speech component due to speech and a noise component due to noise. In such an environment, the preferred embodiment enhances the quality of the communication signal by dividing the communication signal into a plurality of frequency band signals representing the speech signal components and the noise signal components in a plurality of frequency bands, preferably by using a filter or a calculator employing, for instance, a Fourier transform. A plurality of weighting signals having weighting values derived from the frequency band signals are generated. The weighting values correspond to at least approximations of the normalized powers of the noise signal components in the frequency band signals. The frequency band signals are altered in response to the weighting signals to generate weighted frequency band signals. The weighted frequency band signals are combined to generate a communication signal with enhanced quality.
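The weighting values described above, taken as the normalized powers of the noise components in the frequency bands, can be sketched in Python as follows. The function name and the handling of an all-zero noise estimate are hypothetical. This passage does not specify how the weighting signals alter the frequency band signals, so the sketch stops at computing the weights.

```python
def normalized_noise_powers(noise_power):
    """Divide each band's noise power by the total noise power over all bands."""
    total = sum(noise_power)
    if total <= 0.0:
        # Assumed fallback: equal weights when no noise has been measured.
        return [1.0 / len(noise_power)] * len(noise_power)
    return [p / total for p in noise_power]
```

The resulting weights sum to one, so each value expresses the fraction of the total noise power that falls in the corresponding band.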
The calculations and signal generation described above can preferably be accomplished with a calculator.
By using the foregoing techniques, the weighting function needed to improve communication signal quality can be generated with a degree of ease and accuracy unattained by the known prior techniques.