In a typical speech processing system, a microphone is used to pick up the speech. The microphone produces an analog signal corresponding to the acoustic vibrations it receives. However, some environments are so noisy that the speech in the microphone input signal cannot be understood. The noise in these extremely noisy environments may also change constantly, which makes it very difficult to filter. Cellular telephones, cordless telephones, and mobile radios are frequently used in environments with such high noise levels.
One technique for discerning speech in these extremely noisy environments is to use two input sensors, such as two microphones or a microphone and an accelerometer. The inputs from the two sensors are filtered in analog filters, weighted, and combined to produce an enhanced speech signal. See, for example, Viswanathan et al., "Noise Immune Speech Transduction Using Multiple Sensors," IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. ICASSP-85, pp. 19.1.1-19.1.4, March 1985. Another technique uses a first microphone, placed in proximity to the speaker, to recover a speech signal having a large noise component. A second microphone is physically isolated from the speaker so as to recover primarily the noise but not the speech. An adaptive filter subtracts the noise component from the input of the first microphone in order to recover the speech signal with an improved signal-to-noise ratio (SNR). See S. Boll and D. Pulsipher, "Suppression of Acoustic Noise in Speech Using Two Microphone Adaptive Noise Cancellation," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-28, no. 6, pp. 752-754, December 1980. While both of these techniques are able to improve the SNR in extremely noisy environments, more improvement is desirable.
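The two-microphone arrangement can be illustrated with a minimal sketch of a least-mean-squares (LMS) noise canceller. This is an illustration under stated assumptions, not the implementation from the cited reference: the function names, the tap count `num_taps`, the step size `mu`, and the synthetic test signals are all hypothetical choices made for the example.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=8, mu=0.01):
    """Subtract an adaptively filtered noise reference from the primary input.

    primary   -- samples from the microphone near the speaker (speech + noise)
    reference -- samples from the isolated microphone (primarily noise)
    num_taps, mu -- illustrative values, not taken from the cited reference
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    enhanced = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - noise_estimate  # error signal doubles as the enhanced speech
        # LMS update: move each weight along the error/input correlation.
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        enhanced.append(e)
    return enhanced


def synthetic_two_sensor_signals(num_samples=4000, seed=0):
    """Hypothetical test data: a sine tone stands in for speech."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    speech = [0.5 * math.sin(2.0 * math.pi * 0.01 * n) for n in range(num_samples)]
    # The primary microphone hears the speech plus a filtered copy of the noise.
    primary = [
        speech[n] + 0.8 * noise[n] + (0.3 * noise[n - 1] if n > 0 else 0.0)
        for n in range(num_samples)
    ]
    return speech, noise, primary


def mean_square(values):
    """Average power of a sequence of samples."""
    return sum(v * v for v in values) / len(values)
```

Running `lms_noise_canceller(primary, noise)` on the synthetic signals and comparing the residual noise power before and after, once the weights have settled, gives a rough picture of the SNR improvement this kind of filter provides.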
In addition, if adaptive filtering is used, it is impossible to arrive at an optimum filter response using conventional adaptive filtering techniques. The result is that the filter is sometimes over-responsive and sometimes under-responsive. An adaptive filter with a least-mean-squares (LMS) predictor, such as the filter used by Boll and Pulsipher, has this problem. A known variant of the LMS technique, the normalized LMS (NLMS) predictor, also has this problem. The NLMS predictor is able to compensate for large changes in signal power by normalizing filter coefficients in relation to the magnitude of the expected signal power. Thus, for example, the NLMS predictor can adapt the filter coefficients at large signal power as accurately as at low signal power. However, the responsiveness of the NLMS predictor depends on the value of a smoothing parameter β, which ranges from 0 to 1. There is a tradeoff in filter responsiveness depending on the value of β. If β is too small, i.e., too far below 1, then the filter is over-responsive, leading to unstable response. If β is too large, i.e., too close to 1, however, the filter is under-responsive, and rapid changes in the input signal power are reflected in the output only very slowly. Thus, both a speech processing system which works well in extremely noisy environments and an adaptive filter which has better responsiveness are needed.
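The tradeoff governed by the smoothing parameter can be made concrete with a short sketch. The text does not give the power-estimate formula, so the code assumes one common form, a first-order recursive average `p[n] = beta * p[n-1] + (1 - beta) * x[n]**2`, where `beta` is the smoothing parameter. All function names, the step size `mu`, and the tap count are hypothetical and chosen only for illustration.

```python
import random


def update_power(prev_power, sample, beta):
    """One step of the assumed smoothed power estimate (0 <= beta < 1)."""
    return beta * prev_power + (1.0 - beta) * sample * sample


def samples_to_track_step(beta, low=1.0, high=10.0, fraction=0.9, limit=100000):
    """Count samples until the estimate reaches `fraction` of the new power
    after the input amplitude jumps from `low` to `high`."""
    power = low * low  # estimate has settled at the old power level
    target = fraction * high * high
    for n in range(1, limit + 1):
        power = update_power(power, high, beta)
        if power >= target:
            return n
    return limit


def estimate_spread(beta, num_samples=5000, seed=0):
    """Variance of the power estimate for stationary unit-variance noise."""
    rng = random.Random(seed)
    power = 1.0
    values = []
    for _ in range(num_samples):
        power = update_power(power, rng.gauss(0.0, 1.0), beta)
        values.append(power)
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def nlms(primary, reference, beta, num_taps=4, mu=0.1, eps=1e-8):
    """NLMS sketch: the LMS step is divided by the smoothed input power."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    power = eps
    errors = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        power = update_power(power, x, beta)
        e = d - sum(w * h for w, h in zip(weights, history))
        step = mu / (num_taps * power + eps)  # eps guards against division by zero
        weights = [w + step * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights
```

Under this assumed form, a small `beta` follows a jump in input power within a few samples but yields a noisy power estimate and therefore an erratic step size, while a `beta` close to 1 yields a smooth estimate that lags far behind the jump. Comparing `samples_to_track_step` and `estimate_spread` for, say, `beta=0.5` and `beta=0.99` shows both sides of the tradeoff.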