This invention relates to a system for the suppression of noise, an accompanying method and a transceiver.
In many cases, noise corrupts a speech signal and hence significantly degrades the quality of recognition of the speech signal. An example of such noise is background noise intermingled with the speech signal acquired by a microphone, a hands-free phone, a handset or the like.
It is important to recognize speech in a noisy environment, e.g. a night club, a sports club, a karaoke room, a hands-free communication system in a vehicle, especially a car, a helicopter, a tank or the like. Furthermore, noise suppression is useful in a live reporting system, a public address system or the like.
The recognition of speech or voice can be done by an automatic speech recognition system or by at least one human listener.
The undesirable background noise can come from different sources. For example, when a telephone call is made from a moving car, the driving noise, especially the noise of the engine, is a dynamically varying kind of noise that results in poor recognition of the speech, particularly in a hands-free speaking environment of the car. The addressee constantly hears a contaminated acoustic signal, in which the voice of the driver is included but difficult to understand. As a consequence, the driver has to speak up or pick up the handset of the telephone, which draws his attention to the handset and away from the traffic, a very undesirable effect. Another scenario relates to signals from an audio system, which act as audio noise intermingled with the speech and worsen its recognition.
Moreover, there are many sites that need better recognition and/or better understanding of speech because of a noisy background. Some sites, in addition to the above-mentioned scenarios, are: airplanes, helicopters, airports, trains, buses, train stations, bus stops, construction sites, highways, streets or the like.
In [1] a concept and basic approach for adaptive noise cancellation are given. It can be used to eliminate background noise and improve the signal-to-noise ratio (SNR). To this end, two inputs are used: a primary input containing the corrupted signal, and a reference input containing noise that is correlated in some unknown way with the primary noise. The reference input is adaptively filtered and subtracted from the primary input to obtain the signal estimate. Adaptive filtering before subtraction allows the treatment of inputs that are deterministic or stochastic, stationary or time variable. Wiener solutions are developed to describe asymptotic adaptive performance and output SNR for stationary stochastic inputs, including single and multiple reference inputs. These solutions show that, when the reference input is free of signal and certain other conditions are met, noise in the primary input can be essentially eliminated without signal distortion. Further, it is shown that in treating periodic interference, the adaptive noise canceler acts as a notch filter with narrow bandwidth, infinite null, and the capability of tracking the exact frequency of the interference; in this case, the canceler behaves as a linear, time-invariant system, with the adaptive filter converging on a dynamic rather than a static solution.
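The structure described in [1] (adaptively filter the reference input, subtract the result from the primary input, and use the difference as the signal estimate) can be illustrated with a short sketch. The example below is only a minimal illustration under stated assumptions: it uses a plain LMS weight update, and the function name `lms_noise_canceller`, the tap count, the step size and the synthetic test data are all hypothetical choices made for this illustration, not values taken from [1].

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=4, step_size=0.01):
    """Return (signal_estimate, final_weights) for an LMS noise canceller.

    Assumed parameters: num_taps and step_size are illustrative only.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent reference samples, newest first
    signal_estimate = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        # Adaptive filter output: estimate of the noise in the primary input.
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        # Subtracting it leaves the signal estimate, which also drives adaptation.
        error = d - noise_estimate
        weights = [w + 2.0 * step_size * error * h
                   for w, h in zip(weights, history)]
        signal_estimate.append(error)
    return signal_estimate, weights


# Synthetic data (assumed for illustration): a sine as the wanted signal,
# white noise as the reference, and an unknown two-tap path into the primary.
rng = random.Random(0)
n = 4000
signal = [0.5 * math.sin(2.0 * math.pi * 0.05 * k) for k in range(n)]
reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise = [0.8 * reference[k] + (0.3 * reference[k - 1] if k > 0 else 0.0)
         for k in range(n)]
primary = [s + v for s, v in zip(signal, noise)]

estimate, weights = lms_noise_canceller(primary, reference)

# Compare noise power in the last 1000 samples before and after cancellation.
tail = range(n - 1000, n)
noise_power_before = sum(noise[k] ** 2 for k in tail) / 1000
noise_power_after = sum((estimate[k] - signal[k]) ** 2 for k in tail) / 1000
print(noise_power_before, noise_power_after)
```

In this sketch the reference contains no signal component, which corresponds to the condition named above under which the noise in the primary input can be largely removed without distorting the signal.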
In [2] a voice operated switch in a noisy environment is described. This switch is capable of distinguishing between voice and non-voice (noise).
In [3] an approach is presented that improves on the basic idea of [1] by eliminating cross-talk effects between noise and speech signals.
In [4] an adaptive noise suppressing device is introduced. Here, the characteristics of an adaptive filter are adjusted automatically depending on variations of the input signal.
In [5] a system is disclosed that uses two specially built microphones with good near-field response and poor far-field response to produce signals whose noise components are highly correlated.
Document [6] uses a filter bank for band-dividing the input signal from the main microphone and the second noise component from the reference microphone, and a noise cancelling circuit for obtaining a phase difference between the input signal and the second noise component with respect to each divided band of the filter bank so as to correct the input signal based on the phase difference and for cancelling the first noise component in the input signal by use of the corrected input signal.
In [7] Hunt adopts an adaptive filtering technique that uses the power spectra in both channels, i.e. in a speech channel and in a reference channel. When speech is not present in the speech channel, these spectra are used to obtain a relationship between the environmental noise power spectra in the two channels. When speech is present in the speech channel, a prediction of the environmental noise power spectrum on that channel is obtained from the power spectrum of the noise on the reference channel and the previously obtained relationship between the noise power spectra on the two channels.
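The two phases described for [7] can be sketched as follows. This is a simplified illustration, not the method of [7]: it assumes that the "relationship" between the two channels is a fixed per-band power ratio estimated from speech-free frames, whereas [7] obtains it adaptively. The function names and the final subtraction step are likewise hypothetical additions for this example.

```python
def estimate_band_ratios(speech_ch_noise_frames, ref_ch_noise_frames, floor=1e-12):
    """Speech-free phase: per-band ratio of noise power, speech channel / reference.

    Each frame is a list of per-band power values (assumed representation).
    """
    num_bands = len(speech_ch_noise_frames[0])
    ratios = []
    for b in range(num_bands):
        p_speech = sum(frame[b] for frame in speech_ch_noise_frames)
        p_ref = sum(frame[b] for frame in ref_ch_noise_frames)
        ratios.append(p_speech / max(p_ref, floor))
    return ratios


def predict_noise_spectrum(ref_frame, ratios):
    """Speech phase: predict speech-channel noise power from the reference channel."""
    return [r * p for r, p in zip(ratios, ref_frame)]


def subtract_noise(speech_frame, predicted_noise):
    """Assumed follow-up step: power subtraction, clamped at zero."""
    return [max(p - q, 0.0) for p, q in zip(speech_frame, predicted_noise)]


# Two speech-free frames with two bands each (made-up numbers).
ratios = estimate_band_ratios([[2.0, 4.0], [4.0, 8.0]],
                              [[1.0, 1.0], [2.0, 2.0]])
predicted = predict_noise_spectrum([3.0, 0.5], ratios)
cleaned = subtract_noise([10.0, 1.5], predicted)
print(ratios, predicted, cleaned)
```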
In [8] a method for adjusting the update step size of the adaptive filter is proposed, so that the system has a better tracking ability while the desired speech is absent and a smaller residual noise while the desired speech is present.
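The trade-off behind such a step size adjustment can be sketched with a single LMS weight update. The sketch assumes a simple two-valued rule (a larger step while speech is absent, a smaller one while it is present) and a speech-presence flag supplied from outside; the function name `lms_update`, both step values and the rule itself are hypothetical and are not taken from [8].

```python
def lms_update(weights, history, primary_sample, speech_present,
               step_no_speech=0.05, step_speech=0.005):
    """One LMS update with a step size chosen by speech presence (assumed rule).

    history holds the most recent reference samples, newest first.
    """
    # Larger step: faster tracking of the noise path while speech is absent.
    # Smaller step: less weight jitter, hence less residual noise, during speech.
    step = step_speech if speech_present else step_no_speech
    noise_estimate = sum(w * h for w, h in zip(weights, history))
    error = primary_sample - noise_estimate
    new_weights = [w + 2.0 * step * error * h
                   for w, h in zip(weights, history)]
    return new_weights, error


# Same input, two different speech-presence decisions.
w_fast, _ = lms_update([0.0], [1.0], 1.0, speech_present=False)
w_slow, _ = lms_update([0.0], [1.0], 1.0, speech_present=True)
print(w_fast, w_slow)
```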
All of the above cited documents share the disadvantage that certain kinds of noise, e.g. noise from a machine or from a loudspeaker, are not considered in an appropriate and favourable way.