In telecommunication systems the voice or tonal signal arriving at the receiver often contains strong background noise. In some cases the signal is barely intelligible, and listening requires a disagreeably high level of concentration. The voice quality is degraded by interference arising from impairments in the transmission channel, such as transmission errors and multipath propagation. Powerful digital voice compression often contributes some noise as well.
Noise occurring in the telecommunication channel may be caused by poor channel quality and voice signal compression, but also by loud audible noise around the speaker at the transmitting end. For instance, in a moving car the motor and the tires generate strong background noise, which is picked up by a mobile telephone and heard along with the user's voice.
The background noise can be reduced on the transmitting side by using separate microphones to measure the interference signal and by subtracting its effect from the voice signal before the voice is transmitted to the information channel. However, it is difficult to measure the noise signal using separate microphones, because there are generally several noise sources, and their combined effect varies continuously. Thus methods using several microphones have not been able to provide good results.
The method in accordance with this invention is based on the more common situation in which no noise estimate measured by a second microphone is available. Instead, an estimate of the background noise is made directly from the noisy voice signal. An essential part of a noise suppression system of this kind is formed by the means which measure the magnitude of the background noise and eliminate its effect from the voice signal.
In order to make the noise suppression more effective, many previously presented noise suppression systems rely on splitting the voice signal into frequency components before the background noise is measured. Either a Fourier analysis is performed on the voice signal (EP-367803, U.S. Pat. No. 5,012,519), or the signal is split into spectral components by using a filter group comprising band-pass filters (U.S. Pat. No. 4,628,529; 4,630,304; 4,630,305; FI-80173). Spectral ranges separated in this way can be processed so that the noise content of the total signal is decreased. This is done by attenuating those bands in which the proportion of noise is large compared to the net signal, and then forming the total signal by assembling the processed subsignals. A method based on such band-pass filtering is described e.g. in the paper R. K. McAulay, M. L. Malpass, "Speech Enhancement Using a Soft-Decision Noise Suppression Filter", IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-28, No. 2, April 1980.
Prior art is described below with reference to FIG. 1, which shows the block diagram of a prior art noise attenuation system based on band splitting.
This system is based on splitting the voice signal into channels. Here the noisy signal 100 is supplied via a buffer 101 to a channel splitting filter group 102. The power of each channel is calculated in the channel power estimation block 103. The channel power can be estimated e.g. by full-wave rectification followed by suitable low-pass filtering. The block 105, controlled by the voice activity monitoring block 104, performs the background noise estimation based on the channel power estimates during pauses in the speech. The signal of each channel is multiplied in the multiplication block 106 by a coefficient calculated in the gain calculation block 107 based on the channel power estimate and the noise power estimate, so that channels having a low signal-to-noise ratio and containing much background noise are attenuated before the channels are assembled in the filter group 108 into a total voice band signal 109. The background noise estimates of the channels are obtained by finding pauses in the voice signal, which can be done e.g. by comparing the combined power value of the channels to an adaptively controlled limit value (U.S. Pat. No. 4,628,529).
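The channel power estimation of block 103 and the pause-gated noise estimation of block 105 can be illustrated with a minimal sketch. It is not taken from the cited patents. The function names, the one-pole low-pass filter, and the smoothing constants `alpha` and `beta` are assumptions chosen only for illustration, and the speech/pause decision of block 104 is assumed to be supplied from outside. Note that rectification followed by smoothing tracks the signal magnitude, which is used here as the channel "power" estimate in the sense of the text.

```python
def channel_power_estimate(samples, alpha=0.05):
    """Full-wave rectify one channel and smooth it with a one-pole low-pass.

    alpha is a hypothetical smoothing constant (0 < alpha <= 1).
    """
    estimate = 0.0
    out = []
    for x in samples:
        rectified = abs(x)                        # full-wave rectification
        estimate += alpha * (rectified - estimate)  # one-pole low-pass filter
        out.append(estimate)
    return out


def update_noise_estimate(noise, channel_power, speech_active, beta=0.1):
    """Update the channel noise estimate only during speech pauses.

    speech_active is the (assumed external) voice activity decision;
    beta is a hypothetical adaptation rate.
    """
    if speech_active:
        return noise                              # freeze estimate during speech
    return noise + beta * (channel_power - noise)
```

Because the noise estimate is frozen whenever speech is detected, a sketch of this kind cannot follow noise that changes while the user is talking.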
The background noise power estimation forms an essential part of the above-mentioned noise attenuation systems. The background noise power must be estimated in order to obtain the attenuation required on the different frequency bands. The attenuation coefficients can be calculated by a mathematical function of the signal power and the noise power, or by selecting discrete attenuation coefficients.
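The two ways of obtaining attenuation coefficients can be sketched as follows. Neither function is taken from the cited systems: the continuous rule shown is a simple power-subtraction gain with a lower limit, and the discrete rule is a lookup on the signal-to-noise ratio in decibels. The names `gain_continuous`, `gain_discrete`, `GAIN_TABLE`, the floor value and all table entries are hypothetical.

```python
import math

# Hypothetical lookup table: (minimum SNR in dB, gain), in ascending order.
GAIN_TABLE = ((0.0, 0.1), (6.0, 0.3), (12.0, 0.6), (18.0, 1.0))


def gain_continuous(signal_power, noise_power, floor=0.1):
    """Gain as a function of the powers: 1 - noise/signal, limited from below."""
    if signal_power <= 0.0:
        return floor
    return max(floor, 1.0 - noise_power / signal_power)


def gain_discrete(signal_power, noise_power, table=GAIN_TABLE):
    """Gain selected from a table according to the channel SNR in dB."""
    if noise_power <= 0.0:
        return table[-1][1]           # no measurable noise: largest gain
    if signal_power <= 0.0:
        return table[0][1]            # no signal: smallest gain
    snr_db = 10.0 * math.log10(signal_power / noise_power)
    gain = table[0][1]                # used when SNR is below every threshold
    for threshold_db, table_gain in table:
        if snr_db >= threshold_db:
            gain = table_gain
    return gain
```

In both variants a channel whose power is close to the noise estimate receives a small coefficient, while a channel dominated by the voice is passed almost unchanged.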
In previously presented systems the background noise estimation is made utilizing the speech pauses. These methods search for pauses in the speech, during which the signal represents only background noise, so that the background noise power can be measured. This can be done either directly, by monitoring the speech activity, whereby it is first decided whether the signal contains voice or only background noise and the background noise is then measured during the detected pauses, or indirectly, by taking as the noise power estimate the long-interval minimum of the power of the signal formed by voice and noise.
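The indirect approach can be sketched as a sliding minimum over the power estimates of the noisy signal. This is only an illustration of the principle; the function name and the `window` parameter (the number of most recent power estimates searched) are assumptions, and the cited systems may differ in detail.

```python
def long_interval_minimum(power_estimates, window):
    """Noise power estimate as the minimum of the last `window` power values.

    The minimum is assumed to be reached during a speech pause that falls
    inside the window, when only background noise is present.
    """
    noise = []
    for i in range(len(power_estimates)):
        start = max(0, i - window + 1)
        noise.append(min(power_estimates[start:i + 1]))
    return noise
```

The window must be long enough to contain at least one pause. This also means that a rise in the noise level is reflected in the estimate only after the older, lower values have left the window.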
It is difficult to reliably separate from the voice those signal periods which contain only noise. It is particularly difficult to distinguish the background noise from toneless sounds, which resemble noise. Moreover, methods which estimate the background noise with the aid of speech pauses do not continuously monitor changes in the background noise underlying the voice signal, and therefore they are not able to react rapidly and cannot eliminate rapidly changing noise.
In order to eliminate noise from the voice signal, a so-called comb filtering method (U.S. Pat. No. 4,852,169), based on the periodic nature of tonal sounds, has also been presented. The comb filtering method realizes a filter which in the frequency domain is toothed at intervals of the basic frequency. The principle of its use is to pass the spectral peaks, which are located at intervals of the basic frequency and contain the information content of a tonal sound, and to block the frequency components between the teeth, which contain only noise.
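The comb filtering principle can be illustrated with the simplest feedforward comb, which averages each sample with the sample one basic period earlier. This is a generic textbook structure and not necessarily the filter of the cited patent; the function name, the period of 8 samples and the test signals are assumptions made for the example.

```python
import math


def comb_filter(samples, period):
    """Feedforward comb: y[n] = 0.5 * (x[n] + x[n - period]).

    Components whose period divides `period` samples pass unchanged;
    components halfway between two teeth are cancelled.
    Samples before the start of the signal are taken as zero.
    """
    out = []
    for n, x in enumerate(samples):
        delayed = samples[n - period] if n >= period else 0.0
        out.append(0.5 * (x + delayed))
    return out


period = 8          # assumed basic period in samples
n_samples = 64
# A component at the basic frequency (on a tooth of the comb).
on_tooth = [math.sin(2 * math.pi * n / period) for n in range(n_samples)]
# A component at half the basic frequency (between two teeth).
between_teeth = [math.sin(math.pi * n / period) for n in range(n_samples)]

passed = comb_filter(on_tooth, period)        # equals the input after `period` samples
blocked = comb_filter(between_teeth, period)  # close to zero after `period` samples
```

The sketch presupposes that `period` matches the basic period of the voice exactly.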
The difficulty in the use of a comb filter is that the basic frequency of the voice signal must be known in order to realize the filter. Measuring the basic frequency is a difficult task, because it changes constantly during speech. If the basic frequency is not estimated correctly and accurately, the comb filter will not function in the desired way. Thus comb filters do not give particularly good results in separating the voice signal from the noise.