1. Field of the Invention
The present invention relates to a method of frequency filtering implementing a Wiener filter.
It can be applied especially but not exclusively to noise suppression in sound signals containing speech picked up in noisy environments and more generally to noise suppression in all sound signals.
The main fields relate to telephone or radiotelephone communications, voice recognition, sound pick-up systems on civilian or military aircraft and, more generally, on all noisy vehicles, on-board intercommunications, etc.
As a non-restrictive example, in the case of an aircraft, noise results from the engines, the air-conditioning system, the ventilation of the on-board equipment or aerodynamic noise. All these noises are picked up, at least partially, by the microphone into which the pilot or any other member of the crew is speaking. Furthermore, for this type of application in particular, one of the characteristics of noises is that they are highly variable in time. Indeed, they are highly dependent on the operating conditions of the engines (take-off phase, stabilized state, etc.). The useful signals, namely the signals representing conversations, also have particular features: they are most usually short-lived.
Finally, whatever the application considered, if we look at the question of "voicing", it is possible to highlight certain particular features. As is known, voicing relates to elementary characteristics of portions of speech and more specifically to vowels as well as to some of the consonants: "b", "d", "g", "j", etc. These letters are characterized by an audiophonic signal with a pseudo-periodic structure.
In speech processing, it is common to consider that the stationary states, especially the above-mentioned voicing, are set up on durations of 10 to 20 ms. This time interval is characteristic of the elementary phenomena of the production of speech and shall hereinafter be called a frame.
It is therefore common for the noise-suppression methods to take account of this major characteristic of sound signals comprising speech.
These methods generally comprise the following main steps: a subdivision into frames of the audiophonic signal to be subjected to noise suppression; the processing of these frames by a Fourier transform (or similar transform) in order to go into the frequency domain; the noise-suppression processing operation proper, by means of digital filtering; and a processing operation, dual to the first one, that uses an inverse Fourier transform to return to the time domain. The final step consists of a reconstruction of the signal. This reconstruction may be obtained by multiplying each of the frames by a weighting window.
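The chain of steps described above can be sketched as follows. This is only an illustrative sketch, not the method of the invention: the function names (`dft`, `idft`, `process_by_frames`), the half-frame overlap, the choice of a periodic Hann weighting window and the `gain_fn` callback standing in for the digital filter are all assumptions made for the example.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time-domain sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def process_by_frames(signal, frame_len, gain_fn):
    """Split into half-overlapping frames, filter each one in the frequency
    domain, then rebuild the signal by windowed overlap-add."""
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        spectrum = dft(frame)
        gains = gain_fn(spectrum)  # one real gain per frequency channel
        filtered = idft([g * s for g, s in zip(gains, spectrum)])
        for i in range(frame_len):
            out[start + i] += window[i] * filtered[i]
    return out
```

With a `gain_fn` that returns a gain of 1 on every channel, the interior samples of the input are recovered unchanged, which is a convenient check that the framing and the weighting window are consistent. The first and last half-frames are covered by only one window and are therefore attenuated.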
One of the digital filters most commonly used for this type of application is the Wiener filter, especially a so-called optimal Wiener filter. This filter has the advantage of processing the successive frames in a differentiated way.
In other words, and more generally, the optimal Wiener filtering is at the center of the optimal signal processing methods based on second-order statistical characteristics and therefore on the notion of correlation.
Wiener filtering enables the separation of the signals by decorrelation. Its importance is related to the simplicity of the theoretical computations. Furthermore, it can be applied to a multitude of particular processes and especially, with regard to the preferred application aimed at by the invention, it can be applied to the removal of a noise that is polluting a speech signal.
2. Description of the Prior Art
However, in the prior art, a standard problem encountered during noise suppression by Wiener filtering is the presence of a noise, called a musical noise, that causes deterioration in the perception of the noise-suppressed signals, namely signals from which the noise has been cleared. This musical noise is due to the fluctuations of the spectral densities of the noise present in the input signal. For certain frames, indeed, the spectral density of the noise is greater, at least on one frequency channel, than that of the noise model used in these techniques. In this case, the mechanisms proper to the Wiener filtering prompt the appearance of a residual noise on the noise-suppressed signal. This residual noise is particularly unpleasant from the viewpoint of perception owing to its instability. Indeed, when listening to a speech signal, it is possible to distinguish residual noises in 'rumbles' similar to distortions that can be attributed to a high variability of the noise polluting the noise-suppressed speech signal or "useful" signal.
The invention is therefore aimed at overcoming the drawbacks of the prior art filtering methods, especially the main drawback that has just been recalled: the presence of parasitic residual noise in the noise-suppressed signal, known as "musical noise". The invention is aimed more generally, in its main application, at increasing the intelligibility of speech.
In order to strongly attenuate the effects of musical noise, the invention draws on the following two experimental observations:
the probability of musical noise is all the greater as the estimate of the spectral density of the noise is unstable from one frame to another;
the probability of the presence of musical noise is all the greater as the estimate of the spectral density of the noise is small in comparison to its real spectral density.
According to a major characteristic of the invention, the Wiener filter used for the digital filtering is modified in an optimized way by the introduction therein of an energy compensation term aimed at overestimating the noise level. Furthermore, this compensation term is adaptive.
An object of the invention therefore is a method of frequency filtering for the removal of noise from noisy sound signals formed by sound signals called useful signals mixed with noise signals, the method comprising at least one step for the subdivision of said sound signals into a series of identical frames of a specified length and a step for frequency filtering by means of a Wiener filter, wherein the method furthermore comprises the following steps:
the preparation, from said noisy signals, of a model of noise on a specified number N of said frames, N being included between minimum and maximum predetermined limits;
the application of a Fourier transform to said N frames;
the estimation, for each frame of said model, of the spectral density of this frame;
the estimation of the mean spectral density of said noise model;
the computation, on the basis of these two estimations, of a statistical overestimation coefficient, said statistical coefficient being equal to the maximum ratio, for said N frames of the noise model, between the maximum spectral density of a considered frame of said noise model and the maximum estimated spectral density of the noise model;
the estimation, for each frame of said signals to be noise-suppressed, namely cleared of noise, of its spectral density; and
the modification, for each frame of said signals to be noise-suppressed, of the coefficients of said Wiener filter so that the following relationship is verified:

$$W(\nu) = \left(1 - \alpha \cdot \mathrm{max} \cdot \frac{\gamma_x(\nu)}{\gamma_u(\nu)}\right)^{\beta}$$

wherein α and β are predetermined fixed coefficients known as a static energy compensation coefficient and an exponential attenuation coefficient respectively, ν describes all the frequency channels of said Fourier transform, γu(ν) is the estimate of the spectral density of the frame to be noise-suppressed, γx(ν) is said spectral density of the noise model and max is said statistical overestimation coefficient modifying the static coefficient of energy compensation α.
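The steps listed above, from the noise model to the modified filter coefficients, can be illustrated by the following sketch. It is a reading of the text under stated assumptions, not a definitive implementation: the function names are hypothetical, the spectral densities are assumed to be already available as lists of non-negative values (one per frequency channel), the overestimation coefficient is computed as the largest ratio of a frame's peak spectral density to the peak of the mean spectral density, and the clamping of negative values to a `floor` is an addition that the text does not specify.

```python
def mean_spectral_density(noise_psds):
    """Average the per-frame spectral densities of the N noise-model frames."""
    n_frames = len(noise_psds)
    return [sum(frame[k] for frame in noise_psds) / n_frames
            for k in range(len(noise_psds[0]))]


def overestimation_coefficient(noise_psds, mean_psd):
    """Largest ratio, over the N frames, of a frame's peak spectral density
    to the peak of the mean spectral density (one reading of the text)."""
    peak_of_mean = max(mean_psd)
    return max(max(frame) / peak_of_mean for frame in noise_psds)


def modified_wiener_gains(frame_psd, mean_psd, alpha, beta, max_coeff, floor=0.0):
    """W(v) = (1 - alpha * max * gamma_x(v) / gamma_u(v)) ** beta per channel.

    Clamping to `floor` is an assumption added here so that a negative base
    never reaches the exponent; the source does not specify this.
    """
    gains = []
    for gamma_u, gamma_x in zip(frame_psd, mean_psd):
        if gamma_u <= 0.0:
            # Empty channel: nothing to keep, and the ratio is undefined.
            gains.append(floor)
            continue
        base = 1.0 - alpha * max_coeff * gamma_x / gamma_u
        gains.append(base ** beta if base > 0.0 else floor)
    return gains
```

In this sketch, a channel whose estimated spectral density barely exceeds that of the overestimated noise model receives a gain close to the floor, while a channel dominated by the useful signal receives a gain close to 1.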