The present invention relates to a method for reducing interference in acoustic signals using an adaptive filtering method involving spectral subtraction.
Use of an adaptive filtering method involving spectral subtraction for reducing interference is described, for example, in Boll, "Suppression of Acoustic Noise in Speech using Spectral Subtraction"; IEEE Trans. Acoust., Speech, and Signal Processing, Vol. ASSP-27, No. 2, p. 113-120, 1979.
The improvement of speech signals is a central topic of current research in communications technology, for example in applications such as hands-free talking in vehicles or automatic speech recognition. To improve speech signals, it is above all essential to reduce disturbing noise.
A method frequently used for reducing noise is "spectral subtraction", whose basic principles are described, for example, in Boll, supra.
The spectral subtraction is an adaptive filter which ascertains (learns) an average value of the noise spectrum during speech pauses, and continually subtracts this spectrum from the disturbed speech signal. The exact embodiment of the subtraction of the interference spectrum can be varied depending on the requirement. Individual examples are depicted in the following.
As a rule, the filtering method of spectral subtraction is carried out in the frequency domain. The signals are transformed segment by segment into the frequency domain by an FFT (Fast Fourier Transform). The corresponding segments of the signal in the time domain overlap by half and are multiplied by a Hanning window beforehand. After the filtering (multiplication) and subsequent inverse transformation, the synthesis is carried out using the "overlap-add method".
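The analysis and synthesis chain described above can be sketched as follows. This is a minimal illustration, not the implementation of the method: the function names, the segment length of 8 samples, and the use of a naive DFT in place of an FFT are assumptions made to keep the example short and dependency-free. It relies on the fact that half-overlapped periodic Hanning windows sum to one, so that a filter with all gains equal to 1 reproduces the input in the fully overlapped region.

```python
import cmath
import math


def hann(length):
    """Periodic Hanning window; copies shifted by length/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def dft(x, inverse=False):
    """Naive O(n^2) DFT, used here only for brevity (an FFT yields the same values)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def filter_overlap_add(signal, gains, length=8):
    """Window, transform, multiply by per-bin gains, invert, and overlap-add.

    gains[i] plays the role of H(k,i); for a real output it should satisfy
    gains[i] == gains[length - i]. Here the same gains are used for every segment.
    """
    hop = length // 2  # half overlap
    window = hann(length)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - length + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(length)]
        spectrum = dft(segment)
        filtered = [gains[i] * spectrum[i] for i in range(length)]
        restored = dft(filtered, inverse=True)
        for n in range(length):
            out[start + n] += restored[n].real
    return out
```

The first and last half segment are covered by only one window and are therefore attenuated; a practical implementation would pad the signal accordingly.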
In Linhard, "Adaptive Geräuschreduktion im Frequenzbereich bei Sprachübertragung"; Dissertation Universität Karlsruhe, 1988 [Adaptive Noise Reduction in the Frequency Domain During Speech Transmission; dissertation, University of Karlsruhe, 1988], three standard filter curves are depicted as exemplary embodiments for the spectral subtraction:
Power Subtraction: $$H(k,i)=\max\bigl(b,\ \sqrt{1-\alpha\cdot\mathrm{NIR}}\bigr) \tag{1}$$
Wiener Filter: $$H(k,i)=\max\bigl(b,\ (1-\alpha\cdot\mathrm{NIR})\bigr) \tag{2}$$
Magnitude Subtraction: $$H(k,i)=\max\bigl(b,\ (1-\alpha\cdot\sqrt{\mathrm{NIR}})\bigr) \tag{3}$$
k and i designate the discrete time and the discrete frequency. NIR is the noise-input ratio.
$$\mathrm{NIR}=\frac{E[N(i)^2]}{\bigl(S(k,i)+N(k,i)\bigr)^2} \tag{4}$$
S and N designate the speech signal and the interference, respectively; α is an overestimation factor by which the noise can be overestimated, and b is the "spectral floor", which represents the minimum of the filtering function. Here, it is assumed that the speech pauses can be detected sufficiently accurately. Consequently, it is possible to calculate the estimate E[N(i)²] and, from that, NIR. Simple standard methods use values 1 ≤ α < 4 and 0.1 < b < 0.3 for reducing the remaining residual noise, the so-called "musical tones". A disadvantage in doing this, however, is always an undesired but inevitable compromise between residual noise suppression and speech distortion. A suppression of the "musical tones" which is markedly improved compared to the method depicted in Linhard, supra, is proposed in Ephraim, Malah, "Speech Enhancement using a Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator"; IEEE Trans. Acoust., Speech, and Signal Processing, Vol. ASSP-32, No. 6, p. 1109-1121, 1984, which is hereby incorporated by reference herein. There, information on an a priori (earlier) and an a posteriori (later) signal-to-noise ratio is utilized for modifying the filter curves, which in that case are Bessel functions. The a priori and a posteriori signal-to-noise ratios Rprio and Rpost are calculated there as
$$X(k,i)=S(k,i)+N(k,i) \tag{5}$$
$$R_{\mathrm{post}}(k,i)=\frac{|X(k,i)|^2}{E[N(i)^2]}-1 \tag{6}$$
$$R_{\mathrm{prio}}(k,i)=(1-d)\,P[R_{\mathrm{post}}(k,i)]+d\,\frac{|H(k-1,i)\,X(k-1,i)|^2}{E[N(i)^2]} \tag{7}$$
where d is a smoothing constant with 0.99 < d < 1, and P[ ] is a projection by which negative components are set to zero. By selecting d close to one, the settling of the filter onto the onset of a high-energy speech signal is slowed down. Projection P results in a smoothing out of the residual noise during speech pauses. However, this is not required for preventing musical tones, and may have an unnatural effect. Moreover, the effort required for implementing this method is considerable and, in the case of speech signals, an audible reverberation characteristic may occur. The reverberation characteristic ensues from the fact that H(k−1,i) and X(k−1,i) from previous segment k−1 enter into the current filter curve at instant k via Rprio.
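Equations (6) and (7) can be illustrated with a short sketch. The function and parameter names are chosen for illustration only, the value d = 0.995 is merely one value from the stated range, and the gain H is assumed to be real so that |H·X|² equals H²·|X|². The sketch makes visible that the a priori estimate for segment k depends on both the gain and the disturbed signal of segment k−1.

```python
def projection(value):
    """P[ ]: negative components are set to zero."""
    return max(value, 0.0)


def a_posteriori_snr(x_power, noise_power):
    """Equation (6): Rpost = |X(k,i)|^2 / E[N(i)^2] - 1."""
    return x_power / noise_power - 1.0


def a_priori_snr(x_power, prev_gain, prev_x_power, noise_power, d=0.995):
    """Equation (7), with a real-valued previous gain H(k-1,i).

    x_power      -- |X(k,i)|^2 of the current segment
    prev_gain    -- H(k-1,i)
    prev_x_power -- |X(k-1,i)|^2 of the previous segment
    noise_power  -- E[N(i)^2], estimated during speech pauses
    """
    r_post = a_posteriori_snr(x_power, noise_power)
    fed_back = (prev_gain ** 2) * prev_x_power / noise_power
    return (1.0 - d) * projection(r_post) + d * fed_back
```

Because d is close to one, the second term dominates, which is the coupling to the previous segment that the text identifies as the source of the reverberation characteristic.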
Therefore, an object of the present invention is to provide a method which, on the one hand, allows interference in acoustic signals, particularly in speech signals, to be markedly reduced using the adaptive filtering method of spectral subtraction without causing a substantial corruption of the signal such as reverberation, and which, on the other hand, allows the computational requirement to be considerably reduced relative to known methods of comparable quality of signal improvement.
The present invention provides a method for reducing interference in acoustic signals using an adaptive filtering method involving spectral subtraction. According to the present invention, the current characteristic value H(k,i) of the filtering function is calculated, taking into account information on an a priori signal-to-noise ratio, in such a manner that characteristic values H(k−j,i), j = 1, ..., N, of the filtering function from preceding time segments k−j are used as the sole information on the a priori signal-to-noise ratio, at least one characteristic value H(k−j0,i), j0 ∈ {1, ..., N}, of the filtering function from a preceding time segment k−j0 being used; and the characteristic curve of the filtering function is split into two parts and has a break edge such
that the filtering for heavily disturbed signals X(k,i) having a high noise-input ratio NIR(k,i) results in a signal-independent strong damping; and
that the filtering for slightly disturbed signals X(k,i) having a low noise-input ratio NIR(k,i) results in a signal-dependent low damping.
The advantages of such an embodiment are as follows. First, the acoustic quality of the noise-suppressed signal is improved to a greater extent than in the method described in Ephraim, supra. Only one or a plurality of characteristic values H(k−j,i) are fed back for considering information preceding in time, in contrast to the feeding back of both characteristic value H(k−1,i) and disturbed signal X(k−1,i) proposed in Ephraim, supra. According to the present invention, H and X are decoupled or decorrelated by considering H(k−j,i) and X(k,i) at different instants k−j and k, as a result of which reverberation and echoes are minimized. During time segments having a high noise-input ratio NIR(k,i), for example background noises during speech pauses, the signals are damped independently of the signal but reproduced naturally, whereas in Ephraim, supra, they are smoothed and corrupted in such a manner that they sound unnatural. Furthermore, the settling of the characteristic curve onto the onset of a signal takes place markedly faster than in Ephraim, supra, where the settling is strongly slowed down by introducing smoothing constant d and setting its value close to 1. Second, the computational requirement is considerably smaller than in the method described in Ephraim, supra, because the calculation of the a posteriori signal-to-noise ratio is dropped; because the consideration of the a priori signal-to-noise ratio is considerably simplified by dropping the smoothing and the projection; and because during time segments in which the signals have a high noise-input ratio NIR(k,i), no signal-dependent filter curve value is calculated at all, the value simply being fixed to a signal-independent value.
In an advantageous embodiment of the present invention regarding the method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction, characteristic value H(k−1,i) of the filtering function from immediately preceding time segment k−1 is used as the sole information on the a priori signal-to-noise ratio.
Advantages of this embodiment include that it already allows a high-quality reduction of interferences to be achieved, and that the computational requirement for carrying out the method is minimal.
In a further advantageous embodiment of the present invention regarding the method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction, current characteristic value H(k,i) of the filtering function is calculated from signal-dependent noise-input ratio NIR(k,i), and the information on the a priori signal-to-noise ratio is considered in such a manner that noise-input ratio NIR(k,i) is replaced with a corrected noise-input ratio
$$\mathrm{NIR}'(k,i) := \frac{\mathrm{NIR}(k,i)}{\sum_{j=1}^{N} w_j\,H(k-j,i)} \tag{8}$$
prior to calculating current characteristic value H(k,i), weighting factors w_j being real numbers smaller than 1, and N being a natural number greater than or equal to 1.
The advantages of this embodiment are that it allows a high-quality reduction of interferences to be achieved, and that the computational requirement for carrying out the method is very small.
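The correction in equation (8) amounts to a single division per frequency bin, which can be sketched as follows. The function name and the example weights are illustrative assumptions; the sketch further assumes positive weights and gains bounded below by a positive spectral floor, so that the denominator cannot become zero.

```python
def corrected_nir(nir, previous_gains, weights):
    """Equation (8): NIR'(k,i) = NIR(k,i) / sum_j w_j * H(k-j,i).

    previous_gains -- [H(k-1,i), ..., H(k-N,i)]
    weights        -- [w_1, ..., w_N], each smaller than 1
    """
    if len(previous_gains) != len(weights):
        raise ValueError("one weight is needed per fed-back gain value")
    feedback = sum(w * h for w, h in zip(weights, previous_gains))
    return nir / feedback
```

For example, `corrected_nir(0.1, [0.5, 0.8], [0.6, 0.3])` divides 0.1 by 0.54. Small preceding gains enlarge the corrected ratio and thus push the filter toward stronger damping, while preceding gains near one leave the ratio almost unchanged.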
In a further advantageous embodiment of the present invention regarding the method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction,
$$H(k,i)=\max\bigl(b,\ \sqrt{1-\alpha\cdot\mathrm{NIR}'(k,i)}\bigr),\ \text{or} \tag{9}$$
$$H(k,i)=\max\bigl(b,\ (1-\alpha\cdot\mathrm{NIR}'(k,i))\bigr),\ \text{or} \tag{10}$$
$$H(k,i)=\max\bigl(b,\ (1-\alpha\cdot\sqrt{\mathrm{NIR}'(k,i)})\bigr) \tag{11}$$
are used as filtering function;
α and b being positive real numbers,
α preferably being an element of the interval from 1 to 4, and
b preferably being an element of the interval from 0.1 to 0.3.
Advantages of this embodiment include that it allows a high-quality reduction of interference to be achieved, and that the computational requirement for carrying out the method is considerably less than, for example, when using the Bessel functions proposed in Ephraim, supra. Above all when reducing interference in speech signals, it has turned out to be beneficial to select parameters α and b from the mentioned intervals.
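The filter functions (9) to (11) and the effect of feeding back the preceding gain can be sketched as follows. All names are illustrative. The inner `max(0.0, ...)` in the power-subtraction variant is an added safeguard for the square root and is not part of equation (9). The helper `settle_gain` is a hypothetical experiment, not part of the described method: it holds NIR constant, uses a single fed-back value H(k−1,i) with an assumed weight of 0.9, and starts from an assumed gain of 1.

```python
import math


def gain_power(nir_c, alpha, floor):
    """Equation (9): max(b, sqrt(1 - alpha * NIR'))."""
    return max(floor, math.sqrt(max(0.0, 1.0 - alpha * nir_c)))


def gain_wiener(nir_c, alpha, floor):
    """Equation (10): max(b, 1 - alpha * NIR')."""
    return max(floor, 1.0 - alpha * nir_c)


def gain_magnitude(nir_c, alpha, floor):
    """Equation (11): max(b, 1 - alpha * sqrt(NIR'))."""
    return max(floor, 1.0 - alpha * math.sqrt(nir_c))


def settle_gain(nir, gain_fn, alpha=1.0, floor=0.2, weight=0.9, segments=50):
    """Iterate the gain over several segments for a constant NIR (N = 1)."""
    gain = 1.0  # assumed starting value
    for _ in range(segments):
        nir_c = nir / (weight * gain)  # equation (8) with a single term
        gain = gain_fn(nir_c, alpha, floor)
    return gain
```

With these assumed values and the Wiener form, a lightly disturbed bin with NIR = 0.09 settles at a signal-dependent gain of about 0.89, whereas a heavily disturbed bin with NIR = 0.5 drops to the signal-independent floor b = 0.2 within a few segments. This is one way to observe the two-part characteristic with a break edge described in the text.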
In a further advantageous embodiment of the present invention regarding the method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction, the position of the break edge of the filter curve is adapted to the disturbed signal, preferably in such a manner that the position of the break edge during the filtering of signals having a high frequency differs from the position of the break edge during the filtering of signals having a lower frequency and/or that the position of the break edge during the filtering of speech signals differs from the position of the break edge during the filtering of speech pauses.
In the case of speech signals, the higher frequencies have on average less energy than the lower frequencies. However, the higher frequencies play an important part in the intelligibility of speech. By selecting the position of the break edge, it is possible to give preference to higher frequencies, for example by damping them to a lesser degree, which contributes to the improvement of the subjective quality of speech.
In a further advantageous embodiment of the present invention regarding the method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction, the position of the break edge of the filter curve is adapted to the disturbed signal
in such a manner that noise-input ratio NIR(k,i) is replaced with a corrected noise-input ratio
$$\mathrm{NIR}'(k,i) := \frac{\mathrm{NIR}(k,i)}{c(i)+\bigl(1-c(i)\bigr)\sum_{j=1}^{N} w_j\,H(k-j,i)} \tag{12}$$
prior to calculating current characteristic value H(k,i), weighting factors w_j being real numbers smaller than 1, and N being a natural number greater than or equal to 1;
preferably in such a manner that noise-input ratio NIR(k,i) is replaced with a corrected noise-input ratio
$$\mathrm{NIR}'(k,i) := \frac{\mathrm{NIR}(k,i)}{c(i)+\bigl(1-c(i)\bigr)\,H(k-1,i)} \tag{13}$$
prior to calculating the current characteristic value H(k,i).
Advantages of this embodiment include that it allows the above-mentioned displacement of the position of the break edge to be attained in a simple manner, in particular in the second-mentioned preferred embodiment.
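The role of the frequency-dependent term c(i) in equation (13) can be made concrete with a short sketch. The function names are illustrative, and the sketch assumes that c(i) lies between 0 and 1, which the text does not state explicitly. Under that assumption, c(i) = 1 removes the feedback entirely and c(i) = 0 gives full feedback of H(k−1,i).

```python
def corrected_nir_eq13(nir, previous_gain, c):
    """Equation (13): NIR'(k,i) = NIR(k,i) / [c(i) + (1 - c(i)) * H(k-1,i)]."""
    return nir / (c + (1.0 - c) * previous_gain)


def corrected_nir_per_bin(nir_bins, previous_gain_bins, c_bins):
    """Apply equation (13) to every frequency bin i with its own c(i)."""
    return [
        corrected_nir_eq13(nir, gain, c)
        for nir, gain, c in zip(nir_bins, previous_gain_bins, c_bins)
    ]
```

Choosing a larger c(i) for the higher bins weakens the feedback there, which is one conceivable way of damping higher frequencies to a lesser degree.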
In a further advantageous embodiment of the present invention regarding the method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction, characteristic filter value or values H(k−j,i) from preceding time segments k−j required for calculating current corrected noise-input ratio NIR′(k,i) are initially corrected themselves in the form
$$H'(k-j,i) := f_j\,H(k-j,i)^{e_j},\quad f_j \text{ and } e_j \text{ real numbers} \tag{14}$$
prior to calculating corrected noise-input ratio NIR′(k,i).
Speech quality is a subjective concept which can be given attributes such as naturalness, freedom from distortion, freedom from noise, low-fatigue listening, etc. A disturbing noise can have very different temporal and/or spectral characteristics, depending on its type. A parametrization according to equation (14), via the additional degrees of freedom or parameters e_j and f_j, makes it possible to influence the feedback mechanism, thus allowing the subjective quality of speech and the residual interference to be changed.
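The parametrization of equation (14) can be sketched as follows, reading e_j as an exponent and f_j as a scale factor. The function names and the example values are illustrative assumptions, and the second helper simply inserts the corrected values H′ into equation (8) in place of H.

```python
def corrected_feedback_gain(gain, f, e):
    """Equation (14): H'(k-j,i) = f_j * H(k-j,i) ** e_j."""
    return f * gain ** e


def corrected_nir_with_eq14(nir, previous_gains, weights, f_values, e_values):
    """Equation (8) evaluated with the corrected gains H' from equation (14)."""
    feedback = sum(
        w * corrected_feedback_gain(h, f, e)
        for w, h, f, e in zip(weights, previous_gains, f_values, e_values)
    )
    return nir / feedback
```

For a preceding gain of 0.25, an exponent e = 0.5 with f = 1 yields H′ = 0.5, so the feedback is weaker than with the uncorrected value; exponents greater than one would strengthen it instead.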
The method for reducing interference in acoustic signals by means of an adaptive filtering method involving spectral subtraction turns out to be particularly advantageous in the above-mentioned specific embodiments when used for reducing interferences in speech signals.