1. Field of the Invention
The present invention relates to a noise reduction method and, more particularly, to a method using spectral subtraction to reduce noise.
2. Description of Related Art
The spectral subtraction method has been proven effective in enhancing speech degraded by additive noise. It is simple to implement and hence suitable as a pre-processing scheme for speech coding and recognition applications. This method subtracts the noise spectrum estimate from the noisy speech spectrum to estimate the speech magnitude spectrum, from which the clean speech signal is then recovered.
FIG. 1 shows the flowchart of the aforementioned spectral subtraction method, wherein the input noisy speech is divided into a plurality of continuous frames, and each frame is represented by an additive noise model:

$$y_r(k) = s_r(k) + w_r(k),$$

where $y_r(k)$, $s_r(k)$ and $w_r(k)$ denote, respectively, the k-th noisy speech, clean speech, and noise sample of the r-th frame. Taking the fast Fourier transform of the noisy speech frame $y_r(k)$ (step S101), the noisy speech spectrum of the r-th frame at the k-th frequency component is obtained and denoted as $|Y_r(k)|^2$. In addition, the noisy speech $y_r(k)$ is also applied in a silence detection process (step S102) and a noise spectrum estimation process (step S103) to estimate a noise spectrum, denoted as $|W_r(k)|^2$. After performing a spectral subtraction process (step S104), the energy spectrum of clean speech is obtained as follows:

$$|\hat{S}_r(k)|^2 = |Y_r(k)|^2 - |W_r(k)|^2. \tag{1}$$
If the phase spectrum of the clean speech can be approximated by the phase spectrum of the noisy speech, the estimate of clean speech $\hat{s}_r(k)$ can be obtained by combining the magnitude derived from $|\hat{S}_r(k)|^2$ with the noisy phase and taking the inverse fast Fourier transform of the result.
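The steps of FIG. 1 can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the noise power estimate is assumed to be given (steps S102 and S103 are not modeled), a naive DFT stands in for the FFT to keep the sketch dependency-free, negative results of formula (1) are clamped to zero, and the function names `dft`, `idft` and `spectral_subtract` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stands in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT; returns the real part of each time sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(noisy_frame, noise_power):
    """Apply formula (1) to one frame and resynthesize with the noisy phase.

    noise_power[k] plays the role of |W_r(k)|^2 and is assumed to be given.
    """
    Y = dft(noisy_frame)
    S_hat = []
    for Yk, Wk2 in zip(Y, noise_power):
        # Formula (1); clamping at zero is an assumption of this sketch.
        power = max(abs(Yk) ** 2 - Wk2, 0.0)
        # Reuse the phase of the noisy spectrum for the clean-speech estimate.
        S_hat.append(cmath.rect(math.sqrt(power), cmath.phase(Yk)))
    return idft(S_hat)
```

With a noise estimate of zero in every bin the function returns the input frame unchanged (up to rounding), and with a noise estimate larger than every bin's power it returns silence.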
Such a method is suitable as a pre-processing scheme for speech coding and recognition applications because it is effective and simple to implement. However, the noise spectrum estimate may cause relatively large spectral excursions in the spectrum estimate of clean speech. These excursions are perceived as time-varying tones, the so-called musical noise.
To reduce the musical noise, Berouti et al. proposed a noise reduction method that over-subtracts the noise spectrum estimate; a description can be found in M. Berouti, R. Schwartz, and J. Makhoul, “Enhancement of speech corrupted by acoustic noise”, pp. 208–211, 1979 IEEE, which is incorporated herein for reference. In that method, formula (1) is modified as:

$$|\hat{S}_r(k)|^2 = |Y_r(k)|^2 - \alpha_r \cdot |W_r(k)|^2, \quad \alpha_r \ge 1, \tag{2}$$

so as to decrease the influence caused by the excursion of the noise spectrum estimate and thus reduce the effect of musical noise. In the method, the over-subtraction factor $\alpha_r$ is determined by the signal-to-noise ratio (SNR) of the frame being processed, and can be expressed by the formula:
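As a small illustration of formula (2), the sketch below applies one over-subtraction factor to the power spectrum of a frame. The function name `over_subtract` is hypothetical, the power values are made up, and clamping negative results to zero is an assumption of this sketch rather than something stated in the text.

```python
def over_subtract(noisy_power, noise_power, alpha):
    """Formula (2): subtract alpha times the noise power estimate in every bin.

    noisy_power[k] stands for |Y_r(k)|^2 and noise_power[k] for |W_r(k)|^2.
    """
    if alpha < 1.0:
        raise ValueError("formula (2) requires alpha >= 1")
    # Clamp at zero so the result remains a valid power spectrum (assumption).
    return [max(y - alpha * w, 0.0) for y, w in zip(noisy_power, noise_power)]


# Made-up power values for three frequency bins.
noisy_power = [10.0, 4.0, 2.0]
noise_power = [1.0, 1.0, 1.0]

plain = over_subtract(noisy_power, noise_power, alpha=1.0)   # same as formula (1)
strong = over_subtract(noisy_power, noise_power, alpha=3.0)  # over-subtraction
```

With `alpha=1.0` the weakest bin keeps a small residual, while with `alpha=3.0` it is removed entirely, which is how over-subtraction suppresses isolated residual peaks.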
$$\alpha_r = \alpha_0 + SNR_r \cdot \frac{1 - \alpha_0}{SNR_1}, \tag{3}$$

where $\alpha_0$ is the pre-selected over-subtraction factor when SNR = 0, $SNR_1$ is the pre-selected SNR value at which $\alpha_r = 1$, and $SNR_r$ is the estimate of the signal-to-noise ratio of the r-th frame being processed. Based on formula (3), $\alpha_r$ decreases linearly as $SNR_r$ increases (for $\alpha_0 > 1$). The smaller $SNR_r$ is, the larger $\alpha_r$ is, and a larger $\alpha_r$ is helpful in removing larger noise spectrum excursions.
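Formula (3) can be evaluated directly, as in the sketch below. The default values `alpha0=4.0` and `snr1_db=20.0` are illustrative assumptions, not values given in the text, and the function name is hypothetical. The lower bound of 1 reflects the constraint stated with formula (2).

```python
def over_subtraction_factor(snr_db, alpha0=4.0, snr1_db=20.0):
    """Formula (3): linear mapping from the frame SNR to the factor alpha_r.

    alpha0 is the factor at SNR = 0 and snr1_db is the SNR at which alpha_r = 1.
    Both defaults are illustrative assumptions.
    """
    alpha = alpha0 + snr_db * (1.0 - alpha0) / snr1_db
    # Enforce alpha_r >= 1, as required by formula (2).
    return max(alpha, 1.0)


low_snr = over_subtraction_factor(0.0)    # noisy frame, strongest over-subtraction
mid_snr = over_subtraction_factor(10.0)   # halfway between alpha0 and 1
high_snr = over_subtraction_factor(20.0)  # clean frame, plain subtraction
```

Under these assumed parameters the factor falls from 4 at 0 dB to 1 at 20 dB and stays at 1 for any higher SNR.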
Examining the human speech spectrum shows that speech energy is distributed non-uniformly and is often concentrated in the lower frequency components. Hence the SNR differs with frequency and often has larger values at lower frequency components. From formula (3), it is known that more suppression is needed for lower SNR and vice versa. High-frequency components thus need more suppression to avoid musical noise, while low-frequency components need less suppression to prevent speech distortion. However, the over-subtraction method based on formulas (2) and (3) applies a single factor to the whole frame, and therefore suffers from too much over-subtraction, and hence speech distortion, at low-frequency components, and too little over-subtraction, and hence musical noise, at high-frequency components. Accordingly, improved schemes have been proposed to avoid such a problem, and one of the schemes can be found in Kuo-Guan Wu and Po-Cheng Chen, “Efficient speech enhancement using spectral subtraction for car hands-free application”, 2001 Digest of technical papers, pp. 220–221, which is incorporated herein for reference. However, that scheme is unable to completely eliminate the problem. Therefore, there is a need for the above conventional noise reduction method to be improved.
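The mismatch described above can be illustrated with made-up numbers. The sketch below compares the SNR of a whole frame with the SNR of its individual frequency bins, assuming speech energy concentrated at low frequencies and flat noise. All values and function names are hypothetical, and the sketch does not represent any of the cited improved schemes.

```python
import math


def snr_db(signal_power, noise_power):
    """SNR in decibels from two power values."""
    return 10.0 * math.log10(signal_power / noise_power)


def frame_and_bin_snr(speech_power, noise_power):
    """Return (frame-level SNR, list of per-bin SNRs), all in dB."""
    frame = snr_db(sum(speech_power), sum(noise_power))
    bins = [snr_db(s, w) for s, w in zip(speech_power, noise_power)]
    return frame, bins


# Hypothetical per-bin powers, ordered from low to high frequency.
speech_power = [100.0, 50.0, 10.0, 1.0]  # energy concentrated at low frequencies
noise_power = [1.0, 1.0, 1.0, 1.0]       # flat noise, for simplicity

frame_snr, bin_snr = frame_and_bin_snr(speech_power, noise_power)
```

Here the frame-level SNR lies between the SNR of the lowest bin and that of the highest bin, so a factor chosen from the frame-level SNR is larger than the lowest bin would need and smaller than the highest bin would need.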