1. Field of the Invention
The present invention is generally in the field of speech coding. In particular, the present invention is related to noise suppression.
2. Background Art
Noise reduction has become the subject of many research projects in various technical fields. In recent years, due to the tremendous demand and growth in the areas of digital telephony using the Internet and cellular telephones, there has been an intense focus on the quality of audio signals, especially the reduction of noise in speech signals. The goal of an ideal noise suppression system or method is to reduce the noise level without distorting the speech signal and, in effect, to reduce the stress on the listener and increase the intelligibility of the speech signal.
Common existing methods of noise suppression are based on spectral subtraction techniques, which are performed in the frequency domain using well-known Fourier transform algorithms. The Fourier transform provides a transformation from the time domain to the frequency domain, while the inverse Fourier transform provides a transformation from the frequency domain back to the time domain. Although spectral subtraction is commonly used due to its relative simplicity and ease of implementation, complex operations are still required. In addition, the overlap and add operations, which are used in the spectral subtraction techniques, often cause undesirable delays.
FIG. 1 illustrates an overview of a traditional spectral subtraction process, wherein operations to the left of dashed line 105 are performed in the time domain and operations to the right of dashed line 105 are performed in the frequency domain. By way of background, an observed speech signal (or noisy speech signal) comprises a clean speech signal and an additive noise signal, wherein the additive noise signal is independent of the clean speech signal.
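By way of illustration only, the additive model described above can be written out as follows. The symbols s(n) for the clean speech signal and d(n) for the additive noise signal, and their transforms S, D and Y, are assumed notation introduced for this sketch and do not appear in FIG. 1. The second relation further assumes that the noise is zero-mean, so that independence makes the cross term vanish in expectation.

```latex
% Additive model: observed speech = clean speech + noise (s, d are assumed symbols)
y(n) = s(n) + d(n)

% Power spectrum of the observed signal at frequency \omega
|Y(\omega)|^2 = |S(\omega)|^2 + |D(\omega)|^2
              + 2\,\mathrm{Re}\{ S(\omega)\, D^{*}(\omega) \}

% With zero-mean noise independent of the speech, the cross term averages to zero
E\{|Y(\omega)|^2\} = E\{|S(\omega)|^2\} + E\{|D(\omega)|^2\}
```

Under these assumptions, subtracting an estimate of the noise power spectrum from the observed power spectrum yields an estimate of the clean speech power spectrum, which is the basis of the spectral subtraction process.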
FIG. 1 shows observed speech signal y(n) 102, where “n” is a time index. As shown, Fourier transform module 112 receives observed speech signal y(n) 102 and computes power spectrum Py 113, as the magnitude squared of the Fourier transform. At estimate of noise spectrum module 114, estimated noise spectrum Pn 115 is approximated, typically from a window of signal in which no speech is present. Next, spectral subtraction module 116 receives and subtracts estimated noise spectrum Pn 115 from power spectrum Py 113 of observed speech signal y(n) 102 to produce an estimate of clean speech spectrum Px 117. The estimate of clean speech spectrum Px 117 is then combined with phase information 118 obtained from observed speech signal y(n) 102 to yield an estimate of the Fourier transform of a clean speech signal. Finally, inverse Fourier transform module 120 along with overlap and add module 122 construct estimated clean speech signal x(n) 124 in the time domain.
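The processing chain of FIG. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any particular system: the function names, the frame length, the 50% overlap, the Hann analysis window, and the flooring of negative power values at zero are all choices made for this sketch and are not taken from FIG. 1. A naive discrete Fourier transform is used so that the example depends only on the standard library.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform (time domain -> frequency domain)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of the time-domain frame."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def hann(n):
    """Periodic Hann window; copies shifted by n/2 samples sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def estimate_noise_power(noise_only, frame_len):
    """Average power spectrum (Pn) over frames of a segment assumed speech-free."""
    win, hop = hann(frame_len), frame_len // 2
    acc, count = [0.0] * frame_len, 0
    for start in range(0, len(noise_only) - frame_len + 1, hop):
        spec = dft([noise_only[start + i] * win[i] for i in range(frame_len)])
        for k in range(frame_len):
            acc[k] += abs(spec[k]) ** 2
        count += 1
    return [a / count for a in acc]


def spectral_subtract(y, noise_power, frame_len):
    """Return an estimate of the clean signal from observed signal y."""
    win, hop = hann(frame_len), frame_len // 2
    out = [0.0] * len(y)
    for start in range(0, len(y) - frame_len + 1, hop):
        spec = dft([y[start + i] * win[i] for i in range(frame_len)])
        cleaned = []
        for k, y_k in enumerate(spec):
            # Px = Py - Pn, floored at zero (the flooring is an assumption of this sketch)
            p_x = max(abs(y_k) ** 2 - noise_power[k], 0.0)
            # Recombine the estimated magnitude with the phase of the noisy signal
            cleaned.append(cmath.rect(math.sqrt(p_x), cmath.phase(y_k)))
        frame = idft(cleaned)
        for i in range(frame_len):
            out[start + i] += frame[i]  # overlap and add
    return out
```

In this sketch each output sample is available only after the frames covering it have been transformed, modified, and inverse transformed, which illustrates the buffering delay attributed to the overlap and add operations. The first and last half frames are covered by a single window and are therefore attenuated.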
In applying the inverse Fourier transform, it is assumed that phase information 118 is not critical, such that only the magnitude of the spectrum is estimated from observed speech signal y(n) 102, and the phase of the enhanced signal is assumed to be equal to the phase of the noisy signal. Although this approximation may work well in applications with high signal-to-noise ratios (SNRs), e.g., greater than 10 dB, it can result in significant errors at low SNRs.
The spectral subtraction method of noise suppression involves complex operations in the form of Fourier transformations between the time domain and frequency domain. These transformations have been known to cause processing delays and consume a significant portion of the processing power.
Thus, there is an intense need in the art for low-complexity noise suppression systems and methods that can substantially reduce the processing delay and processing power associated with traditional noise suppression systems and methods.