Attenuation of noise in audio signals is desirable in many applications to further enhance or emphasize a desired signal component. For example, enhancement of speech in the presence of background noise has attracted much interest due to its practical relevance. A particularly challenging application is single-microphone noise reduction in mobile telephony. The low cost of a single-microphone device makes it attractive in emerging markets. On the other hand, the absence of multiple microphones precludes beam-former-based solutions to suppress the high levels of noise that may be present. A single-microphone approach that works well under non-stationary conditions is thus commercially desirable.
Single-microphone noise attenuation algorithms are also relevant in multi-microphone applications where audio beam-forming is not practical or preferred, or in addition to such beam-forming. For example, such algorithms may be useful for hands-free audio and video conferencing systems in reverberant and diffuse non-stationary noise fields or where there are a number of interfering sources present. Spatial filtering techniques such as beam-forming can only achieve limited success in such scenarios and additional noise suppression needs to be performed on the output of the beam-former in a post-processing step.
Various noise attenuation algorithms have been proposed, including systems which are based on knowledge or assumptions about the characteristics of the desired signal component. In particular, knowledge-based speech enhancement methods such as codebook-driven schemes have been shown to perform well under non-stationary noise conditions, even when operating on a single microphone signal. Examples of such methods are presented in: S. Srinivasan, J. Samuelsson, and W. B. Kleijn, “Codebook driven short-term predictor parameter estimation for speech enhancement,” IEEE Trans. Speech, Audio and Language Processing, vol. 14, no. 1, pp. 163-176, January 2006; and S. Srinivasan, J. Samuelsson, and W. B. Kleijn, “Codebook based Bayesian speech enhancement for non-stationary environments,” IEEE Trans. Speech Audio Processing, vol. 15, no. 2, pp. 441-452, February 2007.
These methods rely on trained codebooks of speech and noise spectral shapes which are parameterized by, e.g., linear predictive (LP) coefficients. The use of a speech codebook is intuitive and lends itself readily to a practical implementation. The speech codebook can either be speaker independent (trained using data from several speakers) or speaker dependent. The latter case is useful for, e.g., mobile phone applications, as these tend to be personal and are often predominantly used by a single speaker. The use of noise codebooks in a practical implementation, however, is challenging due to the variety of noise types that may be encountered in practice. As a result, a very large noise codebook is typically used.
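As an illustration of how a codebook entry of LP coefficients can describe a spectral shape, the following minimal sketch evaluates the standard all-pole envelope gain / |A(e^{jω})|^2 on a frequency grid from 0 to π. The function name lp_spectral_envelope, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the example coefficient are assumptions made for illustration and are not taken from the cited methods.

```python
import cmath
import math


def lp_spectral_envelope(lp_coeffs, num_bins, gain=1.0):
    """Evaluate the all-pole power spectrum gain / |A(e^{jw})|^2.

    Assumed convention: A(z) = 1 + a_1 z^-1 + ... + a_p z^-p,
    with lp_coeffs = [a_1, ..., a_p]. Frequencies are spaced
    uniformly from 0 to pi (inclusive), so num_bins must be >= 2.
    """
    if num_bins < 2:
        raise ValueError("num_bins must be at least 2")
    envelope = []
    for b in range(num_bins):
        omega = math.pi * b / (num_bins - 1)
        # A(e^{jw}) = 1 + sum_k a_k * e^{-j w k}
        a_of_omega = 1 + sum(
            coeff * cmath.exp(-1j * omega * (k + 1))
            for k, coeff in enumerate(lp_coeffs)
        )
        envelope.append(gain / abs(a_of_omega) ** 2)
    return envelope


# Hypothetical first-order entry: a single pole near z = 0.9 gives a
# low-pass shape, i.e. far more energy at low than at high frequencies.
shape = lp_spectral_envelope([-0.9], num_bins=5)
print(shape)
```

In this sketch, a codebook would simply be a list of such coefficient vectors, each standing for one typical spectral shape.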
Typically, such codebook-based algorithms seek to find the speech codebook entry and the noise codebook entry that, when combined, most closely match the captured signal. When the appropriate codebook entries have been found, the algorithms compensate the received signal based on the codebook entries. However, in order to identify the appropriate codebook entries, a search is performed over all possible combinations of the speech codebook entries and the noise codebook entries. This results in a computationally very demanding process that is often impractical, especially for low-complexity devices. Furthermore, the large noise codebooks are cumbersome to generate and store, and the large number of possible noise candidates may increase the risk of an erroneous estimate, resulting in suboptimal noise attenuation.
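The joint search described above can be sketched as follows. This is a simplified illustration and not the procedure of the cited methods: it assumes that codebook entries are stored directly as power spectra, that speech and noise add in the power domain with unit gains, that the match is scored with a squared log-spectral distance, and that the compensation is a Wiener-style gain per frequency bin. All function names and codebook values are hypothetical.

```python
import math
from itertools import product


def log_spectral_distance(observed, model):
    """Mean squared difference between log power spectra."""
    return sum(
        (math.log(o) - math.log(m)) ** 2 for o, m in zip(observed, model)
    ) / len(observed)


def exhaustive_codebook_search(observed, speech_codebook, noise_codebook):
    """Try every (speech, noise) pair and keep the closest match.

    Returns (speech_index, noise_index, evaluations). The number of
    evaluations equals len(speech_codebook) * len(noise_codebook),
    which is what makes large noise codebooks costly.
    """
    best = None
    evaluations = 0
    pairs = product(enumerate(speech_codebook), enumerate(noise_codebook))
    for (i, speech), (j, noise) in pairs:
        # Assumption: noisy spectrum modelled as speech + noise power.
        model = [s + n for s, n in zip(speech, noise)]
        distance = log_spectral_distance(observed, model)
        evaluations += 1
        if best is None or distance < best[0]:
            best = (distance, i, j)
    return best[1], best[2], evaluations


def wiener_gains(speech, noise):
    """Per-bin attenuation gain speech / (speech + noise)."""
    return [s / (s + n) for s, n in zip(speech, noise)]


# Tiny hypothetical codebooks with four frequency bins per entry.
speech_codebook = [[4.0, 1.0, 0.5, 0.25], [0.5, 1.0, 4.0, 1.0]]
noise_codebook = [
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 1.0, 0.5, 0.25],
    [0.25, 0.5, 1.0, 2.0],
]

# Observed spectrum built from speech entry 1 plus noise entry 2.
observed = [s + n for s, n in zip(speech_codebook[1], noise_codebook[2])]

i, j, evaluations = exhaustive_codebook_search(
    observed, speech_codebook, noise_codebook
)
print(i, j, evaluations)  # 1 2 6
print(wiener_gains(speech_codebook[i], noise_codebook[j]))
```

Even in this toy case every one of the 2 x 3 = 6 combinations is evaluated; with a very large noise codebook the same product grows accordingly, which illustrates the complexity concern raised above.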
Hence, an improved noise attenuation approach would be advantageous and in particular an approach allowing increased flexibility, reduced computational requirements, facilitated implementation and/or operation, reduced cost and/or improved performance would be advantageous.