1. Field of the Invention
The present invention generally relates to systems and methods for improving the perceptual quality of audio signals, such as speech signals transmitted between audio terminals in a telephony system.
2. Background
In a telephony system, an audio signal representing the voice of a speaker (also referred to as a speech signal) may be corrupted by acoustic noise present in the environment surrounding the speaker as well as by certain system-introduced noise, such as noise introduced by quantization and channel interference. If no attempt is made to mitigate the impact of the noise, the corruption of the speech signal will result in a degradation of the perceived quality and intelligibility of the speech signal when played back to a far-end listener. The corruption of the speech signal may also adversely impact the performance of speech processing algorithms used by the telephony system, such as speech coding and recognition algorithms.
Mobile audio terminals, such as Bluetooth™ headsets and cellular telephone handsets, are often used in outdoor environments that expose such terminals to a variety of noise sources including wind-induced noise on the microphones embedded in the audio terminals (referred to generally herein as “wind noise”). As described by Bradley et al. in “The Mechanisms Creating Wind Noise in Microphones,” Audio Engineering Society (AES) 114th Convention, Amsterdam, the Netherlands, Mar. 22-25, 2003, pp. 1-9, wind-induced noise on a microphone has been shown to consist of two components: (1) flow turbulence that includes vortices and fluctuations occurring naturally in the wind and (2) turbulence generated by the interaction of the wind and the microphone.
As also discussed by Bradley et al. in the aforementioned paper, wind noise is a more significant problem for handheld devices with embedded microphones, such as handheld cellular telephones, than for free-standing microphones. This is due, in part, to the fact that these handheld devices are larger than free-standing microphones, such that their interaction with the wind is likely to be more significant. This is also due, in part, to the fact that the proximity of a human hand, arm or head to such handheld devices may generate additional turbulence. This latter fact is also an issue for headsets used in telephony systems.
Generally speaking, wind noise is bursty in nature, with gusts lasting from a few milliseconds to a few hundred milliseconds. Because wind noise is impulsive and has a high amplitude that may exceed the nominal amplitude of a speech signal, the presence of such noise will degrade the perceptual quality and intelligibility of a speech signal in a manner that may annoy a far-end listener and lead to listener fatigue. Furthermore, because wind noise is non-stationary in nature, it is typically not attenuated by the algorithms conventionally used in telephony systems to reduce or suppress acoustic noise or system-introduced noise, which generally assume that the noise is stationary or only slowly varying. Consequently, special methods for detecting and suppressing wind noise are required.
Currently, the most effective schemes for reducing wind noise are those that use two or more microphones. Because the propagation speed of wind is much slower than that of sound waves, wind noise can be detected by correlating the signals received by the multiple microphones. In contrast, noise suppression algorithms that must rely on only a single microphone often confuse wind noise with speech. This is due, in part, to the fact that wind noise has a high energy relative to background noise, and thus presents a high signal-to-noise ratio (SNR). This is also due, in part, to the fact that wind noise is non-stationary and has a short duration in time, and thus resembles short speech segments.
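By way of illustration only, the multi-microphone correlation idea can be sketched as follows. This is a minimal sketch under stated assumptions, not a scheme taken from any cited reference: it assumes two closely spaced microphones, so that acoustic sound arrives at both nearly in phase, and it assumes that wind-induced noise is largely uncorrelated between them. The function names, the zero-lag correlation measure and the 0.5 threshold are hypothetical choices made for the example.

```python
import math
import random


def normalized_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length frames."""
    if len(a) != len(b):
        raise ValueError("frames must have equal length")
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    # Correlation is undefined for a silent frame; report 0.0 in that case.
    return num / den if den > 0.0 else 0.0


def wind_suspected(mic1, mic2, threshold=0.5):
    """Hypothetical rule: low inter-microphone correlation suggests wind.

    The threshold is an assumed value chosen only for illustration.
    """
    return normalized_correlation(mic1, mic2) < threshold


# Synthetic frames (illustrative data, not measurements).
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
gust_a = [rng.gauss(0.0, 1.0) for _ in range(400)]
gust_b = [rng.gauss(0.0, 1.0) for _ in range(400)]

# The same tone on both microphones stands in for an acoustic source.
speech_like_flag = wind_suspected(tone, tone)
# Independent noise on each microphone stands in for wind turbulence.
wind_like_flag = wind_suspected(gust_a, gust_b)
```

A practical detector would presumably also gate on frame energy, since a silent frame yields a correlation of zero here and would otherwise be flagged.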
Some wind noise reduction schemes do exist for audio devices having only a single microphone. For example, it is known that a fixed high-pass filter can be used to remove some portion of the low-frequency wind noise at all times. As another example, Published U.S. Patent Application No. 2007/0030989 to Kates, entitled “Hearing Aid with Suppression of Wind Noise” and filed on Aug. 1, 2006, describes a simple detector/attenuator that makes use of a single spectral characteristic of an audio signal—namely, the ratio of the low frequency energy of the audio signal to the total energy of the audio signal—to detect wind noise. However, these simple approaches are only effective for suppressing wind noise due to very low speed wind and are generally ineffective at suppressing wind noise due to moderate to high speed wind.
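By way of illustration only, a detector driven by a single spectral characteristic, namely the ratio of low-frequency energy to total energy, can be sketched as follows. This is a minimal sketch of the general idea and is not the implementation described in the cited application: the 200 Hz cutoff, the 0.8 threshold, the function names and the use of a naive discrete Fourier transform are all assumptions made for the example.

```python
import cmath
import math


def low_band_energy_ratio(frame, sample_rate, cutoff_hz=200.0):
    """Return the fraction of the frame's energy that lies below cutoff_hz."""
    n = len(frame)
    low = 0.0
    total = 0.0
    # Naive DFT over the non-negative frequency bins (adequate for short frames).
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        # Bins other than DC and Nyquist also represent a mirrored negative bin.
        weight = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        power = weight * abs(coeff) ** 2
        total += power
        if k * sample_rate / n < cutoff_hz:
            low += power
    return low / total if total > 0.0 else 0.0


def looks_like_wind(frame, sample_rate, cutoff_hz=200.0, threshold=0.8):
    """Hypothetical rule: flag a frame dominated by low-frequency energy."""
    return low_band_energy_ratio(frame, sample_rate, cutoff_hz) > threshold
```

Because the decision rests on one ratio, any signal whose energy is concentrated at low frequencies would be flagged, and wind noise that extends above the assumed cutoff would be missed.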
Single-microphone wind noise reduction methods based on advanced digital signal processing (DSP) techniques also exist. For example, one such method is described by Schmidt et al. in “Wind Noise Reduction Using Non-Negative Sparse Coding,” IEEE International Workshop on Machine Learning for Signal Processing, 2007. However, these methods are computationally very complex and are not yet mature enough to be deemed effective.
What is needed, then, is a technique for effectively detecting and reducing non-stationary noise, such as wind noise, present in an audio signal received or recorded by a single microphone. When the audio signal is a speech signal received by a handset, headset, or other type of audio terminal in a telephony system, the desired technique should improve the perceived quality and intelligibility of the speech signal corrupted by the non-stationary noise. The desired technique should be effective at suppressing non-stationary noise due to low, moderate and high speed wind. The desired technique should also be of reasonable computational complexity, such that it can be efficiently and inexpensively integrated into a variety of audio device types.