Vehicles are often equipped with various types of devices that produce and receive sound energy. For example, hands-free systems allow vehicle occupants to control vehicular functions by speaking commands into a microphone, with the commands being recognized and executed by one or more control modules in the vehicle. Occupants may also use cellular phones or other types of sound-producing or sound-receiving devices.
Noise removal or suppression is important for clear mobile voice communications and for accurate automatic speech recognition. However, effectively removing ambient noise without introducing distortion to speech has long been a difficult challenge. Over the past few decades, numerous noise suppression (NS) algorithms have been developed, particularly in the category of single-channel noise suppressors. Some of these algorithms are widely used in mobile phones, Bluetooth headsets, hearing aids and hands-free car kits for the purpose of enhancing speech in noisy environments.
These algorithms are sometimes capable of suppressing stationary noise that contaminates speech (e.g., with 15 dB SNR improvement under a static car engine noise condition). However, their performance degrades significantly if the ambient noise changes dynamically over time (e.g., 4 dB SNR improvement in babble noise conditions). One reason for this degradation is that most voice activity detection (VAD) approaches used in these previous algorithms have difficulty separating speech from non-stationary noise (e.g., multi-talker babble noise). Another reason for the degradation is that the estimated noise and the noise actually present are not time aligned. More specifically, noise suppression algorithms typically estimate noise when speech is absent, but freeze noise estimation when speech is present. As a consequence, the noise subtraction/attenuation during speech periods typically depends on an “out-of-date” noise estimate.
Although this asynchronous noise estimation/utilization process is sometimes acceptable when the ambient noise is stationary, it is overly simplistic and unsuitable for canceling non-stationary noises, such as transient traffic noise or babble noise. In these latter cases, outdated information is used and noise removal is not effective or acceptable. The absence of effective noise removal produces audio quality that is unacceptable for many users.
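The freeze-during-speech behavior described above can be illustrated with a minimal sketch. It assumes a per-frame power value and a per-frame speech flag supplied by some VAD; the function name `track_noise`, the smoothing factor `alpha`, and the synthetic numbers are hypothetical choices made for illustration and are not taken from any particular noise suppressor.

```python
def track_noise(frame_powers, speech_flags, alpha=0.9):
    """Return a per-frame noise power estimate.

    The estimate is recursively smoothed while speech is absent and is
    held (frozen) while speech is present, as in the conventional
    approach described in the text. All names here are illustrative.
    """
    estimate = frame_powers[0]
    history = []
    for power, speech in zip(frame_powers, speech_flags):
        if not speech:
            # Speech absent: update the running noise estimate.
            estimate = alpha * estimate + (1.0 - alpha) * power
        # Speech present: the estimate is left unchanged (frozen).
        history.append(estimate)
    return history


if __name__ == "__main__":
    # Synthetic scenario: the true noise power jumps from 1.0 to 4.0
    # at the same moment speech starts (non-stationary noise).
    true_noise = [1.0] * 5 + [4.0] * 5
    speech = [False] * 5 + [True] * 5
    # Observed power is noise plus an assumed speech power of 10.0.
    observed = [n + (10.0 if s else 0.0) for n, s in zip(true_noise, speech)]

    estimates = track_noise(observed, speech)
    for frame, (n, e) in enumerate(zip(true_noise, estimates)):
        print(f"frame {frame}: true noise {n:.2f}, estimate {e:.2f}")
```

In this synthetic scenario the estimate stays at its pre-speech level of about 1.0 for the whole speech period even though the true noise power has risen to 4.0, so any subtraction or attenuation based on that estimate would under-remove the noise.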
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions and/or relative positioning of some of the elements in the figures may be exaggerated relative to other elements to help improve understanding of various embodiments of the present invention. Also, common but well-understood elements that are useful or necessary in a commercially feasible embodiment are often not depicted in order to facilitate a less obstructed view of these various embodiments of the present invention. It will further be appreciated that certain actions and/or steps may be described or depicted in a particular order of occurrence, while those skilled in the art will understand that such specificity with respect to sequence is not actually required. It will also be understood that the terms and expressions used herein have the ordinary meaning as is accorded to such terms and expressions with respect to their corresponding respective areas of inquiry and study, except where specific meanings have otherwise been set forth herein.