The current discussion is directed toward speech recognition. More particularly, the current discussion is directed toward compensating for the effects of additive and convolutive distortions in speech recognition systems.
Building high-performance speech recognition systems that are robust to environmental conditions is an ongoing challenge. One issue that affects the robustness of speech recognition systems is the existence of many types of distortion, including additive and convolutive distortions and mixtures of the two, which are difficult to predict during the development of speech recognizers. As a result, speech recognition systems are usually trained using clean speech and often suffer significant performance degradation when used in noisy environments unless compensation is applied.
Different compensation methodologies have been proposed in the past to achieve environmental robustness in speech recognition. In one methodology, distorted speech features are enhanced with advanced signal processing methods. Examples of such processing methods include the European Telecommunications Standards Institute (ETSI) advanced front end (AFE) and stereo-based piecewise linear compensation for environments (SPLICE). In another methodology, a speech recognizer adapts or adjusts its model parameters so that the model better matches the actual, distorted environment. Examples of such a model-based approach include parallel model combination (PMC) and joint compensation of additive and convolutive distortions (JAC). Using an expectation-maximization (EM) method, JAC directly estimates the noise and channel distortion parameters in the log-spectral domain, adjusts the acoustic hidden Markov model (HMM) parameters in the same log-spectral domain, and then converts the parameters to the cepstral domain. However, the JAC approaches provide no strategy for HMM variance adaptation, and their techniques for estimating the distortion parameters involve a number of unnecessary approximations.
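By way of illustration only, the mean adjustment performed by a model-based approach of this general kind can be sketched as follows. The sketch assumes the commonly used log-spectral distortion relation y = x + h + log(1 + exp(n - x - h)), where x, n, and h are the clean-speech, additive-noise, and channel terms for one filter-bank channel. The function names, the unnormalized DCT-II used for the cepstral conversion, and the sample values are hypothetical and are not taken from any particular JAC implementation. The EM estimation of n and h is omitted, and only means are adjusted.

```python
import math


def distort_log_spectral_mean(mu_x, mu_n, mu_h):
    """Apply the assumed relation y = x + h + log(1 + exp(n - x - h)) per channel.

    mu_x, mu_n, mu_h are equal-length lists of log-spectral values for the
    clean-speech mean, the additive noise, and the convolutive channel.
    """
    adjusted = []
    for x, n, h in zip(mu_x, mu_n, mu_h):
        d = n - x - h
        # Numerically stable log(1 + exp(d)), valid for large positive or negative d.
        softplus = max(d, 0.0) + math.log1p(math.exp(-abs(d)))
        adjusted.append(x + h + softplus)
    return adjusted


def log_spectrum_to_cepstrum(log_spec, num_ceps):
    """Convert a log-spectral vector to cepstra with an unnormalized DCT-II."""
    num_chans = len(log_spec)
    return [
        sum(
            value * math.cos(math.pi * k * (m + 0.5) / num_chans)
            for m, value in enumerate(log_spec)
        )
        for k in range(num_ceps)
    ]


def adapt_mean(mu_x, mu_n, mu_h, num_ceps):
    """Adjust a mean in the log-spectral domain, then convert it to the cepstral domain."""
    return log_spectrum_to_cepstrum(
        distort_log_spectral_mean(mu_x, mu_n, mu_h), num_ceps
    )


if __name__ == "__main__":
    # Hypothetical four-channel example: flat clean mean, moderate noise, small channel offset.
    clean = [1.0, 1.0, 1.0, 1.0]
    noise = [0.0, 0.0, 0.0, 0.0]
    channel = [0.2, 0.2, 0.2, 0.2]
    print(adapt_mean(clean, noise, channel, num_ceps=3))
```

When the noise term is far below the speech term, the adjusted mean reduces to x + h, so that only the channel offset remains. When the noise term equals x + h, the adjusted mean is raised by log 2.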
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.