Various online services use computer applications to perform automatic speech recognition (“ASR”) when carrying out voice-activated functions initiated from a user's computer, such as the processing of information queries. However, the accuracy of the ASR is limited by the acoustic conditions of the environment in which the speech is captured. For example, ambient noise caused by background speakers or automobiles may interfere with or distort user commands spoken into a microphone for transmission to the online service.
Previous solutions for addressing distortions in ASR have been directed to a model-domain approach that jointly compensates for additive and convolutive distortions (“JAC”) in speech. In these solutions, a computer-based algorithm uses a parsimonious nonlinear physical model to describe the environmental distortion and applies a vector Taylor series (“VTS”) approximation to derive closed-form formulas for hidden Markov model (“HMM”) adaptation and for noise and channel parameter estimation. A drawback of the JAC-VTS model adaptation technique, however, is that the same approximated linear mapping between clean and distorted speech model parameters is shared across the entire model space, even though the true mapping is nonlinear. It is with respect to these considerations and others that the various embodiments of the present invention have been made.
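The drawback noted above for JAC-VTS may be illustrated with a small numerical sketch. The text does not state the distortion model explicitly, so the example assumes the form commonly used in the VTS literature, reduced to a single log-spectral channel: y = x + h + log(1 + exp(n - x - h)), where x is clean speech, n is additive noise, and h is the convolutive channel. The cepstral-domain formulation normally includes a DCT matrix, which is omitted here. The function names `distort`, `vts_linearize`, and `linear_error` are hypothetical and are introduced only for illustration.

```python
import math


def distort(x, n, h=0.0):
    """Assumed log-spectral distortion model (single channel, no DCT):
    y = x + h + log(1 + exp(n - x - h))."""
    return x + h + math.log1p(math.exp(n - x - h))


def vts_linearize(x0, n, h=0.0):
    """First-order Taylor expansion of distort() around the point x0.

    Returns (y0, g) such that y is approximated by y0 + g * (x - x0).
    """
    y0 = distort(x0, n, h)
    # Derivative of distort() with respect to x, evaluated at x0.
    g = 1.0 / (1.0 + math.exp(n - x0 - h))
    return y0, g


def linear_error(x, x0, n, h=0.0):
    """Absolute gap between the nonlinear model and its linearization at x0."""
    y0, g = vts_linearize(x0, n, h)
    return abs(distort(x, n, h) - (y0 + g * (x - x0)))


if __name__ == "__main__":
    # One linear mapping (expanded at x0 = 0) reused at points farther away.
    for x in (0.0, 1.0, 2.0, 4.0):
        print(f"x={x:4.1f}  error={linear_error(x, x0=0.0, n=0.0):.4f}")
```

Under these assumptions the linearization is exact at the expansion point and its error grows as the clean-speech value moves away from that point, which is the sense in which a single linear mapping shared across the whole model space departs from the true nonlinear mapping.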