Speech recognition systems that employ hidden Markov models (HMMs) are well known in the art. HMMs are generally trained on a training database of speech, which fixes the parameters of the HMMs based on statistical information within the training data. During recognition, the HMMs do not change depending on the characteristics of the incoming speech. Thus, while the HMMs may be able to explain the typical expected environment, they may not be able to describe the encountered environment well.
Differences between the expected and the encountered environment may result from variabilities encountered in the speech signal. Such variabilities may be caused by any combination of background noise or interference, channel or handset noise, filtering characteristics, and even effects due to speaker differences such as dialect.
To compensate for the variabilities encountered in a speech signal, many speech recognition systems rely on mixture HMMs, in which each state of the model has a corresponding probability distribution defined by a mixture of a large number of distributions to account for the variabilities. These models have a large number of parameters and become unwieldy in many practical real-time applications.
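By way of illustration only, the following sketch shows how a mixture-defined state distribution may be evaluated and how the number of stored parameters grows with the size of the mixture. The text above does not specify the form of the component distributions; diagonal-covariance Gaussian components are assumed here, and the function names and the sizes used in the example are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x.

    Assumption: Gaussian components; the source does not name the family.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def state_log_likelihood(x, weights, means, variances):
    """Log of sum_k w_k * N(x; mean_k, var_k) for a single HMM state."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp: subtract the largest term before exponentiating
    # so that very small likelihoods do not underflow.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def parameter_count(num_states, num_components, dim):
    """Stored emission parameters: each component keeps a mean vector,
    a variance vector, and one mixture weight."""
    return num_states * num_components * (2 * dim + 1)


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the growth in parameters.
    print(parameter_count(num_states=3000, num_components=16, dim=39))
```

With the hypothetical sizes shown, the count is 3,792,000 stored values for the state distributions alone, and each incoming frame must be scored against every component of every active state, which suggests why such models may become unwieldy in real-time use.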