In recent years, there has been much interest in two areas of adaptation, feature-space transformation and model-space transformation, as ways to reduce the speech recognition errors caused by acoustic mismatches between training and testing conditions.
Research and experiments have shown that the model-space approach can yield some improvement over the feature-space approach. One model-space approach, which uses a formulation of the trended HMM (also known as the trajectory-based HMM or nonstationary-state HMM), has been used successfully in automatic speech recognition applications for the past few years; see L. Deng et al., "Speech Recognition using hidden Markov models with polynomial regression functions as nonstationary states," IEEE Transactions on Speech and Audio Processing, Vol. 2, No. 4, pp. 507-520, 1994. More recently, a minimum classification error (MCE) training procedure has been developed for the trended HMM to improve on the discriminating ability of the maximum-likelihood (ML) criterion; see R. Chengalvarayan and L. Deng, "The trended HMM with discriminative training for phonetic classification," Proceedings ICSLP, Vol. 2, pp. 1049-1052, 1996. This MCE training approach aims to minimize the recognition error rate on the training data directly by taking competing models into account, and it has recently been used in speaker adaptation applications.
The model-space approaches for the trended HMM presented above have proven advantageous when applied to speech recognition alone. However, since speech recognition has not yet reached perfection, there is still room for improvement and there are further advantages to pursue.