Improving the accuracy and performance of automatic speech recognition has been an active area of research. Approaches have included modeling human speech using methods such as hidden Markov models and hidden trajectory models. For example, a statistical hidden trajectory model may use temporal filtering of hidden vocal tract resonance targets to estimate a hidden trajectory for a vocal tract resonance. The targets used in the hidden trajectory model are described as being stochastic with a phoneme-dependent probability distribution. Thus, each phoneme has a mean target and a target variance. In the past, the mean target and the target variance have been determined using a vocal tract resonance tracker.
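As a rough illustration of what temporal filtering of targets can look like, the following sketch smooths a frame-by-frame sequence of resonance targets into a continuous trajectory. The symmetric exponential kernel, the function name `filter_targets`, and the parameters `gamma` and `half_width` are assumptions made for this example only; the discussion above does not specify the filter.

```python
def filter_targets(targets, gamma=0.6, half_width=3):
    """Smooth a per-frame target sequence into a trajectory.

    Assumed filter (illustrative only): a symmetric kernel with weights
    gamma ** |k - tau| over a window of +/- half_width frames, normalized
    so that the weights inside the window sum to one.
    """
    n = len(targets)
    trajectory = []
    for k in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        # Clip the window at the utterance boundaries.
        for tau in range(max(0, k - half_width), min(n, k + half_width + 1)):
            weight = gamma ** abs(k - tau)
            weighted_sum += weight * targets[tau]
            weight_total += weight
        trajectory.append(weighted_sum / weight_total)
    return trajectory
```

Under these assumptions, an abrupt change in target at a phoneme boundary becomes a gradual transition in the trajectory, while frames far from any boundary stay at their target value.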
Using the tracker, hidden trajectory values for individual phonemes are collected, and the statistical distribution of the vocal tract resonances is used to identify the means and variances for the targets. The tracker is prone to errors, however, and errors in the vocal tract resonances it identifies are propagated into the target distributions. As a result, the target distributions are incorrect, which sometimes results in undesirable performance of the hidden trajectory model. In addition, the acoustic features used in the past, such as cepstra, are static and therefore lose some useful information that “dynamic,” or temporal-differential, features can provide.
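The tracker-based estimation described above can be sketched as follows, to make concrete how a tracking error ends up in the target distribution. This is a minimal sketch under stated assumptions: the input format (pairs of phoneme label and tracked resonance value) and the function name `estimate_target_stats` are hypothetical, and the population variance is used only for simplicity.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def estimate_target_stats(frames):
    """Return {phoneme: (mean, variance)} from tracker output.

    `frames` is assumed to be an iterable of (phoneme_label, value) pairs,
    where `value` is the resonance the tracker reported for that frame.
    """
    values_by_phoneme = defaultdict(list)
    for phoneme, value in frames:
        values_by_phoneme[phoneme].append(value)
    # Every tracked value, correct or not, contributes to the statistics.
    return {
        phoneme: (fmean(values), pvariance(values))
        for phoneme, values in values_by_phoneme.items()
    }
```

Because every tracked value enters the mean and variance directly, a single mistracked frame for a phoneme shifts that phoneme's mean target and inflates its target variance.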
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.