Discriminative training has been a prominent theme in recent speech recognition research and system development. The essence of discriminative training algorithms (for example, minimum classification error (MCE) training algorithms) is the adoption of various cost functions that are directly or indirectly related to the empirical error rate found in the training data. These cost functions serve as objective functions for optimization. The related empirical error rate may be calculated at a sentence string level, at a super-string level, or at a sub-string level, e.g., at a word/phone token level.
For example, research has found that when the empirical training error rate of a classifier or recognizer is optimized, only a biased estimate of the true error rate is obtained. The size of this bias depends on the complexity of the recognizer and the task (as quantified by the Vapnik-Chervonenkis (VC) dimension). Analysis and experimental results have shown that this bias can be quite substantial even for a simple Hidden Markov Model (HMM) recognizer applied to a simple single digit recognition task. Another key insight from machine learning research is that one effective way to reduce this bias and improve generalization performance is to increase “margins” in the training data; that is, to have correct samples classified well away from the decision boundary. Thus, it is desirable to use large margins to achieve lower test errors, even if this results in higher empirical errors in training. Most previous approaches to discriminative learning in speech recognition have focused on the empirical error rate. Recently, one approach that focuses on margins has shown some positive results when utilized for small automatic speech recognition tasks. However, similar success has not been demonstrated in connection with large-scale speech recognition.
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.