1. Field of the Invention
The invention relates generally to acoustic models for automatic speech recognition and, more particularly, to a method, system and computer media for recognizing explicit distinctions between pre-vocalic and post-vocalic consonants.
2. Introduction
Automatic speech recognition (ASR) with a computer presents a difficult problem because of the complexity of human language. An ASR system attempts to map acoustic signals to a string of words in order to gain some type of understanding of an uttered sentence or command. The ASR system faces difficulty because humans use more than their ears when listening. Humans use knowledge about the speaker and the subject, grammar and diction, and also redundancy to predict words during a conversation.
The main goal in acoustic modeling for speech recognition is to determine which conditions modify the acoustic realization of phonemes in the same way, and to define those conditions as phoneme classes that share acoustic models. However, the systematic difference between pre-vocalic and post-vocalic consonants creates variability and confusability that is not accounted for by existing ASR techniques. Accordingly, what is needed in the art is an improved way to process speech that takes these issues into consideration.