1. Field of the Invention
The present invention relates to speech recognition and, more specifically, to performing speech recognition when portions of speech are missing.
2. Introduction
Speech recognition systems must accept user input from an increasing variety of sources, such as cellular phones and Voice over IP (VoIP) phones. Communication networks for such systems are typically packet-switched, meaning that packets representing portions of speech are occasionally lost. These short, missing segments of speech hinder the accuracy of speech recognition engines, because such engines assume that all of the speech is present.
One method currently known in the art for handling missing portions of speech is to invent, generate, or extrapolate data based on the non-missing, adjacent segments of speech. This approach is flawed because a speech recognition engine can misrecognize certain words if a critical speech segment is missing or if multiple speech segments in close proximity are missing. For example, a traditional speech recognition engine can determine the missing segment in “unnecess?ry”, where “?” marks a missing segment. The same speech recognition engine can encounter difficulty when determining the missing segments in “inter?ontine?tal”, and may recognize it as “enter on tin metal” or “enter on tin dental”.
Another method currently known in the art is to ignore missing portions of speech as if they never existed. This approach is flawed because the missing syllables and phonemes degrade the recognition results. In both approaches, the original problem of missing speech segments is compounded by user confusion in subsequent utterances caused by the initial poor recognition results. Accordingly, what is needed in the art is an improved way to recognize and/or synthesize speech with missing segments.
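By way of illustration only, the two prior-art approaches described above can be sketched as follows. The sketch assumes, purely for simplicity, that each speech frame is reduced to a single numeric feature and that a lost frame is marked with `None`. The names `Frame`, `fill_by_interpolation`, and `drop_missing` are hypothetical and do not correspond to any particular recognition engine. Linear interpolation stands in for the "invent, generate, or extrapolate" step; an actual system may use a different method.

```python
from typing import List, Optional

# Hypothetical encoding: one float per frame, None for a frame lost in transit.
Frame = Optional[float]


def drop_missing(frames: List[Frame]) -> List[float]:
    """Prior-art approach: ignore lost frames as if they never existed."""
    return [f for f in frames if f is not None]


def fill_by_interpolation(frames: List[Frame]) -> List[float]:
    """Prior-art approach: invent values for lost frames from received neighbors.

    Interior gaps are filled by linear interpolation between the nearest
    received frames; gaps at either end copy the nearest received frame.
    """
    known = [i for i, f in enumerate(frames) if f is not None]
    if not known:
        raise ValueError("no received frames to extrapolate from")

    filled: List[float] = []
    for i, f in enumerate(frames):
        if f is not None:
            filled.append(f)
            continue
        # Nearest received frame on each side of the gap, if any.
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(frames[right])
        elif right is None:
            filled.append(frames[left])
        else:
            weight = (i - left) / (right - left)
            filled.append(frames[left] + weight * (frames[right] - frames[left]))
    return filled


if __name__ == "__main__":
    received = [1.0, None, 3.0, None, None, None, 7.0]
    print(drop_missing(received))
    print(fill_by_interpolation(received))
```

In either case the recognition engine receives a sequence with no indication of where frames were lost: the first function shortens the sequence, and the second substitutes invented values that the engine treats as genuine speech.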