This invention relates to processing and understanding natural language input from a device such as, but not limited to, a speech recognition system.
Speech or voice recognition systems are well known and are becoming more widely used in computer-based user interface systems, such as voice-activated dialing and telephone systems, and speech-based handicap aids. Such systems might someday replace the now-ubiquitous menu-driven telephone answering systems, in which a telephone call is routed to a particular person or department by, for example, a list of choices presented to the caller via a recorded message.
Speech recognition systems in use today typically use statistical models to store representations of words, and try to map a spoken input word to the stored representation of a word in the system's vocabulary, which consists of all of the stored word models. Well-known spectral analysis techniques are typically used to map the spectral components of an input word to the spectral components of the stored representations of words.
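By way of illustration only, the mapping of an input word to a stored representation can be pictured as a nearest-match search over a vocabulary. The following minimal sketch assumes each word is reduced to a short vector of spectral features and compared by Euclidean distance; the vocabulary contents, the feature values, and the names `VOCABULARY`, `euclidean_distance`, and `recognize` are all hypothetical, and practical systems use statistical word models rather than this simple distance rule.

```python
import math

# Hypothetical vocabulary: each word maps to a made-up spectral feature vector.
# A real system would store a statistical model per word, not a single vector.
VOCABULARY = {
    "yes": [0.9, 0.1, 0.3],
    "no": [0.2, 0.8, 0.5],
    "help": [0.4, 0.4, 0.9],
}


def euclidean_distance(a, b):
    """Distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(input_features, vocabulary=VOCABULARY):
    """Return the vocabulary word whose stored vector is closest to the input."""
    return min(
        vocabulary,
        key=lambda word: euclidean_distance(input_features, vocabulary[word]),
    )
```

For example, `recognize([0.85, 0.15, 0.25])` selects the entry for "yes" because that stored vector lies nearest to the input. The sketch also suggests why a larger vocabulary raises the chance of error: more stored entries compete to be the closest match.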
Most speech recognition systems require that recognizable words be spoken one at a time or a phrase at a time. Continuous-speech recognition systems tend to be less accurate than phrase recognition systems or word recognition systems. As the number of words in a basic vocabulary set increases, the chance of an error in the recognition process increases as well. Voice recognition systems in use today with moderately large vocabularies can map spoken words to text with an accuracy of around 85%. The success rate for spoken numbers is approximately 98%. The ultimate aim of any speech recognition system, however, is to map a given spoken input into correct text that can be used to generate an appropriate response to the spoken input.
Another problem with prior art speech recognition systems is that, while they may be able to recognize a word or a limited-length phrase, they are generally not able to understand words unless the words are spoken in a particular sequence. The aforementioned menu-driven automated telephone answering systems present very limited choices to a caller. A speech recognition system capable of recognizing and understanding recognizable words, regardless of their order of use in a spoken utterance, would be an improvement over the prior art. Such a system would more closely approximate the human understanding of natural language, in which a verbal expression can be understood even if its words are used in different sequences. A method and apparatus to understand recognized speech would provide the ability to communicate more efficiently with computer-controlled communications equipment.
Accordingly, an object of the present invention is to provide a computer-based method and apparatus for understanding natural language input from a user, regardless of the sequence in which words are uttered. This results in a speedup, achieved by loosening the rigid constraints of a rigorous input language and by better handling input errors.
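Purely as an illustration of what understanding words regardless of sequence can mean, and not as a description of the method of the invention, the following sketch treats the recognized words as an unordered set and checks which hypothetical request has all of its keywords present. The `INTENTS` table, its keyword sets, and the function `understand` are assumptions introduced for this example only.

```python
# Hypothetical requests, each defined by a set of keywords that must all appear.
INTENTS = {
    "transfer_to_billing": {"billing", "department"},
    "check_balance": {"account", "balance"},
}


def understand(recognized_words, intents=INTENTS):
    """Return the first request whose keywords all occur, in any order."""
    # Converting to a set discards word order, which is the point of the sketch.
    words = {word.lower() for word in recognized_words}
    for name, keywords in intents.items():
        if keywords <= words:  # subset test: every keyword was recognized
            return name
    return None  # no request matched the recognized words
```

Under these assumptions, "connect me to the billing department" and "department billing please" are understood as the same request, whereas a rigid menu or fixed phrase order would accept only one wording.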