The present invention relates to speech recognition and natural language understanding. More specifically, the present invention relates to authoring a grammar as a language model for use in performing simultaneous speech recognition and natural language understanding.
Recognizing and understanding human speech is believed to be integral to future computing environments. To date, the tasks of recognizing and understanding speech have been addressed by two different systems. The first is a speech recognition system, and the second is a natural language understanding system.
Conventional speech recognition systems receive a speech signal indicative of a spoken language input. Acoustic features are identified in the speech signal and the speech signal is decoded, using both an acoustic model and a language model, to provide an output indicative of words represented by the input speech signal.
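By way of illustration only, the combined use of an acoustic model and a language model during decoding can be sketched as selecting the word sequence with the highest combined score. The following is a minimal sketch, not the implementation of any particular recognizer: the candidate word sequences, the probability values, and the names `decode` and `lm_weight` are hypothetical and are chosen solely to show how the language model can override a slightly better acoustic match.

```python
import math

# Hypothetical log-probabilities, for illustration only (not from a real model).
# Acoustic score: how well each candidate word sequence matches the audio.
ACOUSTIC_LOGPROB = {
    "recognize speech": math.log(0.40),
    "wreck a nice beach": math.log(0.45),
}
# Language model score: how plausible each word sequence is a priori.
LANGUAGE_LOGPROB = {
    "recognize speech": math.log(0.010),
    "wreck a nice beach": math.log(0.0001),
}


def decode(candidates, acoustic, language, lm_weight=1.0):
    """Return the candidate W maximizing log P(A|W) + lm_weight * log P(W)."""
    return max(candidates, key=lambda w: acoustic[w] + lm_weight * language[w])


candidates = list(ACOUSTIC_LOGPROB)
best = decode(candidates, ACOUSTIC_LOGPROB, LANGUAGE_LOGPROB)
acoustic_only = decode(candidates, ACOUSTIC_LOGPROB, LANGUAGE_LOGPROB, lm_weight=0.0)
```

With these assumed values, the acoustic score alone favors the second candidate, while the combined score favors the first, which is the role the language model plays in the decoding described above.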
Also, in order to facilitate the development of speech enabled applications and services, semantic-based robust understanding systems are currently under development. Such systems are widely used in conversational research systems. However, they are not particularly practical for use by conventional developers in implementing a conversational system. To a large extent, such implementations have relied on manual development of domain-specific grammars. This task is time consuming and error prone, and it requires a significant amount of expertise in the domain.
In order to advance the development of speech enabled applications and services, an example-based grammar authoring tool has been introduced. The tool is known as SGStudio and is further discussed in Y. Wang and A. Acero, GRAMMAR LEARNING FOR SPOKEN LANGUAGE UNDERSTANDING, IEEE Workshop on Automatic Speech Recognition and Understanding, Madonna di Campiglio, Italy, 2001; and Y. Wang and A. Acero, EVALUATION OF SPOKEN LANGUAGE GRAMMAR LEARNING IN ATIS DOMAIN, Proceedings of ICASSP, Orlando, Fla., 2002. This tool greatly eases grammar development by taking advantage of many different sources of prior information, as well as machine learning technologies. It allows a regular developer, with little linguistic knowledge, to build a semantic grammar for spoken language understanding. The system facilitates the semi-automatic generation of relatively high quality semantic grammars from a small amount of annotated training data. Further, the tool not only significantly reduces the effort involved in developing a grammar, but also improves the understanding accuracy across different domains. Still, improvement can be made in easing the authoring of different types of grammars for different application scenarios.