Advances in information storage capacity and CPU processing power have enhanced the performance of speech recognition systems. The use of these systems to provide automated information services such as automated directory assistance (“DA”) allows significant cost savings by increasing the number of calls that can be handled while simultaneously reducing the need for human operators. Automated speech information services may be provided over existing networks such as the PSTN (Public Switched Telephone Network).
Typically, these services are provided at a network node that combines an IVR (Interactive Voice Response) system on the front-end with a speech recognition engine at the back-end. Directory assistance data typically includes entities and, for each entity, a set of associated information such as phone numbers, addresses, etc. A user provides an input reference to a directory assistance service, for example, in the form of a spoken utterance or a text string, to refer to a particular entity for which associated information is sought. The directory assistance service returns the requested associated information based upon a determination of the referred entity, which is determined as a function of the input reference.
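The relationship between an input reference, the referred entity, and its associated information can be illustrated with a minimal sketch. The listing names, phone numbers, addresses, and the `normalize` and `lookup` helpers below are hypothetical and chosen only for illustration; the exact-match rule shown is an assumption standing in for whatever determination a deployed service would actually make, and a text string stands in for the output of the speech recognition engine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Listing:
    """An entity together with its associated information."""
    name: str
    phone: str
    address: str


# Hypothetical directory assistance data, for illustration only.
DIRECTORY = [
    Listing("Joe's Pizza", "555-0100", "12 Main St"),
    Listing("City Hardware", "555-0101", "48 Oak Ave"),
]


def normalize(reference: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    kept = "".join(ch for ch in reference.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def lookup(reference: str) -> Optional[Listing]:
    """Determine the referred entity as a function of the input reference.

    Returns the first listing whose normalized name equals the normalized
    reference, or None when no listing matches.
    """
    key = normalize(reference)
    for listing in DIRECTORY:
        if normalize(listing.name) == key:
            return listing
    return None
```

Under these assumptions, `lookup("JOE'S  pizza")` resolves to the listing for Joe's Pizza, from which the requested phone number or address can be returned, while a reference that matches no listing yields `None`.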
A critical step in the performance of these automated information services is the configuration of the underlying speech recognition engine to ensure the highest recognition rates and most robust performance. Typically, speech recognition engines utilize a context free grammar (“CFG”) or a statistical language modeling (“SLM”) approach for performing recognition.
A significant technical challenge in the implementation of these systems is generating an appropriate grammar from a raw information source. Thus, there exists a need for a method and system for generation of robust grammars.