Interactive Voice Response (IVR) systems enable a user to interact with various applications and/or systems using a combination of voice and touch-tone responses. In general, an IVR system can include a speech recognition system, a text-to-speech system, and a speech application. The speech application generally dictates the order and circumstances in which dialogs are presented to the user. The complexity of modern speech applications has led to the development of reusable software components. Reusable software components facilitate the development of speech applications by shielding developers from the intricacies associated with building a robust speech dialog, e.g., confidence score interpretation, error recovery mechanisms, prompting, and the like.
One type of reusable software component for use in constructing a speech application is defined by the Reusable Dialog Component (RDC) framework. The RDC framework specifies how JavaServer Pages (JSP) taglibs that aid in the rapid development of voice and multimodal applications can be created. An RDC is composed of a data model, speech-specific assets such as grammars and prompts, configuration files, and the dialog logic needed to collect one or more items of information from a user. The voice user interface can be implemented using a voice markup language such as Voice Extensible Markup Language (VoiceXML), which is generated by the RDC. Speech applications can be written by instantiating one or more RDCs. The runtime behavior of the RDCs can be regulated by specifying various tuning parameters and configuration files. Through the RDC tuning parameters and configuration files, for example, one can customize the vocabulary and retry settings of the RDC, specify application-specific prompts, and the like.
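By way of illustration only, the manner in which tuning parameters can regulate runtime behavior may be sketched as follows. The sketch is written in Python rather than as a JSP taglib, and the names TuningConfig, min_confidence, max_retries, and next_action, as well as the prompt text, are hypothetical and are not part of the RDC framework.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TuningConfig:
    """Hypothetical tunable parameters for one dialog component (not the RDC API)."""
    min_confidence: float = 0.5  # assumed: results scoring below this are not accepted
    max_retries: int = 2         # assumed: number of re-prompts allowed before giving up
    prompts: Dict[str, str] = field(default_factory=lambda: {
        "initial": "Please say your zip code.",
        "retry": "Sorry, I did not catch that. Please say your zip code.",
        "give_up": "Let me transfer you to an agent.",
    })


def next_action(confidence: float, retries_used: int, cfg: TuningConfig) -> str:
    """Return 'accept', 'retry', or 'give_up' for a single recognition result."""
    if confidence >= cfg.min_confidence:
        return "accept"
    if retries_used < cfg.max_retries:
        return "retry"
    return "give_up"


if __name__ == "__main__":
    strict = TuningConfig(min_confidence=0.8, max_retries=1)
    lenient = TuningConfig(min_confidence=0.4)
    # The same recognition result is handled differently under each configuration.
    print(next_action(0.6, 0, strict), next_action(0.6, 0, lenient))
```

In this sketch the same recognition result leads to a re-prompt under one configuration and to acceptance under another, which is the sense in which the parameter values, rather than the dialog logic, determine the behavior observed by the caller.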
RDCs encapsulate well-tried elements of speech user interface design. An RDC, for example, can collect information such as an address from the user. In doing so, the RDC ensures that all of the interactions required to guarantee the completeness of the data, including its validity and canonical format, are provided. An address RDC, for example, would provide the error handling and logic needed for obtaining all aspects of a user address such as the street address, apartment number, city, state, and zip code. Each item of information that is collected by the RDC fills in a field of the RDC. Thus, an address RDC would have multiple fields into which the different data items comprising the address would be filled. In any event, when writing another speech application that must receive a user address, the address RDC simply can be incorporated into that application rather than coding a solution for capturing a user address from scratch.
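The notion of a component having multiple fields that are filled, validated, and canonicalized can be illustrated with the following minimal sketch. It is not the RDC implementation. The class AddressComponent, its methods, the two-letter state and five-digit zip code rules, and the treatment of the apartment number as optional are all assumptions made for the example.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AddressComponent:
    """Hypothetical stand-in for an address component; not the RDC API."""
    street: Optional[str] = None
    apartment: Optional[str] = None
    city: Optional[str] = None
    state: Optional[str] = None
    zip_code: Optional[str] = None

    # Assumption: the apartment number is optional, all other fields are required.
    REQUIRED = ("street", "city", "state", "zip_code")

    def fill(self, name: str, spoken: str) -> bool:
        """Validate and canonicalize one recognized value; return True if stored."""
        if name not in {f.name for f in fields(self)}:
            raise KeyError(name)
        value = " ".join(spoken.split())  # collapse stray whitespace
        if name == "state":
            value = value.upper()         # assumed canonical form: two letters
            if len(value) != 2 or not value.isalpha():
                return False
        elif name == "zip_code":
            value = value.replace(" ", "")  # assumed canonical form: five digits
            if len(value) != 5 or not value.isdigit():
                return False
        elif not value:
            return False
        setattr(self, name, value)
        return True

    def missing(self) -> List[str]:
        """List the required fields that have not yet been filled."""
        return [n for n in self.REQUIRED if getattr(self, n) is None]


if __name__ == "__main__":
    address = AddressComponent()
    address.fill("state", " fl ")
    address.fill("zip_code", "3 3 4 3 1")
    print(address.state, address.zip_code, address.missing())
```

A dialog built around such a component would continue prompting until missing() returns an empty list, so that an application reusing the component receives only complete, canonicalized addresses.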
As noted, RDCs can be tuned prior to being deployed as part of a speech application. The tunable parameters of each RDC allow the RDC to behave very differently according to the particular environment in which the speech application will be used. Accordingly, these parameters must be tuned so that the speech application functions in an acceptable manner when placed in a given environment.
Presently, speech applications are tuned by deploying the IVR system and speech application in a pilot phase in which data is collected in a log over a period of days or weeks. The log is manually reviewed using various software-based analysis tools. From this review, one or more values for the different tunable parameters of the RDCs used in the speech application can be determined. The speech application can then be deployed again with the RDCs updated to include the newly determined values for the different tunable parameters. This process is often repeated until the speech application performs in an acceptable manner. In certain circumstances, the process may have to be repeated when an application that had been functioning properly begins to experience degraded performance due to changes in caller demographics, hardware changes, and the like.
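As one hypothetical example of the kind of analysis that might be performed during such a review, the following sketch selects a confidence threshold from logged recognition results. The log format of (confidence score, whether the recognition was correct), the candidate values, and the function names count_errors and pick_threshold are assumptions made for illustration and do not describe any particular analysis tool.

```python
from typing import Iterable, List, Tuple

# Assumed log format: (confidence score, whether the recognition was correct).
LogEntry = Tuple[float, bool]


def count_errors(log: Iterable[LogEntry], threshold: float) -> int:
    """Count false accepts plus false rejects at the given threshold."""
    errors = 0
    for confidence, correct in log:
        accepted = confidence >= threshold
        if accepted != correct:
            errors += 1
    return errors


def pick_threshold(log: List[LogEntry], candidates: List[float]) -> float:
    """Return the candidate with the fewest errors; ties go to the earliest candidate."""
    return min(candidates, key=lambda t: count_errors(log, t))


if __name__ == "__main__":
    pilot_log = [(0.9, True), (0.8, True), (0.6, False), (0.55, True), (0.3, False)]
    print(pick_threshold(pilot_log, [0.2, 0.5, 0.7]))
```

Even in this simplified form, the selected value depends entirely on the data collected during the pilot phase, which is why a change in callers or hardware can require the review and redeployment to be repeated.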
The manual and repetitive nature of the tuning process is cumbersome and labor-intensive, often requiring significant time to properly tune or adjust the speech application. It would be beneficial to provide a technique for tuning reusable software components that addresses the limitations described above.