1. Field of the Invention
The present invention relates to the field of telecommunications and, more particularly, to a telecommunications voice server with improved barge-in capabilities.
2. Description of the Related Art
Interactive Voice Response (IVR) systems can be used to form a telephone-based link between users and computer systems. The computer systems can include one or more communicatively linked voice servers, which provide a multitude of speech services, such as automatic speech recognition (ASR) services, synthetic speech generation services, transcription services, language and idiom translation services, and the like. IVR technologies permit users to provide speech input and/or Dual-Tone Multi-Frequency (DTMF) input to automated systems. The automated system can interpret the user input and can responsively perform one or more programmatic actions. These programmatic actions can include producing synthetic speech output and presenting this output to the user via a telephone handset.
Commonly, a user will respond to a voice prompt before the entire voice prompt has been presented. For example, the voice prompt can provide a listing of options and the user can select one of these options as others are still being presented, thereby interrupting the audio prompt. The interruption of an audio prompt that is presented by an automated system can be referred to as barge-in.
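The barge-in behavior described above can be illustrated with a minimal sketch. It assumes a prompt that is played one segment at a time and a non-blocking check for user input between segments. The names play_prompt_with_barge_in, poll_input, and scripted_input are hypothetical and introduced only for illustration; they do not correspond to any particular voice server interface.

```python
from typing import Callable, Dict, List, Optional, Tuple


def play_prompt_with_barge_in(
    segments: List[str],
    poll_input: Callable[[], Optional[str]],
) -> Tuple[List[str], Optional[str]]:
    """Play prompt segments in order, stopping as soon as user input arrives.

    Returns the segments actually played and the input that interrupted
    playback (None if the prompt finished without interruption).
    """
    played: List[str] = []
    for segment in segments:
        # Hypothetical non-blocking check for speech or DTMF input.
        user_input = poll_input()
        if user_input is not None:
            # Barge-in: the remaining segments are never presented.
            return played, user_input
        # Stand-in for sending this segment's audio to the handset.
        played.append(segment)
    return played, None


def scripted_input(events: Dict[int, str]) -> Callable[[], Optional[str]]:
    """Build a poll function that returns events[n] on the n-th poll (0-based)."""
    counter = {"n": 0}

    def poll() -> Optional[str]:
        value = events.get(counter["n"])
        counter["n"] += 1
        return value

    return poll


menu = [
    "For billing, press 1.",
    "For support, press 2.",
    "For sales, press 3.",
]
# Simulated user input arrives on the third poll, before the third option plays.
played, choice = play_prompt_with_barge_in(menu, scripted_input({2: "2"}))
```

In this simulated run, the user selects an option while another option has yet to be presented, so playback stops after the second segment. The delay between the input arriving and playback halting is bounded by the length of one segment, which is why the speed of the barge-in check matters.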
When barge-in techniques operate too slowly, a user can continue to hear the prompt after providing the barge-in command, which is undesirable. In a particularly undesirable situation, an impatient user may believe input was not received by the automated system and rapidly repeat the barge-in command. The receipt of multiple barge-in commands can further slow down system processing and/or can result in problematic side effects, such as the inadvertent selection of a menu option.
One conventional solution designed to improve barge-in response time closely couples ASR and text-to-speech (TTS) systems together within voice server software. The close coupling of these components can permit a voice server to halt TTS-generated audible prompts immediately upon determining that a speech input is a barge-in command. That is, the voice server can be optimized for barge-in by explicitly designing optimized exceptions for barge-in within source code at a low level. Such a solution, however, jeopardizes functional isolation of components and violates many generally accepted software practices. Further, such a solution can prevent a voice server component of an IVR system from using remotely located, independently developed TTS and ASR engines. There exists a need for a technique that permits a voice server to perform rapid barge-in operations without negatively affecting the architectural integrity of the voice server.