The present invention relates generally to communication systems, and more specifically to a system and method for providing voice communications over a global communication network (“Web”).
A number of existing systems have been designed to provide voice communications over the Web. Recently, what has been referred to as a Voice Browser platform has been used for execution of VoiceXML (Voice Extensible Markup Language) scripts in connection with various specific types of voice enabled applications executing on an application server. In a typical telephonic user interaction using such an existing architecture, a VoiceXML script executes on the Voice Browser to support a dialog with a user. During the dialog, various voice prompts may be provided, and the user provides response data that is captured and stored. In a common scenario, when the user has entered sufficient data to complete a form, a SUBMIT command is executed through the Voice Browser, causing an HTTP (HyperText Transfer Protocol) transaction to occur, often resulting in another VoiceXML script being selected for execution.
The VoiceXML language processed by the Voice Browser includes many commands (“tags”) for supporting a user dialog. These include commands for rendering of aural data, for example by providing recorded and/or synthesized voice prompts, as well as commands for accepting different types of input data, for example by receiving and processing voice and DTMF (Dual-Tone Multi-Frequency) data. VoiceXML also includes a number of telephony commands relating to call control actions. Call control refers to the ability of executing scripts to control a connection with the user. Call control actions performed through VoiceXML commands executed in the Voice Browser include various types of call transfers. Call transfer actions may include simply transferring the user to another destination, transferring the user to another destination and dropping out of the call if the transfer is successful, and/or transferring the user but staying in the call, either to retrieve the user at the end of the user's interaction with the remote destination, or to monitor the call for events such as the user speaking a keyword or pressing a special key.
In the existing Voice Browser architecture, both call control and voice rendering functionality are provided through execution of VoiceXML scripts within the Voice Browser platform. The VoiceXML scripting language, like similar scripting languages such as HTML (HyperText Markup Language), is well suited to rendering data. As is generally known, HTML is designed for development of scripts that are primarily used to render visual data. VoiceXML is intended for development of scripts relating to voice-driven interactions. Accordingly, many of the commands in VoiceXML are designed to support rendering and reception of voice dialog data. However, the procedural logic needed for many call control actions is not well supported using VoiceXML. For example, the syntax of the <if> command is convoluted in VoiceXML, and none of the standard structured programming constructs, such as “for”, “while”, and “until”, are provided. In particular, supporting the VoiceXML <transfer> command to transfer a user to a new destination, such as a remote call center, is problematic. The various state machines associated with the different call signaling mechanisms that must be supported in this regard are difficult to implement using VoiceXML. Moreover, handling error cases in a VoiceXML script for such call control state machines is awkward and excessively complex.
For these reasons and others, it would be desirable to have a system for providing voice communications over a network that does not combine call control and voice rendering functionality within VoiceXML scripts executed in a Voice Browser. The system should advantageously enable the use of procedural programming constructs for supporting call control actions, while efficiently processing VoiceXML for dialog rendering purposes.
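By way of illustration only, the following sketch suggests how a call transfer state machine with error handling might be expressed using ordinary procedural constructs such as a “while” loop and exception handling. It is a minimal example written in Python under stated assumptions, and is not the system described herein. The names TransferState, TransferError, transfer_call, and the dial callable are hypothetical and stand in for whatever call signaling mechanism is actually used.

```python
from enum import Enum, auto


class TransferState(Enum):
    """Hypothetical states of a simple call transfer."""
    IDLE = auto()
    DIALING = auto()
    CONNECTED = auto()
    FAILED = auto()


class TransferError(Exception):
    """Raised by the hypothetical signaling layer when a dial attempt fails."""


def transfer_call(dial, destination, max_attempts=3):
    """Attempt to transfer a caller, retrying after signaling errors.

    `dial` is an assumed callable standing in for the signaling layer: it
    returns normally on success and raises TransferError on failure.
    Returns the final state and the list of states visited.
    """
    state = TransferState.IDLE
    history = [state]
    attempts = 0
    # Loop until a terminal state is reached.
    while state not in (TransferState.CONNECTED, TransferState.FAILED):
        if state is TransferState.IDLE:
            state = TransferState.DIALING
        elif state is TransferState.DIALING:
            attempts += 1
            try:
                dial(destination)
                state = TransferState.CONNECTED
            except TransferError:
                # Return to IDLE to retry, or give up once attempts run out.
                if attempts < max_attempts:
                    state = TransferState.IDLE
                else:
                    state = TransferState.FAILED
        history.append(state)
    return state, history


# Usage: a stand-in dial function that fails twice and then succeeds.
demo_calls = []


def flaky_dial(destination):
    demo_calls.append(destination)
    if len(demo_calls) < 3:
        raise TransferError("destination busy")


final_state, visited = transfer_call(flaky_dial, "remote-call-center")
print(final_state.name)  # prints "CONNECTED"
```

In this sketch the retry policy and the error path are each a few lines of ordinary control flow, which is the kind of procedural logic that the preceding paragraphs describe as awkward to express in a VoiceXML script.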