Many enterprises employ an interactive voice response (IVR) system that handles calls from telecommunications terminals. An interactive voice response system typically presents a hierarchy of menus to the caller, and prompts the caller for input to navigate the menus and to supply information to the IVR system. For example, a caller might touch the “3” key of his terminal's keypad, or say the word “three”, to choose the third option in a menu. Similarly, a caller might specify his bank account number to the interactive voice response system by inputting the digits via the keypad, or by saying the digits. In many interactive voice response systems the caller can connect to a person in the enterprise either by selecting an appropriate menu option or by entering the telephone extension associated with that person.
FIG. 1 depicts telecommunications system 100 in accordance with the prior art. Telecommunications system 100 comprises telecommunications terminal 101, telecommunications network 105, private branch exchange (PBX) 110, and interactive voice response system 120, interconnected as shown.
Telecommunications terminal 101 is a telephone, a notebook computer, a personal digital assistant (PDA), etc., and is capable of placing and receiving calls via telecommunications network 105.
Telecommunications network 105 is a network such as the Public Switched Telephone Network (PSTN), the Internet, etc. that carries calls to and from telecommunications terminal 101, private branch exchange 110, and other devices not shown in FIG. 1. A call might be a conventional voice telephony call, a text-based instant messaging (IM) session, a Voice over Internet Protocol (VoIP) call, etc.
Private branch exchange (PBX) 110 receives incoming calls from telecommunications network 105 and directs the calls to interactive voice response (IVR) system 120 or to one of a plurality of telecommunications terminals within the enterprise, depending on how private branch exchange 110 is programmed or configured. For example, in an enterprise call center, private branch exchange 110 might comprise logic for routing calls to service agents' terminals based on criteria such as how busy various service agents have been in a recent time interval, the telephone number called, and so forth. In addition, private branch exchange 110 might be programmed or configured so that an incoming call is initially routed to interactive voice response (IVR) system 120, and, based on caller input to IVR system 120, subsequently redirected back to PBX 110 for routing to an appropriate telecommunications terminal within the enterprise. Private branch exchange (PBX) 110 also receives outbound signals from telecommunications terminals within the enterprise and from interactive voice response (IVR) system 120, and transmits the signals on to telecommunications network 105 for delivery to a caller's terminal.
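The routing criteria mentioned above (the telephone number called and how busy each service agent has been in a recent time interval) can be illustrated with a minimal Python sketch. It is not the logic of any particular private branch exchange; the `Agent` record, the `DIALED_NUMBER_TO_DEPARTMENT` table, the telephone numbers, and the extensions are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    extension: str            # terminal extension within the enterprise
    department: str           # department the agent serves
    recent_busy_seconds: int  # time on calls during the recent interval


# Hypothetical mapping from the telephone number called to a department.
DIALED_NUMBER_TO_DEPARTMENT = {
    "5550100": "sales",
    "5550101": "support",
}


def route_call(dialed_number, agents):
    """Return the extension to ring, or "IVR" when no agent qualifies."""
    department = DIALED_NUMBER_TO_DEPARTMENT.get(dialed_number)
    if department is None:
        # Unknown number: hand the call to the IVR system instead.
        return "IVR"
    candidates = [a for a in agents if a.department == department]
    if not candidates:
        return "IVR"
    # Pick the agent who has been least busy in the recent interval.
    return min(candidates, key=lambda a: a.recent_busy_seconds).extension


agents = [
    Agent("201", "sales", 900),
    Agent("202", "sales", 300),
    Agent("301", "support", 600),
]
print(route_call("5550100", agents))
```

In this sketch, falling back to `"IVR"` stands in for the configuration in which an incoming call is first routed to the IVR system; an actual exchange might apply further criteria.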
Interactive voice response (IVR) system 120 is a data-processing system that presents one or more menus to a caller and receives caller input (e.g., speech signals, keypad input, etc.), as described above, via private branch exchange 110. Interactive voice response (IVR) system 120 is typically programmable and performs its tasks by executing one or more instances of an IVR system application. An IVR system application typically comprises one or more scripts that specify what speech is generated by interactive voice response system 120, what input to collect from the caller, and what actions to take in response to caller input. For example, an IVR system application might comprise a top-level script that presents a main menu to the caller, and additional scripts that correspond to each of the menu options (e.g., a script for reviewing bank account balances, a script for making a transfer of funds between accounts, etc.).
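The structure described above, a top-level script that presents a main menu and hands off to one script per menu option, might be sketched in Python as follows. This is only an illustrative sketch: the function names, the `MAIN_MENU` table, and the returned strings are assumptions, and speech recognition is reduced to a lookup of already-recognized words.

```python
# Hypothetical mapping of recognized spoken words to keypad digits.
SPOKEN_DIGITS = {"one": "1", "two": "2", "three": "3"}


def normalize_input(raw):
    """Treat a keyed digit and the equivalent spoken word as the same choice."""
    token = raw.strip().lower()
    return SPOKEN_DIGITS.get(token, token)


def review_balances_script():
    # Placeholder for the script that reviews bank account balances.
    return "script: review account balances"


def transfer_funds_script():
    # Placeholder for the script that transfers funds between accounts.
    return "script: transfer funds between accounts"


# Main menu: each option is associated with the script that handles it.
MAIN_MENU = {
    "1": review_balances_script,
    "2": transfer_funds_script,
}


def top_level_script(caller_input):
    """Dispatch the caller's selection to the matching script."""
    script = MAIN_MENU.get(normalize_input(caller_input))
    if script is None:
        return "prompt: invalid selection, replaying main menu"
    return script()


print(top_level_script("two"))
```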
A popular language for such scripts is the Voice eXtensible Markup Language (abbreviated VoiceXML or VXML). The Voice eXtensible Markup Language is an application of the eXtensible Markup Language, abbreviated XML, which enables the creation of customized tags for defining, transmitting, validating, and interpreting data between two applications, organizations, etc. The Voice eXtensible Markup Language enables dialogs that feature synthesized speech, digitized audio, recognition of spoken and keyed input, recording of spoken input, and telephony. A primary objective of VXML is to bring the advantages of web-based development and content delivery to interactive voice response system applications.
FIG. 2 depicts an exemplary Voice eXtensible Markup Language (VXML) script (also known as a VXML document or page), in accordance with the prior art. The VXML script, when executed by interactive voice response system 120, presents a menu with three options; the first option is for transferring the call to the sales department, the second option is for transferring the call to the marketing department, and the third option is for transferring the call to the customer support department. Audio content (in particular, synthesized speech) that corresponds to text between the <prompt> and </prompt> tags is generated by interactive voice response system 120 and transmitted to the caller.
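Because FIG. 2 is not reproduced in this text, the following Python sketch embeds a hypothetical three-option VXML menu and reads it with the standard-library XML parser. It is not the script of FIG. 2: the prompt wording and the `#sales`, `#marketing`, and `#support` targets are assumptions made for illustration, while the `<menu>`, `<prompt>`, and `<choice>` elements and the `dtmf` and `next` attributes are taken from VoiceXML 2.0.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.0 documents.
VXML_NS = {"v": "http://www.w3.org/2001/vxml"}

# Hypothetical menu; NOT the actual content of FIG. 2.
SAMPLE_VXML = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>For sales, press 1. For marketing, press 2. For customer support, press 3.</prompt>
    <choice dtmf="1" next="#sales">sales</choice>
    <choice dtmf="2" next="#marketing">marketing</choice>
    <choice dtmf="3" next="#support">customer support</choice>
  </menu>
</vxml>
"""


def read_menu(vxml_text):
    """Return the prompt text and a mapping of keypad digit to target."""
    root = ET.fromstring(vxml_text)
    menu = root.find("v:menu", VXML_NS)
    # Text between <prompt> and </prompt> is what would be synthesized as speech.
    prompt = menu.find("v:prompt", VXML_NS).text.strip()
    choices = {
        choice.get("dtmf"): choice.get("next")
        for choice in menu.findall("v:choice", VXML_NS)
    }
    return prompt, choices


prompt, choices = read_menu(SAMPLE_VXML)
print(prompt)
print(choices)
```

The parsing step is shown only to make the roles of the tags concrete; an IVR system would interpret such a document with its own VXML interpreter and would carry out the transfers themselves.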