The present invention relates to a VoiceXML browser and supporting components for mobile devices. More particularly, the present invention is directed to a system and method for facilitating user interaction with voice applications using either locally stored applications or those accessible via the wireless or mobile broadband capabilities of a mobile device.
In telephony, Interactive Voice Response (IVR) is a technology that allows a computer to detect voice and touch tones in a telephone call.
Many companies employ systems based on IVR technology to process and route telephone calls originating from their respective customers. Examples include telephone banking, televoting, and credit card transactions. IVR systems are typically used to service high call volumes, reduce cost and improve the customer experience.
If a customer dials a telephone number that is answered by an IVR system, the system executes an application that responds to the customer/caller with pre-recorded or dynamically generated audio files. These audio files explain the options available to the caller and direct the caller on how to proceed. The caller selects an option by using spoken words or Dual-Tone Multi-Frequency (DTMF) tones, e.g., telephone keypad touch tones.
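By way of illustration only, the option selection described above may be sketched as a lookup from a received DTMF digit to a menu option. The menu entries and the function names `build_prompt` and `select_option` are hypothetical and are not taken from any particular IVR platform.

```python
# Hypothetical menu for illustration; the option names are assumptions.
MENU = {
    "1": "account balance",
    "2": "recent transactions",
    "0": "speak to an agent",
}


def build_prompt(menu):
    """Return the text an IVR system might play to explain the options."""
    parts = [f"For {label}, press {digit}." for digit, label in menu.items()]
    return " ".join(parts)


def select_option(menu, dtmf_digit):
    """Map a single DTMF digit to a menu option, or None if it is not offered."""
    return menu.get(dtmf_digit)


if __name__ == "__main__":
    print(build_prompt(MENU))
    print(select_option(MENU, "2"))
```

A digit that is not offered returns `None`, at which point a real system would typically replay the prompt rather than end the call.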
Modern IVR applications are structured similarly to World Wide Web pages, using languages such as VoiceXML. Other languages may include, for example, SALT or T-XML.
Since many companies do not have their own IVR platforms, they typically turn to outsourcing companies or vendors to either host their VoiceXML application or manage the application as a whole. An example of such a hosted environment is shown in FIG. 1.
The hosted environment shown in FIG. 1 may include end user devices, such as a mobile device 105 or a land-line phone 110; hosted vendor systems 115; and client systems 120. The mobile device 105, such as a cellular phone, PDA, or iPhone, and/or the land-line phone 110 may communicate with the hosted vendor systems 115 via a telephony interface 125. The telephony interface 125, in turn, interacts with a VoiceXML browser 130, an MRCP TTS Server 135, and an MRCP Speech Recognition Server 140, all of which are part of the hosted vendor systems 115.
The VoiceXML browser 130 may be an extension of a web browser that presents an interactive voice user interface to the user and that operates on pages that specify voice dialogs. These pages may be written in VoiceXML, which is the W3C's standard voice dialog markup language, although proprietary voice dialog languages may also be used. The VoiceXML browser 130 may present information aurally, using pre-recorded audio file playback or using Text-To-Speech (TTS) software to render textual information as audio. Further, the VoiceXML browser 130 may obtain information from the end user of the mobile device 105 and/or the land-line phone 110 by speech recognition and keypad entry, e.g., DTMF detection.
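As a minimal sketch of how a browser might operate on such a page, the following parses a small VoiceXML 2.0 menu and resolves either a DTMF digit or a recognized word to the next page. The page content, the file names, and the helper names `parse_menu` and `resolve` are assumptions made for illustration. Speech recognition and audio playback are outside the sketch, which treats the recognized word as plain text.

```python
import xml.etree.ElementTree as ET

# A minimal VoiceXML 2.0 page; the menu content and file names are hypothetical.
VXML_PAGE = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>Say balance or press 1. Say transfers or press 2.</prompt>
    <choice dtmf="1" next="balance.vxml">balance</choice>
    <choice dtmf="2" next="transfers.vxml">transfers</choice>
  </menu>
</vxml>
"""

# VoiceXML elements live in this namespace; attributes are not namespaced.
NS = {"v": "http://www.w3.org/2001/vxml"}


def parse_menu(page_text):
    """Return (prompt_text, {dtmf_digit: (spoken_word, next_page)})."""
    root = ET.fromstring(page_text)
    menu = root.find("v:menu", NS)
    prompt = menu.find("v:prompt", NS).text.strip()
    choices = {}
    for choice in menu.findall("v:choice", NS):
        choices[choice.get("dtmf")] = (choice.text.strip(), choice.get("next"))
    return prompt, choices


def resolve(choices, user_input):
    """Return the next page for a DTMF digit or a recognized spoken word."""
    for digit, (word, target) in choices.items():
        if user_input == digit or user_input.lower() == word:
            return target
    return None


if __name__ == "__main__":
    prompt, choices = parse_menu(VXML_PAGE)
    print(prompt)                        # text to be rendered by TTS or audio
    print(resolve(choices, "1"))         # keypad entry
    print(resolve(choices, "transfers")) # recognized speech, as text
```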
The VoiceXML browser 130 interacts with the MRCP TTS Server 135 and the MRCP Speech Recognition Server 140. MRCP stands for Media Resource Control Protocol, which is a communication protocol that allows speech servers to provide various speech services, such as speech recognition and speech synthesis (including TTS), to their clients. The MRCP TTS Server 135 provides TTS services to its clients, and the MRCP Speech Recognition Server 140 provides speech recognition services to its clients.
Computer Telephony Integration (CTI) data are sent from the hosted vendor systems 115 to a CTI Management Server 145. CTI is a technology that allows interactions on a telephone and a computer to be integrated or coordinated. As contact channels have expanded from voice to email, web, and fax, CTI has expanded to include the integration of all customer contact channels (voice, email, web, fax, etc.) with computer systems. Common functions that may be implemented using CTI include, for example, call routing, call information display with or without using calling line data, phone control (answer, hang up, hold, conference, etc.), automatic dialing, and computer-controlled dialing.
Furthermore, application requests are sent from the VoiceXML browser 130 to a VoiceXML Application Server 150, and the requested VoiceXML application is delivered from the VoiceXML Application Server 150 to the VoiceXML browser 130. The CTI Management Server 145 and the VoiceXML Application Server 150 are both part of the client systems 120.
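The request and delivery exchange described above may be sketched, purely for illustration, with two in-memory stand-ins. The class names, the method names, and the application name `main.vxml` are hypothetical, and the transport between the two systems is deliberately left out because the description above does not specify it.

```python
class ApplicationServer:
    """Stand-in for the VoiceXML Application Server 150 (client systems 120)."""

    def __init__(self, applications):
        # Maps an application name to the text of its VoiceXML document.
        self._applications = applications

    def handle_request(self, name):
        # Return the requested document, or None if it is not hosted here.
        return self._applications.get(name)


class Browser:
    """Stand-in for the VoiceXML browser 130 (hosted vendor systems 115)."""

    def __init__(self, server):
        self._server = server
        self.current_page = None

    def load(self, name):
        # Send the application request and keep the delivered document.
        page = self._server.handle_request(name)
        if page is None:
            raise LookupError(f"application not found: {name}")
        self.current_page = page
        return page


if __name__ == "__main__":
    server = ApplicationServer({"main.vxml": "<vxml version='2.0'/>"})
    browser = Browser(server)
    print(browser.load("main.vxml"))
```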
More and more people use intelligent mobile devices, such as cellular phones, PDAs, or iPhones, as a means of communication. These intelligent mobile devices are becoming increasingly sophisticated due to, for example, increased computing power or memory capacity, and the availability of mobile Software Development Kits (SDKs), such as Java Platform, Micro Edition (Java ME) or Apple's iPhone SDK. This may lead to decreased reliance on teleservices companies that are built on standard telephony technology. More particularly, this may lead to decreased reliance on hosted environments for IVR applications, for example.