Interactive voice response (“IVR”) systems have become popular among companies as a cost-effective way of serving their customers. As is known to those of skill in the art, an IVR system is a group of voice recordings organized into menu choices that users can select via voice recognition or the dual-tone multi-frequency (“DTMF”) tones of their touch-tone phones. For example, clients could call their bank and receive a prerecorded message. Clients could then be given the choice to “Press one for account information”, “Press two for loan information”, or “Press zero to speak with a service representative.” IVR systems can direct customers to the appropriate information or client representative while reducing the staffing needs of the organization.
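By way of illustration only, the bank menu described above can be sketched as a small lookup table keyed by DTMF digit. The names `BANK_MENU`, `menu_prompt` and `route`, and the destination labels, are hypothetical and are not taken from any particular IVR product; this is a minimal sketch, assuming a single-level menu.

```python
# Hypothetical menu table for the bank example; keys are DTMF digits.
# Each entry holds the spoken prompt and an assumed destination label.
BANK_MENU = {
    "1": ("Press one for account information", "account_information"),
    "2": ("Press two for loan information", "loan_information"),
    "0": ("Press zero to speak with a service representative",
          "service_representative"),
}


def menu_prompt(menu):
    """Join every option into the single message a caller would hear."""
    return ". ".join(prompt for prompt, _ in menu.values()) + "."


def route(menu, digit):
    """Return the destination for a pressed digit, or None to replay the menu."""
    entry = menu.get(digit)
    return entry[1] if entry else None
```

In this sketch, `route(BANK_MENU, "2")` returns `"loan_information"`, while an unrecognized digit returns `None` so that the caller can be prompted again.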
While IVR systems have reduced the operating costs of companies, they can still be costly to create, debug and implement. Many IVR systems run on expensive, proprietary platforms, and complex menu selections can take a great deal of time to develop and debug. However, the emerging VoiceXML standard promises to decrease the cost of IVR systems. VoiceXML is an application of the Extensible Markup Language (XML), a text-based markup language. VoiceXML is designed for creating audio dialogs that feature synthesized speech, digitized voice, recognition of spoken and DTMF key input, recording of spoken input, telephony, and mixed-initiative conversations.
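Because a VoiceXML dialog is ordinary XML text, it can be read with a general-purpose XML parser. The following is a minimal sketch, assuming a hypothetical one-menu document written for illustration; the menu content, the `#account`, `#loan` and `#agent` targets, and the helper name `list_choices` are assumptions, while `menu`, `prompt` and `choice` (with its `dtmf` and `next` attributes) are elements defined by VoiceXML.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical VoiceXML menu, written only for this illustration.
VXML_DOC = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>Welcome to the bank.</prompt>
    <choice dtmf="1" next="#account">account information</choice>
    <choice dtmf="2" next="#loan">loan information</choice>
    <choice dtmf="0" next="#agent">service representative</choice>
  </menu>
</vxml>
"""


def list_choices(vxml_text):
    """Return (dtmf digit, next target, label) for each <choice> element."""
    root = ET.fromstring(vxml_text.strip())
    choices = []
    for choice in root.iter(f"{{{VXML_NS}}}choice"):
        label = (choice.text or "").strip()
        choices.append((choice.get("dtmf"), choice.get("next"), label))
    return choices


if __name__ == "__main__":
    for digit, target, label in list_choices(VXML_DOC):
        print(digit, target, label)
```

The sketch only extracts the menu structure; a real VoiceXML interpreter would additionally play the prompt, collect the caller's input, and follow the selected `next` target.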
A problem with IVR systems is that they are slow to transmit information. Users can wait for a considerable time listening to all of the options before they can make menu decisions. This delay is not only inconvenient for the customer on the phone, but can be costly for the company providing the IVR service and the telephone carrier, both of whom have to allocate hardware resources during the IVR session. As the call volume increases, so do the operating costs. The company providing the IVR service often requires expensive IVR equipment, large storage capacities for prerecorded messages, a pool of telephone lines to handle concurrent customers, and a telephone switch to transfer customers to the agents.
The telephone carrier also has to provide switching and capacity resources for the calls. As is known to those of skill in the art, IVR messages and customer responses are both traditionally carried on voice channels. Voice channels require large amounts of transmission bandwidth and have a low tolerance for latency within the carrier's network. This bandwidth must remain allocated even while the customer passively listens to the IVR message. These resource costs are particularly acute for providers of wireless telecommunications service, where bandwidth is at a premium.
Some effort has been made to reduce the operating costs of an IVR system. For example, the European Telecommunications Standards Institute (“ETSI”) is working on distributed speech recognition (“DSR”). DSR (ES 201 108) attempts to use a data channel to send a representation of the speech rather than sending the speech itself over a voice channel. Computer processing, which transforms the speech into data at one end and the data back into speech at the other, is distributed between the customer's hardware and the company's hardware. Yet DSR's primary function is to improve voice recognition accuracy, which does not directly address the problem of using bandwidth efficiently. Furthermore, the DSR project does not address any of the functionality of an IVR system beyond speech recognition.
It is therefore desirable to have a system, apparatus and method for delivering interactive voice response services in a more efficient manner.