1. Field of the Invention
This invention generally relates to the presentation of information, and more particularly, to providing a consolidated presentation of heterogeneous information about user sessions in an interactive voice response unit.
2. Description of Related Art
An interactive voice response (IVR) unit interacts with a caller over a telephone connection and provides information or performs functions as selected by the caller. Usually, the IVR plays a pre-recorded voice prompt asking the user to press a key to perform a certain function or to navigate to other options. The array of options available to the caller in this type of system is often referred to as a “menu.” The menu is often arranged hierarchically so that a user can progress through numerous menu levels. Guided by voice prompts at each level, the user can readily select from among a great many functions even though the telephone keypad has only twelve keys.
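By way of illustration only, a hierarchical menu of the kind described above can be sketched as a nested data structure in which each level pairs a voice prompt with a mapping from keypad keys to either a further menu level or a function. The prompts, key assignments, and function names below (such as `read_balance`) are hypothetical and are not taken from any particular IVR product.

```python
# Hypothetical two-level IVR menu. A dict is a menu level; a string is a
# function (leaf) that the IVR would perform when that key is pressed.
MENU = {
    "prompt": "Press 1 for account services, 2 for store information.",
    "options": {
        "1": {
            "prompt": "Press 1 for balance, 2 for recent transactions.",
            "options": {"1": "read_balance", "2": "read_transactions"},
        },
        "2": {
            "prompt": "Press 1 for hours, 2 for locations.",
            "options": {"1": "read_hours", "2": "read_locations"},
        },
    },
}


def navigate(menu, keys):
    """Follow a sequence of key presses through the menu.

    Returns the function name reached, or None if a key is not offered,
    the sequence stops at a menu level, or keys continue past a function.
    """
    node = menu
    for key in keys:
        if isinstance(node, str):
            return None  # keys were pressed after a function was reached
        node = node["options"].get(key)
        if node is None:
            return None  # key not offered at this level
    return node if isinstance(node, str) else None


def count_functions(menu):
    """Count the functions (leaves) reachable from a menu level."""
    if isinstance(menu, str):
        return 1
    return sum(count_functions(child) for child in menu["options"].values())
```

In this sketch, `navigate(MENU, "12")` reaches `read_transactions`. With twelve keys per level, a menu of depth d can address up to 12^d functions, which is why a small keypad suffices for a large set of choices.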
By using an IVR, a business or, in general, any information provider, such as a financial institution, can handle many common requests for information or transactions over the telephone without the intervention of a human employee. This improves the operating efficiency of the business and improves the service to customers for routine requests. It is common for an IVR service to be accessible 24 hours a day. IVR's are applied to automatic order placement, information retrieval, and entertainment, to name a few examples. Sometimes an IVR is used in conjunction with a call center populated with human operators. The IVR may collect information from a caller initially and then route the caller to an appropriate operator who can provide further assistance.
Recently, speech recognition technology has been integrated into IVR's to form speech recognizing voice response units (SIVR's). During the use of a SIVR, a caller may provide spoken input in response to voice prompts from the SIVR unit. Thus, the caller does not necessarily have to use a telephone keypad to provide a response to a prompt. This is especially convenient to users of hand-held phones wherein the keypad is located in substantially the same unit as the earpiece and mouthpiece. With a hand-held telephone, such as a mobile phone, pressing the keys requires momentarily moving the handset away from the user's head so that the user's fingers can reach the keys and the user can see the keys being pressed. This can be hazardous for users who attempt to operate a phone while driving an automobile or performing other activities. The requirement to provide keypad input can also be challenging to persons who have difficulty moving their hands and fingers. Even though a voice response unit can be dialed by a preprogrammed button on a phone, such as a speed dial, the navigation of a menu is typically more variable and requires dynamic choices by the user.
As businesses and service providers implement SIVR technology, there is a desire to ensure that the quality of service delivered by such installations meets minimum requirements. Companies that serve customers through SIVR systems want to ensure that callers can easily and efficiently use the SIVR system without experiencing undue frustration or delay that would reflect poorly on the company. Where a speech recognition system is applied, the recognition accuracy of the speech recognition system is often the limiting factor in the overall performance of the SIVR. This practical limitation is well known in the art. Indeed, to partially compensate for limited speech recognition accuracy, a considerable portion of the skill in implementing a successful SIVR system is in cleverly structuring menus to be forgiving of inaccurate or nonsensical words detected by the speech recognition system.
In practice, speaker-independent and dialect-independent speech recognition has proven difficult to achieve even with a clear, full-bandwidth audio signal from the person speaking. Fortunately, most SIVR installations rely only on spoken digits and yes/no responses or other short responses. In a given spoken language, a menu may be designed so that only easily distinguishable words are used at each menu level. This improves the apparent accuracy of the system. There are some SIVR installations that rely on recognizing spoken strings of digits, such as a user account number. In this case, the account numbers are usually not assigned to users sequentially or randomly. Instead, they are carefully chosen so that even the most similar account numbers are readily distinguishable by at least two or three highly recognizable phonemes or features.
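The selection of mutually distinguishable account numbers described above may be illustrated by the following minimal sketch. As a simplifying assumption not drawn from the text, it treats each differing digit position as one distinguishing feature (a Hamming distance over digits) rather than modeling phonemes, and it uses a simple greedy search. The function names `hamming` and `pick_codes` are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length digit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_codes(length, min_distance, limit):
    """Greedily pick up to `limit` digit strings of the given length so that
    every pair differs in at least `min_distance` positions.

    Digit-level distance is only a crude stand-in for acoustic or phonemic
    distinguishability; a real system would compare spoken forms.
    """
    chosen = []
    for digits in product("0123456789", repeat=length):
        candidate = "".join(digits)
        if all(hamming(candidate, c) >= min_distance for c in chosen):
            chosen.append(candidate)
            if len(chosen) == limit:
                break
    return chosen
```

For example, `pick_codes(4, 2, 20)` yields twenty four-digit numbers in which no two differ in fewer than two positions, so that a single misrecognized digit cannot turn one valid number into another.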
The difficulty of accurate speech recognition is compounded when a remote speaker provides input through an imperfect, limited-bandwidth connection such as a telephone connection. The standard pass-band of a telephone connection is about 300 Hz to 3400 Hz. This prevents the speech recognition system from receiving a vocal fundamental frequency and from receiving high frequency information that distinguishes sibilant sounds. The latter is particularly problematic for accurate speech recognition. Since a human listener can tolerate such impairments by relying heavily on context, the limited bandwidth of a telephone connection is designed to provide merely adequate intelligibility for a human listener. In addition to having limited bandwidth, a telephone channel can also introduce other impairments such as distortion, crosstalk, noise and attenuation. These impairments affect both human and artificial recipients. Furthermore, ambient noise sources, such as automobile traffic and other people, can interfere with speech recognition even under otherwise perfect conditions. Some speech recognition technologies may be more robust than others in tolerating these interferences.
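The effect of the telephone pass-band noted above can be illustrated with simple arithmetic. The sketch below uses the 300 Hz to 3400 Hz band stated in the text; the 120 Hz fundamental and the 6000 Hz sibilant component are illustrative assumptions only, chosen as plausible values for an adult voice, and the function names are hypothetical.

```python
PASSBAND_HZ = (300.0, 3400.0)  # approximate telephone pass-band from the text


def in_passband(freq_hz, band=PASSBAND_HZ):
    """Return True if a frequency component falls inside the pass-band."""
    low, high = band
    return low <= freq_hz <= high


def surviving_harmonics(fundamental_hz, band=PASSBAND_HZ):
    """List the harmonic numbers of a voice fundamental that fall in-band."""
    if fundamental_hz <= 0:
        raise ValueError("fundamental_hz must be positive")
    low, high = band
    harmonics = []
    n = 1
    while n * fundamental_hz <= high:
        if n * fundamental_hz >= low:
            harmonics.append(n)
        n += 1
    return harmonics
```

Under these assumptions, a 120 Hz fundamental and its second harmonic (240 Hz) fall below the band, harmonics 3 through 28 (360 Hz to 3360 Hz) pass, and a 6000 Hz sibilant component is removed entirely.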