1. Field of the Invention
The present invention relates to telecommunications systems that process speech input from a user with a computer. More particularly, the invention relates to methods for transferring voice command functions, including application-level grammars and system-level grammars, from a first voice command platform to a second voice command platform. Such methods enhance the user experience by increasing the likelihood of the second voice command platform correctly interpreting the speech input from the user and by increasing the likelihood of executing the event that the user intended.
2. Description of Related Art
A voice command platform (VCP) is a computer-implemented system that provides an interface between speech communication with a user and voice command applications. Generally, a person can call the voice command platform from any telephone and by speaking commands, can browse through voice command applications and menu items within the voice command application. The voice command platform allows the user to access and interact with information maintained by the voice command applications.
The voice command platform can thus receive spoken commands from the user and use the commands to guide its execution of voice command applications, and the voice command platform can interact with a user as dictated by logic in the voice command applications. The voice command platform includes a browser in order for the voice command platform to execute the logic defined by the voice command application. The browser includes an interpreter which functions to interpret the logic (such as in VoiceXML documents) so as to allow the voice command platform to effectively communicate with a user through speech.
A voice command application can be written or rendered in any of a variety of computer languages. One such language is VoiceXML (or simply “VXML”). VoiceXML is an XML-based markup language defined by the World Wide Web Consortium (W3C) and is used to create voice user interfaces. VXML is a tag-based language similar to HyperText Markup Language (HTML), which underlies most Internet web pages. Other analogous languages, such as SpeechML, VoxML, or SALT (Speech Application Language Tags), for instance, are available as well.
A voice command application will usually specify which words or “grammars” a user can speak in response to a prompt. In order to identify words in the incoming speech, the voice command platform includes a speech recognition (SR) engine. The SR engine will typically include or have access to a pronunciation dictionary database of “phonemes,” which are small units of speech that distinguish one utterance from another. The SR engine will then analyze the waveform represented by the incoming digitized speech signal and, based on the dictionary database, will determine whether the waveform represents particular words, i.e., acceptable “grammar”.
For instance, if a voice command application allows for a user to respond to a prompt with the grammars “sales,” “service” or “operator”, the SR engine may identify the sequence of one or more phonemes that makes up each of these grammars respectively. The SR engine may compare a phoneme representation of the spoken utterance to a phoneme representation of each allowed grammar. Once the SR engine finds a match (or a best match), the voice command platform may continue processing the application in view of the user's spoken response.
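By way of illustration only, the comparison described above can be sketched in Python. The pronunciation dictionary, the phoneme symbols, the similarity measure (the standard library's `difflib.SequenceMatcher`), and the 0.6 threshold are all assumptions made for this sketch; an actual SR engine operates on acoustic models rather than on ready-made phoneme lists.

```python
from difflib import SequenceMatcher

# Hypothetical pronunciation dictionary: grammar -> phoneme sequence.
# The ARPAbet-style symbols are illustrative, not taken from a real engine.
PRONUNCIATIONS = {
    "sales": ["S", "EY", "L", "Z"],
    "service": ["S", "ER", "V", "AH", "S"],
    "operator": ["AA", "P", "ER", "EY", "T", "ER"],
}


def best_match(spoken_phonemes, allowed_grammars, threshold=0.6):
    """Return the allowed grammar closest to the utterance, or None when
    no candidate reaches the (assumed) similarity threshold."""
    best, best_score = None, 0.0
    for grammar in allowed_grammars:
        # Similarity of the two phoneme sequences, between 0.0 and 1.0.
        score = SequenceMatcher(
            None, spoken_phonemes, PRONUNCIATIONS[grammar]
        ).ratio()
        if score > best_score:
            best, best_score = grammar, score
    return best if best_score >= threshold else None
```

Under these assumptions, an utterance whose phonemes differ slightly from the dictionary entry for “service” would still resolve to that grammar, while an utterance that resembles none of the allowed grammars would return `None` and be treated as out of grammar.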
An application written in VoiceXML can be accessed through a VoiceXML interpreter otherwise known as a voice browser. Grammar is typically specified at the application level. The voice browser also has browser-level grammar. The grammar, at both browser and application levels, consists of a set of individual grammar elements. The global grammar elements in the voice browser are assigned to specific tasks, which are handled by event handling routines consistently across all applications. Each spoken grammar is associated with a particular response or reaction. The response can include accessing certain XML documents or linking to different voice command platforms.
Consider for example a user calling into a first voice command platform that functions as a central agent for receiving and processing orders for fresh flowers. The first voice command platform is created and managed by a flower company. The user provides spoken information as to name and address of the recipient, the type of flowers, the date of delivery, etc., in response to voice prompts. At some point they may be prompted to indicate the manner of shipping. Suppose, in response to the prompt, they speak “FEDEX”. FEDEX is a global grammar element in the first voice command platform that is associated with a link to the FEDEX delivery service voice command platform. Thus, in response to the spoken grammar FEDEX, the flowers voice command platform transfers the call to the FEDEX voice command platform for the user to make their delivery arrangements. This way, the shipping company FEDEX can control the user experience when a person makes arrangements to use that company's services.
As another example, consider the situation where the user provides their flower order and is asked whether they wish to include a card. The user might be prompted to speak “HALLMARK” if they want to include a card with the flowers. The speech HALLMARK is also a global grammar element which has a reaction of a transfer of the call to the Hallmark Greeting Card company voice command platform for ordering a greeting card.
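The association between a global grammar element and its reaction, as in the FEDEX and HALLMARK examples above, can be pictured as a lookup table. The following Python sketch is illustrative only: the platform identifiers, the tuple form of a reaction, and the choice to consult application-level grammar before browser-level grammar are assumptions, not details taken from any particular voice browser.

```python
# Hypothetical global (browser-level) grammar for the flower company
# platform. Each grammar element maps to a reaction: (action, target).
GLOBAL_GRAMMAR = {
    "FEDEX": ("transfer", "fedex-vcp"),
    "HALLMARK": ("transfer", "hallmark-vcp"),
}


def handle_utterance(recognized, application_grammar,
                     global_grammar=GLOBAL_GRAMMAR):
    """Resolve a recognized utterance to its reaction.

    Application-level grammar is checked first, then global grammar;
    this precedence is an assumption made for the sketch.
    """
    key = recognized.upper()
    if key in application_grammar:
        return application_grammar[key]
    if key in global_grammar:
        return global_grammar[key]
    return ("error", "out-of-grammar")
```

With a hypothetical application-level grammar such as `{"ROSES": ("goto", "roses-menu")}`, speaking “FEDEX” resolves to the transfer reaction even though the application itself does not define that grammar.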
Later in the call, the caller could be transferred back from the second voice command platform (VCP2) to the first voice command platform (VCP1), e.g., to change the flower order, or transferred onward to further, downstream VCPs (such as the VCP of the shipping company after ordering flowers and a card, or the VCP of a candy company such as Whitman Chocolates to order candy with their flowers). These transfers between voice command platforms are generally designed to be as smooth and transparent as possible to the user.
A problem can occur in this arrangement, however: if the system-level grammars or root-level grammars on VCP2 do not include or function in the same way as the system-level grammars or root-level grammars on VCP1, then the user may suddenly no longer be able to invoke functions that the user was previously able to invoke, or the user may suddenly no longer be able to invoke those functions by speaking the same grammars.
Consider the example of VCP1 (the flowers company voice command platform) that defines the system-level grammar “HALLMARK” that a user could speak in order to navigate to a greeting card provider. Consider further that the caller is transferred first to the FEDEX voice command platform for shipping arrangements (VCP2). The FEDEX VCP may not recognize the global grammar element HALLMARK (since it is a shipping company and not a flower company). If the user were to speak HALLMARK while their call is being handled by the FEDEX VCP in order to attempt to also include a card with their order, the FEDEX VCP would not recognize the grammar and would give an out-of-grammar error message.
Consider also the possible situation where the flower customer interacts with the flower company voice command platform and orders flowers, and then orders a card by speaking HALLMARK in response to a prompt. They are now transferred to the Hallmark Greeting Card company voice command platform. If they were to speak the term FEDEX, the speech FEDEX might not be recognized in the Hallmark Greeting Card voice command platform. As another example, if, while they are at the Hallmark voice command platform, they were to speak “LOCAL DELIVERY”, e.g., to have the order of card plus flowers specially delivered by a local florist near the recipient, the Hallmark voice command platform may not recognize it.
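The mismatch in the preceding examples can be made concrete by comparing the global grammar sets of two platforms. The Python sketch below is illustrative only; the entries listed for the shipping platform are hypothetical placeholders.

```python
# Global grammar elements of the flower company platform (VCP1),
# using the examples discussed in the text.
VCP1_GLOBAL = {"FEDEX", "HALLMARK", "LOCAL DELIVERY"}

# Global grammar elements of a shipping platform (VCP2).
# These entries are hypothetical and serve only as placeholders.
VCP2_GLOBAL = {"TRACK PACKAGE", "OVERNIGHT"}


def lost_grammars(origin, destination):
    """Grammars the caller could speak on the origin platform that the
    destination platform would reject as out of grammar."""
    return sorted(origin - destination)
```

Here `lost_grammars(VCP1_GLOBAL, VCP2_GLOBAL)` lists every VCP1 grammar, including HALLMARK, since the shipping platform in this sketch shares none of them.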
There is a need in the art to meet user expectations in terms of recognizing speech input when a call is transferred from one voice command platform to another. There is a need for consistency across diverse voice command platforms that have different global grammars due to their different services, features, and functions. The present invention helps improve the user experience and allows user expectations to be met by providing methods and systems for transferring voice command platform functions and grammars from one voice command platform to another. For example, when the user is transferred from the flowers voice command platform to the greeting card voice command platform, the transfer is accompanied by a transfer of the global grammar elements FEDEX, LOCAL DELIVERY, BOUQUET, GREETING CARD, CANDY and other global grammar elements that the user would expect to be there while they are completing their flower order, regardless of the fact that they have been transferred out of the flower voice command platform to a greeting card or shipping company voice command platform. This transfer could also be accompanied by context information such as the particular location in the menu of the first voice command platform they were at when the transfer occurred, so that when they are transferred back they are transferred back to the location where they were at the time of the transfer.
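A minimal sketch of such a transfer, in Python and under stated assumptions, is given below. The class names, the string form of a reaction, the menu-location strings, and the rule that a receiving platform's own grammars take precedence over imported ones are all hypothetical choices made for illustration; they are not a description of any particular embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class TransferPayload:
    """Data assumed to accompany a call transfer between platforms."""
    origin: str
    grammars: dict        # grammar element -> reaction on the origin platform
    menu_location: str    # where the caller was when the transfer occurred


@dataclass
class VoiceCommandPlatform:
    name: str
    global_grammar: dict = field(default_factory=dict)
    imported_grammar: dict = field(default_factory=dict)
    return_context: dict = field(default_factory=dict)

    def accept_transfer(self, payload):
        # Keep only grammars this platform does not already define;
        # native grammars winning on conflict is an assumption.
        self.imported_grammar = {
            g: r for g, r in payload.grammars.items()
            if g not in self.global_grammar
        }
        # Remember where to send the caller on a transfer back.
        self.return_context = {
            "origin": payload.origin,
            "menu_location": payload.menu_location,
        }

    def resolve(self, utterance):
        key = utterance.upper()
        if key in self.global_grammar:
            return self.global_grammar[key]
        if key in self.imported_grammar:
            return self.imported_grammar[key]
        return "out-of-grammar"


# Usage: the flower platform hands its grammars and context to Hallmark.
hallmark = VoiceCommandPlatform("hallmark", {"BIRTHDAY": "goto:birthday-cards"})
payload = TransferPayload(
    origin="flowers",
    grammars={
        "FEDEX": "transfer:fedex-vcp",
        "LOCAL DELIVERY": "goto:flowers/local-delivery",
    },
    menu_location="flowers/order/card-prompt",
)
hallmark.accept_transfer(payload)
```

After the transfer in this sketch, `hallmark.resolve("fedex")` yields the flower platform's FEDEX reaction instead of an out-of-grammar error, and `hallmark.return_context` records the menu location to which the caller would be returned.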