Various services now provide voice and non-voice access to Internet data. A caller may access a “Voice Portal” or “Voice Site” by simply dialing a number advertised by the company providing the Voice Access service. The caller will hear a greeting that requests the caller to “speak” or “enter” specific commands. As an example, a caller may ask the system to provide him/her with the latest weather information by simply speaking a command, or pressing a DTMF button on the phone. The information provided to the user may be pre-recorded and accessed from a database, or it may be accessed from a page similar to those available on the Internet. The mark-up language used to code the page may be VoiceXML or any other type of XML-based coding language. Some legacy systems may use proprietary or less commonly used methods for connecting the system to back-end data servers.
However, in all existing systems, users interact with data through only one interface, that is, either a voice interface (e.g., a telephone) or a data interface (e.g., an Internet browser). This single-mode interaction limits the delivery of services to users. As an example, a user who is driving a car may ask for directions from point A to point B by issuing voice commands, and hear the directions read back to him via a speaker in the car. However, the same navigation information would not be available in graphical format. Another example is a user who is using a data-enabled mobile phone to review his investment portfolio. The user may wish to see the data, but input the queries by simply speaking them into the phone. Current systems do not allow for such capability.
Another limitation of existing systems is that they do not allow more than one user to interact with an application in one session. The present invention makes this possible. One example of where this may be required is a cooperative form-filling application where two users need to be logged onto the same session, and each answers specific questions as they are presented. The present invention makes it possible for the two users to call into the system and interact with the same application through a single session, so that one form is filled in jointly by both users.
The problem that arises in multi-modal or multi-user interaction with a single session (as in the above examples) is that multiple input values may be received for the same query through different channels. A simple solution would be to accept the first input value to arrive, and discard the subsequent ones. This solution, however, fails when there are many rounds of query-input in the same application. Consider the case of a query A followed by two inputs a-1 and a-2. Input a-1 is accepted, but before input a-2 arrives in the system, another query B is made. Now input a-2 arrives in the system, followed by a valid input b-1. The system would accept the false input a-2 as the answer to query B, and discard the valid input b-1. FIG. 1 illustrates how “Accept First Input” fails in the case of multiple queries and inputs. Throughout FIGS. 1, 2 and 3, the sunburst symbol 12 represents an accepted input, and the crossed-out symbol 14 represents an incorrectly accepted input or an incorrectly discarded input, for illustrative purposes.
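The failure of “Accept First Input” can be reproduced with a short simulation. The following Python fragment is only an illustrative sketch: the function name `accept_first_input` and the tuple-based event format are assumptions made for this example and are not part of any described system.

```python
def accept_first_input(events):
    """Accept the first input that arrives after each query; discard the rest.

    Each event is a (kind, name) tuple, where kind is "query" or "input".
    This event format is hypothetical and used only for illustration.
    """
    accepted = {}     # query name -> input accepted for that query
    discarded = []    # inputs that were thrown away
    current = None    # name of the most recently issued query
    for kind, name in events:
        if kind == "query":
            current = name
        elif current is not None and current not in accepted:
            accepted[current] = name
        else:
            discarded.append(name)
    return accepted, discarded


# Arrival order from the scenario: A, a-1, B, a-2, b-1.
events = [
    ("query", "A"),
    ("input", "a-1"),
    ("query", "B"),
    ("input", "a-2"),   # late answer to A, arrives after B was issued
    ("input", "b-1"),   # the genuine answer to B
]
accepted, discarded = accept_first_input(events)
print(accepted)    # the late input a-2 is recorded as the answer to B
print(discarded)   # the valid input b-1 is thrown away
```

Because the inputs carry no indication of which query they answer, the late input a-2 is indistinguishable from a genuine answer to query B.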
One solution to this problem is to tag every input with the name of the query that it is attempting to address. In this case, the system would know that the late input a-2 is not intended for query B, would discard it, and would accept the valid input b-1.
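A minimal sketch of this tagging approach is given below, again in Python. The function name `accept_tagged_input` and the `(kind, query, value)` event format are hypothetical and chosen only to make the scenario concrete.

```python
def accept_tagged_input(events):
    """Accept an input only if its tag names the currently open query.

    Each event is a (kind, query, value) tuple. For a "query" event the
    value is None; for an "input" event, query is the tag carried by the
    input. This format is an assumption made for illustration.
    """
    accepted = {}     # query name -> accepted input value
    discarded = []    # rejected input values
    current = None    # name of the currently open query
    for kind, query, value in events:
        if kind == "query":
            current = query
        elif query == current and query not in accepted:
            accepted[query] = value
        else:
            discarded.append(value)
    return accepted, discarded


# Same arrival order as before, but every input names its query.
tagged_events = [
    ("query", "A", None),
    ("input", "A", "a-1"),
    ("query", "B", None),
    ("input", "A", "a-2"),   # tagged for A, so it cannot be taken for B
    ("input", "B", "b-1"),
]
tagged_accepted, tagged_discarded = accept_tagged_input(tagged_events)
print(tagged_accepted)    # a-1 answers A, b-1 answers B
print(tagged_discarded)   # the late input a-2 is rejected
```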
However, this solution also falls short when the same dialog is used repeatedly. For example, suppose the system makes a query A for the first time (designated as A1), and two responses a1-1 and a1-2 are sent back. Response a1-1 is accepted as valid, but before response a1-2 arrives, the system repeats the same dialog, repeating query A (designated as A2). The user(s) reply with a response a2-1. However, the false response a1-2 arrives first and is accepted as valid, and the valid input a2-1 is discarded as invalid. FIG. 2 illustrates how “Accept Tagged Input” fails when the same dialog is repeated.
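The repeated-dialog failure can be shown in the same way. In the sketch below, inputs are tagged only with the query name "A", so the system has no means of telling a late answer to the first round from a genuine answer to the second. The function name, the event format, and the round labels "A1" and "A2" carried on query events are assumptions made for this illustration; the labels are used only to report which round an accepted input was credited to.

```python
def accept_tagged_input_repeated(events):
    """Name-tagged acceptance when the same query may be issued repeatedly.

    Each event is a (kind, name, detail) tuple. For a "query" event, detail
    is a round label used only for reporting; for an "input" event, detail
    is the input value and name is the query-name tag it carries.
    """
    accepted = []        # (round label, input value) pairs
    discarded = []       # rejected input values
    current = None       # name of the currently open query
    round_label = None   # reporting label of the current round
    answered = True      # whether the current round already has an input
    for kind, name, detail in events:
        if kind == "query":
            current, round_label, answered = name, detail, False
        elif name == current and not answered:
            accepted.append((round_label, detail))
            answered = True
        else:
            discarded.append(detail)
    return accepted, discarded


# Arrival order from the scenario: A1, a1-1, A2, a1-2, a2-1.
repeat_events = [
    ("query", "A", "A1"),
    ("input", "A", "a1-1"),
    ("query", "A", "A2"),
    ("input", "A", "a1-2"),   # late answer to round A1, tagged only "A"
    ("input", "A", "a2-1"),   # genuine answer to round A2
]
repeat_accepted, repeat_discarded = accept_tagged_input_repeated(repeat_events)
print(repeat_accepted)    # a1-2 is credited to round A2
print(repeat_discarded)   # the valid input a2-1 is thrown away
```

The tag matches the open query in both rounds, so the name alone does not identify which instance of the query an input belongs to.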