The present invention generally pertains to voice-activated command systems and speech recognition applications. More specifically, the present invention pertains to methods and apparatus for determining a position of a user barge-in in response to a query list from the speech recognition application.
Speech applications commonly ask users to step through a list of items. For example, in a voice-dialer or name-dialer application, the voice-dialing system typically uses an introductory message to greet a caller and to ask whom the caller would like to contact. The caller then speaks the name of the person he or she wishes to contact, and the voice-dialing system uses a speech recognition technique to identify or recognize the names of one or more potential call recipients, which hopefully include the caller's intended call recipient. In some voice-dialing systems, the voice-dialing application then asks the caller to pick the correct name from the speech recognition engine's suggested N-best alternatives, or to select the correct recipient in the case of name collisions (names with identical spellings or names which are homonyms). A usability study strongly indicates that most callers prefer to barge in with a “Yes” after they hear the correct item.
One problem experienced by speech applications which rely on user barge-ins to select one of a list of choices is that it is difficult to determine the location of the user barge-in in many instances. For instance, consider the following example exchange between a voice-command system and a user:
        System: Please select one from the following five people.
        System: Number one, Jeffrey Olson
        System: Number two, Jeffrey Ollason
        System: . . .
        User: Yes (Barged-in)
In this example, the user might think he/she barged-in on the second item. However, at that time the system had started the third prompt (even though the caller hadn't heard the phrase “Number three” yet). As a result, the intention of the caller may have been misrecognized.
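The ambiguity in the exchange above can be illustrated with a minimal Python sketch. It is not drawn from any particular speech platform: the prompt start times, the names PROMPT_STARTS and naive_item_at, and the barge-in timestamps are all hypothetical values chosen for illustration. The sketch shows only the naive approach of mapping a barge-in to whichever prompt was playing when the barge-in was detected, which is the behavior that leads to the misrecognition described above.

```python
from bisect import bisect_right

# Hypothetical prompt timeline: (start time in seconds, prompt label).
# These times are invented for illustration only.
PROMPT_STARTS = [
    (0.0, "intro"),
    (2.5, "item 1"),
    (4.5, "item 2"),
    (6.5, "item 3"),
]


def naive_item_at(barge_in_time):
    """Return the label of the prompt that was playing at barge_in_time."""
    starts = [start for start, _ in PROMPT_STARTS]
    # Index of the last prompt whose start time is <= barge_in_time.
    index = bisect_right(starts, barge_in_time) - 1
    return PROMPT_STARTS[max(index, 0)][1]


# The caller reacts to item 2, but the "Yes" is detected 0.1 s after
# the system has already started playing item 3.
print(naive_item_at(6.6))  # item 3

# Had the "Yes" been detected slightly earlier, item 2 would be chosen.
print(naive_item_at(6.4))  # item 2
```

Under these assumed timings, a difference of a fraction of a second in the detected barge-in time changes the selected item, even though the caller's intent is the same in both cases.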
The capability of robustly determining the location of barge-ins can help provide efficient and user-friendly voice user interfaces in such scenarios. However, most speech platforms either cannot provide a robust and accurate prompt bookmark, or do not provide this bookmark feature at all.
The present invention provides solutions to one or more of the above-described problems and/or provides other advantages over the prior art.