Speech recognition is at the forefront of computing technology because people often find speech to be a familiar and convenient way to communicate information. With computerized applications managing many aspects of daily activity, from word processing to controlling appliances, providing speech-recognition-based interfaces for such applications is a high research and development priority for many companies. Even web site operators and other content providers are deploying voice-driven interfaces that allow users to browse their content. These voice interfaces commonly include “grammars” that define the valid utterances (words, terms, phrases, etc.) that can occur at a given state within an application's execution. The grammars are fed to a speech recognition system and used to interpret the user's voice entry.
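By way of illustration only, the relationship between application states and grammars might be sketched as follows. The state names, the phrase lists, and the functions `normalize` and `interpret` are hypothetical and are not drawn from any particular system. A practical recognizer would typically use the grammar to constrain decoding of the audio itself, whereas this sketch merely checks an already transcribed utterance against the grammar of the current state.

```python
from typing import Dict, Optional, Set

# Hypothetical states and phrase lists, invented for illustration only.
# Each application state maps to the set of utterances valid in that state.
GRAMMARS: Dict[str, Set[str]] = {
    "ask_city_state": {"boston massachusetts", "austin texas"},
    "ask_listing": {"city hall", "public library"},
}


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def interpret(state: str, transcript: str) -> Optional[str]:
    """Return the normalized utterance if the grammar for `state` allows it.

    Returns None when the state is unknown or the utterance is not valid
    in that state.
    """
    candidate = normalize(transcript)
    if candidate in GRAMMARS.get(state, set()):
        return candidate
    return None
```

In this sketch, the same utterance may be accepted in one state and rejected in another, which is the sense in which a grammar is tied to a given state of the application's execution.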
Conventional voice response systems often use a rigidly structured series of questions to extract multiple pieces of information. For example, directory assistance applications typically ask for (and may confirm) the city and state for a listing before asking for the name of the listing requested. Such rigid structures require more interactions with the user. Where latency is significant (for example, communication with a speech recognition/search application over a cellular data network may have latencies of many seconds), these extra turns are undesirable.
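The cost of the extra turns can be made concrete with a rough calculation. The sketch below assumes, purely for illustration, that every turn incurs the same round-trip latency; the function name `total_latency_seconds`, the three-prompt dialog, and the figure of 4.0 seconds per turn are hypothetical and are not measurements.

```python
def total_latency_seconds(turns: int, latency_per_turn_s: float) -> float:
    """Total waiting time when each dialog turn incurs the same latency."""
    if turns < 0 or latency_per_turn_s < 0:
        raise ValueError("turns and latency must be non-negative")
    return turns * latency_per_turn_s


# Hypothetical rigid dialog modeled on the directory assistance example:
# ask for city and state, confirm them, then ask for the listing name.
RIGID_PROMPTS = ["city and state", "confirm city and state", "listing name"]

# Assumed latency per turn, in seconds (illustrative value only).
ASSUMED_LATENCY_S = 4.0

rigid_total = total_latency_seconds(len(RIGID_PROMPTS), ASSUMED_LATENCY_S)
```

Under these assumed figures, each additional turn adds a further 4.0 seconds of waiting, so the delay grows in direct proportion to the number of questions asked.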