1. Field of the Invention
The field of the invention is data processing, or, more specifically, methods, systems, and products for displaying speech command input state information in a multimodal browser.
2. Description of Related Art
User interaction with applications running on small devices through a keyboard or stylus has become increasingly limited and cumbersome as those devices have become smaller. In particular, small handheld devices like mobile phones and PDAs serve many functions and contain sufficient processing power to support user interaction through other modes, such as multimodal access. Multimodal devices that support such multimodal access combine multiple user input modes or channels in the same interaction, allowing a user to interact with applications on the device simultaneously through multiple input modes or channels. The methods of input include speech recognition, keyboard, touch screen, stylus, mouse, handwriting, and others. Multimodal input often makes using a small device easier.
A multimodal application is an application capable of receiving multimodal input and interacting with users through multimodal output. Such multimodal applications typically support multimodal interaction through hierarchical menus that may be speech driven. Such speech driven menus have a grammar that is subdivided to provide a limited grammar at each tier of the hierarchical menu. Such subdivided limited grammars are assigned to a particular tier in the hierarchical menu that corresponds to the menu choices presented to a user at that tier. A user may navigate each tier of the menu by invoking speech commands in the limited subdivided grammars of that tier that correspond to the menu choices before the user. Only the limited grammars corresponding to the user's current menu choices are typically enabled and therefore available as speech commands for the user. An application will not accept as a speech command an utterance that does not contain words in the currently enabled grammar.
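By way of illustration only, the tiered grammar arrangement described above may be sketched as follows. The sketch is not drawn from any particular multimodal application: the names MenuTier and accept_utterance are hypothetical, and matching the whole normalized utterance against the grammar is a simplifying assumption standing in for actual speech recognition.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class MenuTier:
    """One tier of a hierarchical, speech-driven menu (hypothetical structure)."""
    name: str
    # The limited grammar for this tier: each speech command maps to the
    # tier it leads to, or to None when the command is a final menu choice.
    grammar: Dict[str, Optional["MenuTier"]] = field(default_factory=dict)


def accept_utterance(tier: MenuTier, utterance: str) -> Tuple[bool, MenuTier]:
    """Return (accepted, tier_after_command).

    Only the grammar of the current tier is enabled, so an utterance that is
    not in that grammar is rejected and the user stays on the same tier.
    Assumption: the utterance is compared as a whole, case-insensitively.
    """
    command = utterance.strip().lower()
    if command not in tier.grammar:
        return False, tier
    next_tier = tier.grammar[command]
    # A final choice (None) leaves the user on the current tier.
    return True, next_tier if next_tier is not None else tier


# Example menu: a top tier with a nested "drinks" tier.
drinks = MenuTier("drinks", {"coffee": None, "tea": None})
main = MenuTier("main", {"drinks": drinks, "food": None})
```

In this sketch, the command "coffee" is rejected while the user is at the top tier, because only the grammar of the top tier is enabled there; the same command is accepted once the user has navigated to the "drinks" tier.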
In order to empower a user to properly use speech commands, information describing a currently enabled grammar should be communicated to the user so that the user may make an informed word choice in invoking speech commands. Conventional multimodal web pages convey information describing the currently enabled grammar by displaying through text or speech a text file that contains the actual words of the grammar. That is, such web pages simply display the actual words or phrases of the grammar. As multimodal devices become smaller, there is less and less screen space available for displaying the contents of the grammar.
It is important not only to communicate the contents of the grammar to the user, but also to communicate the input state of a particular speech command. The input state of the speech command describes the current status of a multimodal application with regard to the particular speech command. Input states of speech commands include, for example, ‘listening,’ indicating that the multimodal application is currently accepting from the user a particular kind of speech command; ‘inactive,’ indicating that the multimodal application is not currently accepting from the user a particular kind of speech command; ‘filled,’ indicating that the multimodal application has already accepted from the user a particular speech command; and others that will occur to those of skill in the art. It would be helpful to users of multimodal applications if there were a method of displaying speech command input state information to a user that adequately provided the user with information concerning the kind of words contained in an active grammar and adequately provided the user with information describing the input state of speech commands.
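For purposes of explanation, the three example input states named above may be represented as a simple enumeration. The sketch below is illustrative only: the names InputState and input_state are hypothetical, and the rule that an already accepted command is reported as ‘filled’ even while its grammar remains enabled is an assumption, not a requirement stated above.

```python
from enum import Enum
from typing import Set


class InputState(Enum):
    """The example input states of a speech command."""
    LISTENING = "listening"  # application is currently accepting this kind of command
    INACTIVE = "inactive"    # application is not currently accepting this kind of command
    FILLED = "filled"        # application has already accepted this command


def input_state(command: str,
                enabled_commands: Set[str],
                accepted_commands: Set[str]) -> InputState:
    """Derive the input state of one speech command.

    Assumption: 'filled' takes precedence over 'listening'.
    """
    if command in accepted_commands:
        return InputState.FILLED
    if command in enabled_commands:
        return InputState.LISTENING
    return InputState.INACTIVE
```

A display component could use such a value to choose an icon or label for each speech command, rather than listing the full contents of the grammar on a small screen.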