1. Field of the Invention
The present invention relates to the field of speech processing and, more particularly, to placing grammar-specific help, including available speech commands, within speech grammars.
2. Description of the Related Art
Multimodal interactions occur through a computing interface having multiple redundant interaction modes through which a user can interact. Typical modes for a multimodal interface include a graphical user interface (GUI) mode and a speech mode. Input can be received and output can be presented through either mode.
The speech mode can be particularly important when a multimodal application executes upon a computing device that has limited or inconvenient input/output peripherals attached. This is particularly true for mobile, embedded, and wearable computing devices.
For example, many smart phones include a touch screen GUI and a speech interface. The speech interface can receive spoken input that is automatically converted to text and placed in an application, such as an email application or a word processing application. This spoken input mechanism can be significantly easier for a user than attempting to input a textual message using a touch screen input mechanism included with the GUI mode of the device. Additionally, the device may be utilized in an environment where a relatively small screen (due to the mobile nature of a portable device) is difficult to read or in a situation where reading a display screen is overly distracting. In these situations, textual output can be converted into speech and audibly presented to a user.
One challenge with utilizing multimodal applications relates to permitted speech commands. Some commands are selectively available depending on a state of a multimodal application, while other speech commands are available independent of the application state. Some speech commands can be considered global commands for an application, other speech commands can be page-level commands dependent upon a displayed window or page of the application, and still other speech commands can be context-specific commands dependent upon an interface item currently possessing interface focus. Global commands can be relatively static, while the page-level commands and the context-specific commands can be dynamic. A multimodal application must provide help for all of these different types of commands.
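By way of illustration only, the three command scopes described above can be sketched as a lookup keyed on application state. The sketch below is not drawn from any particular multimodal application: the names AppState, GLOBAL_COMMANDS, PAGE_COMMANDS, CONTEXT_COMMANDS, and available_commands, as well as the sample pages, interface items, and commands, are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class AppState:
    """Hypothetical snapshot of a multimodal application's state."""
    page: str                           # currently displayed window or page
    focused_item: Optional[str] = None  # interface item holding focus, if any


# Hypothetical command tables; every name and command here is illustrative.
GLOBAL_COMMANDS: Set[str] = {"help", "go back", "exit"}
PAGE_COMMANDS: Dict[str, Set[str]] = {
    "inbox": {"compose", "refresh"},
    "compose": {"send", "discard"},
}
CONTEXT_COMMANDS: Dict[str, Set[str]] = {
    "recipient_field": {"add contact", "clear"},
}


def available_commands(state: AppState) -> Set[str]:
    """Return the speech commands permitted in the given state."""
    # Global commands are static and always available.
    commands = set(GLOBAL_COMMANDS)
    # Page-level commands depend on the displayed page.
    commands |= PAGE_COMMANDS.get(state.page, set())
    # Context-specific commands depend on the item possessing focus.
    if state.focused_item is not None:
        commands |= CONTEXT_COMMANDS.get(state.focused_item, set())
    return commands
```

In this sketch, any help listing the available commands would have to be recomputed whenever the page or the focused item changes, which is why the page-level and context-specific commands are characterized as dynamic.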
Current techniques for providing help for multimodal applications rely on traditional coding techniques in which help is uniquely constructed within code linked to the multimodal application and to events occurring within the multimodal application. For example, the help can be integrated within a general help file for the application. One problem with this approach is that the speech commands and help code are integrated at a relatively deep level of the application (since available speech commands can change depending upon application state). When code modifications are made to the application, the links to the help files must also be altered and tested. Additionally, when the speech grammar used to programmatically interpret the speech commands changes, corresponding changes must be made to the multimodal application and associated help files.
Problems with maintaining application/help/grammar synchronization are aggravated by emerging software development technologies, such as service oriented architecture (SOA) technologies, which componentize software functionality into discrete units having well defined interfaces. In a SOA, different groups and/or companies typically focus on providing code units that can be combined with code units independently developed by others. Instead of an atomically controlled development environment, a SOA encourages a distributed development environment that results in integrated software products built from a multitude of independently developed software building blocks. A SOA can have advantages of improved time to market, massive software reutilization, and a graceful upgrade progression. A SOA can also challenge traditional software design methodologies. For example, it can be difficult to integrate SOA software units with software having low-level code dependencies.
A new approach is needed to implement help within voice-enabled and multimodal applications. The new approach will ideally be capable of working with multimodal applications developed using any software technologies, including SOA-based technologies. Further, the approach should be easy to update and maintain as a speech-enabled application and/or speech grammars are updated. Moreover, an optimal approach would permit help files to be ported across different applications so that a single help technology can be utilized for both voice-enabled and multimodal applications developed for different platforms.