1. Field of the Invention
The present invention relates to the field of computing devices and, more particularly, to interactive applications for computing devices.
2. Description of the Related Art
A voice-mode interactive application is a type of modal application by which a user of a computing device can interact with the computing device through speech-based input and output. Accordingly, a voice-mode interactive application typically includes both a speech recognition component and a speech generation component. The speech recognition component allows the user to supply input to the computing device in the form of speech utterances. The speech generation component generates speech output in the form of pre-recorded voice playback and/or synthetic speech generated by a text-to-speech (TTS) device.
The voice-mode interactive application provides the grammar, sequence, context, and other parameters by which the user carries out an interactive dialog with the computing device. An interactive dialog is typically designed to accomplish a specific user-directed task or to perform a specific set of user-directed functions. These tasks and functions vary widely.
A voice-mode interactive application offers several advantages, not the least of which is that a user does not necessarily need a keyboard or other non-voice input device to accomplish a task using the computing device. Nonetheless, there are circumstances in which a mode of interaction other than voice mode is desirable. For example, when the computing device is located in a noisy environment, a user may prefer a visual mode to a voice mode.
Additionally, there are instances in which a user may prefer to use more than one mode. Applications that allow a user to interact with a computing device by supplying input and receiving output through a plurality of modalities are commonly referred to as multimodal applications. The different modalities that can be supported by a multimodal application include speech, audio, visual, graphical, textual, and other modalities. Multimodal applications, moreover, permit more than one modality to be active at any given time.
Relatively few efficient techniques currently exist for converting existing single-modality applications into multimodal ones. Relatedly, there are few efficient techniques for transforming single-modality applications customized for voice inputs and outputs into applications having a visual modality. It would therefore be advantageous to provide a way to create an alternate-mode application, whether based on a single modality or on multiple modalities, for carrying out a user-directed task for which only a voice-mode application initially exists.