1. Technical Field
This invention relates to the field of computer software systems and, more specifically, to a method for allowing a speech navigator to efficiently execute a plurality of functions based upon only a single spoken command.
2. Description of the Related Art
In recent years, various software systems have been developed to enable an application program executing on a computer to recognize and respond to voice commands. Such programs are advantageously designed as independent or stand-alone systems which provide voice recognition capabilities to existing commercially available target application programs. Thus, sophisticated voice recognition capability can be economically made available for a wide variety of commercial application software, without modifying the existing source code of such application software.
Voice recognition systems are designed to allow user data to be entered in a target application program by means of spoken words (e.g., dictation of a report in a word processing application program). In addition, some systems also enable such target application programs to respond to voice commands for controlling the software (e.g., opening and closing windows, choosing program options, and causing the application software to perform certain functions). Systems which allow voice control of a target application program are sometimes called voice navigators. Significantly, the design of an independently developed voice navigator system, which is capable of associating voice commands with equivalent keyboard or mouse actuated control functions for a wide variety of commercially available application programs, has been hindered by certain difficulties.
Conventional voice navigation programs are typically designed to dynamically analyze a window object. This analysis is generally performed in order to determine a command vocabulary set for controlling such objects and their associated macros. In order to perform this dynamic analysis, there are several features of every window in a target application that the speech navigator can probe to determine the attributes of a particular object. These features include the (1) window class name, (2) window text, and (3) window identification number. The window class name indicates the type of the object (e.g., "BUTTON", "LISTBOX", "EDIT BOX", or "SCROLLBAR"). The window text feature is specific text associated with a window which allows an application program user to understand the function or relevance of a particular window. Conventional navigators will determine how to use the window text based upon the class name. For example, if the class name is "BUTTON", the window text of the button would be the words which would normally appear on the face of the button. Accordingly, the navigator would use the window text to determine the spoken command which can be used to activate the button. In other words, by probing the target application program regarding the window text, the navigator can associate certain spoken text with a particular button or control. Examples of window text might include words such as "OK" or "CANCEL" in the case of a push-button, or a list of items in the case of a list box. Finally, the navigator may also probe the application program for the window identification number as a way to internally distinguish controls which may otherwise look similar. The window identification number uniquely distinguishes a child window from other child windows having the same parent window.
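By way of illustration only, the dynamic analysis described above might be sketched as follows. The sketch is not taken from any particular navigator. The `Window` record, its `items` field for list box entries, and the `build_vocabulary` function are hypothetical names introduced here, and a real navigator would obtain these attributes by probing the target application rather than from hand-built records.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    """Hypothetical record of the three features a navigator can probe."""
    class_name: str   # type of object, e.g. "BUTTON" or "LISTBOX"
    text: str         # window text shown to the user
    window_id: int    # distinguishes child windows of the same parent
    items: list = field(default_factory=list)  # assumed: list box entries


def build_vocabulary(children):
    """Map each spoken phrase to the ID of the child window it controls."""
    vocabulary = {}
    for child in children:
        if child.class_name == "BUTTON":
            # For a button, the window text is the label on its face.
            vocabulary[child.text.lower()] = child.window_id
        elif child.class_name == "LISTBOX":
            # For a list box, each listed item becomes a spoken phrase.
            for item in child.items:
                vocabulary[item.lower()] = child.window_id
        # Other class names are ignored in this simplified sketch.
    return vocabulary


dialog = [
    Window("BUTTON", "OK", 1),
    Window("BUTTON", "CANCEL", 2),
    Window("LISTBOX", "", 3, items=["Red", "Green"]),
]
vocabulary = build_vocabulary(dialog)
```

In this sketch the class name decides how the window text is used, and the window identification number is what the vocabulary ultimately points to, so two buttons with similar appearance remain distinguishable.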
In current voice recognition navigator systems, each voice command typically represents one user action. These actions could be a series of keystrokes, mouse click events, or other macro implementations. A macro is a single command phrase which causes a pre-recorded sequence of actions to take place. Typically, the pre-recorded sequence of actions consists of keystrokes or mouse click events. One problem with conventional dynamic analysis navigators is that their basic design does not easily permit macros associated with one user command to be combined with macros associated with other user commands. This is because, in the case of dynamic analysis navigators, a vocabulary set and its associated macros are available to a user only when the particular screen object associated with such vocabulary and macros is the foreground object. As a result, either more complex, multi-step macros must be provided for each window which can be acted upon, or a user must articulate multiple commands under circumstances where one command could otherwise be used. If additional multi-step macros are provided for each window object, the amount of memory required to store the navigator program undesirably increases. Further, providing a large number of complex macros for each screen object to be controlled by a navigator program causes the program to be more complex, more expensive to develop, and more prone to errors. Alternatively, if a voice navigator is capable of responding only to single-action commands, the voice navigation process may become time-consuming and tedious.
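The limitation described above can be illustrated with a minimal sketch, assuming a simplified model in which each screen object carries its own macros and only the foreground object's vocabulary is active. The `ScreenObject` record, the `run_command` function, and the action strings are hypothetical and are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ScreenObject:
    """Hypothetical screen object with its own vocabulary and macros."""
    name: str
    macros: dict  # spoken phrase -> pre-recorded list of actions


def run_command(phrase, foreground, log):
    """Replay the macro for `phrase` only if the foreground object has it."""
    actions = foreground.macros.get(phrase)
    if actions is None:
        return False  # phrase is outside the active vocabulary
    log.extend(actions)  # stand-in for replaying keystrokes or clicks
    return True


editor = ScreenObject("editor", {"save": ["key:ctrl+s"]})
save_dialog = ScreenObject("save dialog", {"ok": ["click:OK"]})

log = []
first = run_command("save", editor, log)     # editor is the foreground object
early = run_command("ok", editor, log)       # rejected: dialog not foreground
second = run_command("ok", save_dialog, log) # dialog is now the foreground
```

Under these assumptions the user must speak two commands, "save" and then "ok", because the macro for "ok" is unavailable until the dialog becomes the foreground object. The only way to avoid the second command in this model is to store a longer, multi-step macro on the editor itself.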
Accordingly, it would be desirable to allow a sequence of macros to be executed based on a single voice command. It would further be desirable to minimize the amount of memory required to store a navigator program containing macros, and render the development of such programs more efficient. Finally, it would be desirable to minimize the number of commands which must be articulated by a user in order to perform certain actions with a voice navigator.