1. Technical Field
The present invention relates in general to the navigation of a pointer device, such as a mouse pointer or a pointer tablet, in a graphical user interface of an application program by speech controlled input commands. More particularly, the present invention relates to a method and system for speech controlled navigation of such a pointer.
2. Description of the Related Art
The increasing demand for simplicity of operation of a computer has substantially determined the development of graphical user interfaces (GUIs) for the operation of application programs in a computer operating system or the operating systems themselves. A further improvement of operability has been accomplished by the development of powerful speech recognition systems which provide support for technologically unskilled or physically handicapped users of a computer. In such systems, the spoken commands are recognized by a recognition device and converted into the respective control commands. There are a number of speech recognition systems available, such as the VOICE TYPE DICTATION SYSTEM (VTD) by IBM. This product has been the subject matter of a number of patents, for instance U.S. Pat. No. 4,718,094 entitled "Speech Recognition System" assigned to the present assignee, which can be referred to for more detailed aspects of speech recognition. VTD consists of speaker dependent and speaker independent software modules and particularly provides a macro editor. The main program is a speech recognition engine which can serve only one application program at a given time. The currently active application window is marked by VTD by means of a colored frame. From a window called "speech control" the user can view all commands which are currently available.
A first particular feature of VTD is an application programming interface (API) which allows pre-developed application programs to interact with the speech recognition engine of VTD and thus enables integration of speech input, thereby eliminating the need for development of a new speech-enabled application program. A second feature of VTD is the above mentioned macro editor which enables a user to create their own macros, each of which allows execution of a number of actions (commands) by uttering only one macronym.
By use of VTD, uttered words can be interpreted and executed as commands (so-called "command mode"), or be recognized as words of a text document (so-called "dictation mode"). Depending on the underlying computer hardware, the dictation system can perform the recognition process nearly in real time, i.e. the recognized words are displayed on the screen with only a short delay.
A typical range of applications for speech recognition systems would include the acoustic input of text documents on the one hand and the speech-supported control of application programs of a computer system on the other. A particular requirement of speech controlled operation of an application program is the capability of unrestricted positioning of a pointer in its viewport presented on the GUI. In the case of acoustic input of text documents there exist specific dictation applications which either store the input text as a file, or transfer the input data, via an intermediate buffer storage of the graphical user interface, to another application such as a word processing system, in order to post-process the respective file.
For speech controlled processing of an application, in general, so-called macro interfaces are provided which enable adaptation of existing application programs to a speech recognition device without requiring amendments of the application programs themselves. By means of the previously mentioned macro interface, for instance, the program commands or the respective program menu items of the application program are associated with word sequences spoken by the user. For operation of a window system of an application program, e.g. for opening a program file or starting a standard editor, macros are generally predefined.
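The association described above can be illustrated with a minimal sketch. It is not taken from VTD or from any actual macro interface; the table layout, the command strings such as "menu:File", and the names build_macro_table and dispatch are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical macro table: a spoken word sequence maps to the list of
# application commands (or menu items) that it triggers.
MacroTable = Dict[Tuple[str, ...], List[str]]


def build_macro_table() -> MacroTable:
    """Return an illustrative table; the command strings are assumptions."""
    return {
        ("open", "file"): ["menu:File", "item:Open"],
        ("minimize",): ["window:minimize"],
        ("maximize",): ["window:maximize"],
        ("close",): ["window:close"],
    }


def dispatch(words: List[str], table: MacroTable,
             send: Callable[[str], None]) -> bool:
    """Look up the recognized word sequence and send each associated command.

    Returns True if a macro matched and False otherwise. The application
    program itself is not modified; it only receives the commands.
    """
    key = tuple(word.lower() for word in words)
    commands = table.get(key)
    if commands is None:
        return False
    for command in commands:
        send(command)
    return True
```

In this sketch the send callback stands in for whatever mechanism delivers a command to the unmodified application program. A word sequence with no entry in the table is simply not executed.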
However, for adaptation of existing application programs to speech controlled operation by means of the above cited macro interface, several restrictions are imposed. The best use of such an interface is the operation of program commands and/or program menu items by an association of word sequences. In such a system, the execution of common window commands such as "minimize", "maximize" and "close" is practicable, and is supported by predefined macros in most speech recognition systems.
Substitution of a pointer based program operation by speech input commands still remains a troublesome undertaking. This is particularly true for positioning a pointer at a certain point of an application window, as would be performed by a movement of a mouse pointer. This kind of operation is a necessary requirement for the operation of a number of application programs such as graphical editors, annotation tools, and table calculation programs.
One known solution to this problem is provided by application programs which allow positioning of a mouse pointer via the arrow keys of a keyboard. By assigning speech commands to those arrow keys, the mouse pointer may be moved. This approach is rather restrictive in that mouse pointer velocity control in this kind of program is difficult to use, and further in that it lacks a user-friendly interface.
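The arrow-key approach can be sketched as follows, purely to illustrate why it is cumbersome. The step size of 10 pixels, the command words, and the function names move_pointer and commands_needed are assumptions made for this example and do not describe any particular program.

```python
from typing import Dict, Tuple

STEP = 10  # assumed number of pixels moved per arrow key press

# Hypothetical mapping of spoken commands to arrow key directions.
ARROW_DELTAS: Dict[str, Tuple[int, int]] = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def move_pointer(pos: Tuple[int, int], command: str,
                 width: int, height: int, step: int = STEP) -> Tuple[int, int]:
    """Move the pointer one step in the spoken direction, clamped to the window."""
    dx, dy = ARROW_DELTAS[command.lower()]
    x = min(max(pos[0] + dx * step, 0), width - 1)
    y = min(max(pos[1] + dy * step, 0), height - 1)
    return (x, y)


def commands_needed(start: Tuple[int, int], target: Tuple[int, int],
                    step: int = STEP) -> int:
    """Count the spoken commands needed to reach or pass the target on each axis."""
    horizontal = -(-abs(target[0] - start[0]) // step)  # ceiling division
    vertical = -(-abs(target[1] - start[1]) // step)
    return horizontal + vertical
```

Under these assumptions, moving the pointer from (0, 0) to (640, 480) takes 64 horizontal and 48 vertical commands, 112 utterances in total. A larger step reduces the count but makes fine positioning impossible, which is one way to picture the velocity control difficulty noted above.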
Another known solution is the combined use of speech input and mouse navigation for operation of an application program. This approach, of course, is useful for users who are not familiar with mouse operations, but it is not applicable for handicapped users for whom speech input is the only available method of information input to a computer system.
Therefore, the underlying problem the present invention addresses is the need for a navigation method and system for a pointer in a graphical user interface which is controlled solely by speech command input.