With the ever-increasing use of computers by people from all walks of life, developers of computer systems have had to design computer interfaces which interact with the user in an intuitive and effective manner. In an attempt to communicate effectively with all types of external users, designers of many computer systems have implemented user interfaces which respond to voice commands. For examples of such interfaces, refer to IBM Voice Type Dictation 1.1 for OS/2 and IBM Voice Type Dictation 3.0 for Windows '95.
As is known in the data processing art, a speech recognition apparatus is required to implement user interfaces which respond to voice commands. A speech recognition apparatus generally includes an acoustic processor and stores a set of acoustic models. The acoustic processor measures sound features of an utterance by an external user, and each acoustic model represents the acoustic features of an utterance of the one or more words associated with that model. The sound features of the utterance are then compared to each acoustic model to produce a match score. The match score for an utterance and an acoustic model is an estimate of the closeness of the sound features of the actual utterance to the acoustic model.
The word or words associated with the acoustic model having the best match score may then be selected as the recognition result. Alternatively, the acoustic match score may be combined with other match scores, such as additional acoustic match scores and language model match scores. The word or words associated with the acoustic model or models having the best combined match score may then be selected as the recognition result.
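The selection step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each acoustic model is a single feature vector, that the match score is the negative Euclidean distance between the utterance and the model, and that a language model score is combined by weighted addition. The names `acoustic_match_score`, `recognize`, `lm_weight`, and the toy data are hypothetical and are not taken from any product referenced herein.

```python
import math


def acoustic_match_score(features, model):
    """Estimate closeness of an utterance to a model (higher is better).

    Assumption: closeness is the negative Euclidean distance between
    two equal-length feature vectors.
    """
    return -math.dist(features, model)


def recognize(features, acoustic_models, language_scores=None, lm_weight=1.0):
    """Return (word, score) for the model with the best match score.

    If language_scores is given, each acoustic score is combined with the
    word's language model score before the best one is selected.
    """
    best_word, best_score = None, -math.inf
    for word, model in acoustic_models.items():
        score = acoustic_match_score(features, model)
        if language_scores is not None:
            # Hypothetical combination rule: weighted sum of the two scores.
            score += lm_weight * language_scores.get(word, 0.0)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Toy acoustic models and a measured utterance (illustrative values only).
ACOUSTIC_MODELS = {"open": [1.0, 0.0], "close": [0.0, 1.0]}
utterance = [0.9, 0.2]

word, score = recognize(utterance, ACOUSTIC_MODELS)
```

With acoustic scores alone, the utterance lies closest to the model for "open". Supplying a language model score that strongly favors "close" can change the combined result, which is the effect of combining match scores.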
For command and control applications, the speech recognition apparatus typically recognizes an uttered command, and the computer system subsequently executes a command to perform a function associated with the recognized command. For this purpose, the command associated with the acoustic model having the best match score is selected as the recognition result. Problems associated with inadvertent sounds such as coughs, sighs, or spoken words not intended for recognition have been alleviated by the invention disclosed within U.S. Pat. No. 5,465,317, which is hereby incorporated by reference herein.
As problems associated with recognizing spoken commands in a computer system have been reduced, or even eliminated, computer systems have taken advantage of this new and easily implemented interface to perform certain functions within the computer system. For example, International Business Machines Corporation has developed a VoiceType® V product which effectively allows a user to speak commands and have a computer system respond in a limited manner. Specifically, a user may specify a list of programs or applications to be accessed using voice commands. A speech recognition unit, such as those previously described, is then used to detect when an external user utters one of the specified phrases. Upon determining that an uttered word corresponds to an entry on this previously programmed, static window list, the computer system implementing this interface accesses the associated program. Thus, such systems effectively implement a user interface in which program control may be specified by a voice command when the program has previously been stored in a specified list by the user.
While the aforementioned voice command interface works well, a user is limited in the manner in which voice commands may be issued. For example, because the program list is programmed in advance and static, the user is able to access by voice command only those programs previously stored in the list. To access additional programs using a voice command, the user is required to update the list of programs which are accessible by a voice command.
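The limitation of a static program list can be sketched as a simple table lookup. This is an assumed illustration only: the functions `build_program_list` and `dispatch`, and the phrases and program names, are hypothetical and do not describe the internals of any referenced product.

```python
def build_program_list(entries):
    """Build the user-specified list mapping spoken phrases to programs."""
    return {phrase.strip().lower(): program for phrase, program in entries}


def dispatch(recognized_phrase, program_list):
    """Return the program for a recognized phrase, or None if not listed."""
    return program_list.get(recognized_phrase.strip().lower())


# The user programs the list in advance (illustrative entries).
program_list = build_program_list([
    ("Editor", "editor.exe"),
    ("Calendar", "calendar.exe"),
])

listed = dispatch("editor", program_list)         # found on the list
unlisted = dispatch("spreadsheet", program_list)  # None: never listed

# Only after the user updates the list does the new phrase become usable.
program_list["spreadsheet"] = "spreadsheet.exe"
now_listed = dispatch("spreadsheet", program_list)
```

In this sketch, a correctly recognized phrase still produces no action unless the user has previously stored a corresponding entry.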
Furthermore, current implementations of voice command interfaces for computer systems require an external user to speak a full title precisely and accurately to access a computer application corresponding to that title. In situations in which different views of an application are provided or different components of the application are to be accessed, a user may encounter difficulty as the voice command interface becomes less intuitive and, therefore, a less effective interface for the user. For example, when a user desires to view a desktop, the user must specify whether it is Desktop-Icon View, Desktop Tree View, or Desktop-Properties, among others. In this situation, the user is required to speak the full title to access an application using a voice command. Therefore, if a user simply spoke "a desktop," the speech recognition unit of the computer system would not find a match because the full label had not been spoken. In this case, the user is limited by the computer system's requirement that stilted, lengthy language be input to be recognized by the voice command interface.
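The full-title requirement can be illustrated by a sketch of exact matching against the titles named above. The normalization rule (lowercasing and treating hyphens as spaces) and the names `normalize` and `match_full_title` are assumptions made for illustration, not a description of any existing implementation.

```python
import re

# Titles taken from the example above.
WINDOW_TITLES = ["Desktop-Icon View", "Desktop Tree View", "Desktop-Properties"]


def normalize(text):
    """Lowercase, treat hyphens as spaces, and collapse repeated whitespace."""
    return re.sub(r"\s+", " ", text.replace("-", " ")).strip().lower()


def match_full_title(spoken, titles):
    """Return the title whose full text equals the spoken phrase, else None."""
    spoken_norm = normalize(spoken)
    for title in titles:
        if normalize(title) == spoken_norm:
            return title
    return None


partial = match_full_title("a desktop", WINDOW_TITLES)       # no match
full = match_full_title("desktop icon view", WINDOW_TITLES)  # full title spoken
```

Under exact matching, the partial phrase yields no match even though every listed title begins with "Desktop"; only the complete title is accepted.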
Therefore, a need exists for a voice command interface which allows a user to communicate in a more natural and intuitive manner to perform desired actions on a data processing system.