As computer systems become more pervasive in society, their inability to communicate effectively with their users has also become more apparent. First, users must learn archaic commands or non-intuitive procedures in order to accomplish their desired tasks. Second, users are constrained to conventional input devices such as mice or keyboards to enter these commands. With advances in speech processing and related technologies, one proposed solution to these inefficiencies is a speech or voice recognition system.
A speech recognition system can detect human speech, parse that speech, and generate a string of words, sounds or phonemes to represent it. The system can also translate the generated words, sounds or phonemes into corresponding machine commands and execute those commands.
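The translate-and-execute stage described above can be illustrated with a minimal sketch. It assumes the acoustic front end has already produced recognized text, and every name in it (parse_utterance, COMMANDS, execute, and the two handler functions) is hypothetical and chosen only for illustration; it does not represent any particular product's interface.

```python
from typing import Callable, Dict, List, Tuple


def parse_utterance(utterance: str) -> List[str]:
    """Split recognized text into lowercase word tokens, dropping punctuation."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in utterance)
    return cleaned.lower().split()


# Hypothetical handlers standing in for real machine commands.
def open_file_dialog() -> str:
    return "file-open dialog shown"


def open_document() -> str:
    return "document opened"


# Hypothetical table mapping a recognized word sequence to a command.
COMMANDS: Dict[Tuple[str, ...], Callable[[], str]] = {
    ("file", "command", "open"): open_file_dialog,
    ("open",): open_document,
}


def execute(utterance: str) -> str:
    """Translate the recognized words into a command and run it."""
    words = tuple(parse_utterance(utterance))
    handler = COMMANDS.get(words)
    if handler is None:
        return "unrecognized command"
    return handler()
```

In this sketch, a call such as execute("File Command, Open") looks up the word sequence in the table and runs the matching handler; an utterance with no table entry is reported as unrecognized rather than guessed at.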
Some speech recognition systems are available in the marketplace. For example, IBM's ViaVoice Gold, a desktop application available on a computer system, allows a user to speak into the computer system's input device to activate certain command-and-control menus/windows. Specifically, if the user is using a word processing program, the user can say, "File Command, Open", and the File Open window pops up. The user can then select an existing file from the File Open window with the computer system's mouse and say "Open" into the computer system's input device. In response to the "Open" command, the desired document appears.
Products such as ViaVoice Gold possess certain drawbacks: 1) These products often require a user's manual interaction with the system. In the example given above, before the user can issue the voice command, "Open", the user must first manually select the file with a mouse. 2) These products only support a single local user. Specifically, the user needs to be in front of or in the vicinity of the computer system in order to speak into its input device. Additionally, these products are not capable of handling multiple speech applications. They are often designed to receive and process voice commands from one speech source. 3) These products are not designed to work with other vendors' products. As a result, if a user desires dictation functionality in one product and device control functionality in another, unless both products are manufactured by the same vendor, the user most likely will fail to obtain both functionalities in a seamless fashion.
As has been demonstrated, an improved method and apparatus is needed to manage multiple speech applications.