The present invention relates to data processing systems, and more particularly, to enabling existing application programs for operation in speech recognition environments.
Computer users have always yelled at their machines, but now computers are beginning to listen. The tap-tap-tap of the electronic workplace is being joined by a cacophony of conversation. Users can tell their computers to open and close files or perform other tasks by speaking a few words. Telephone callers can tell their carriers' computer systems to make a collect call or dial a business associate or supplier.
Driving this move to listening computers are a one thousand percent increase in microprocessor power, an accompanying price drop, and a new generation of voice-recognition devices. Another force bringing the power of voice to the desktop is the need to find an alternative input device to the keyboard and mouse. A logical replacement is a voice interface, which allows a user to use a device available since birth. Speech recognition technology is available to the desktop user through the simple installation of a program and a microphone.
Typical prior art speech recognition operations occur in a single-user, speaker-dependent environment. This requires each speaker to train the speech recognizer with the user's voice patterns during a process called "enrollment". The system then maintains a profile for each speaker, who must identify themselves to the system in future recognition sessions. Typically speakers enroll via a local microphone in a low-noise environment, speaking to the single machine on which the recognizer is resident. During the course of enrollment, the speaker is required to read a lengthy set of transcripts, so that the system can adjust itself to the peculiarities of each particular speaker. These systems require speakers to form each word in a halting and unnatural manner, pausing, between, each, and, every, word. This allows the speech recognizer to identify the voice pattern associated with each individual word, using the preceding and following silences to bound the words. The speech recognizer will typically have a single application for which it is trained.
More recently, a major advance occurred with the advent of speaker-independent recognition systems that are capable of recognizing words from a continuous stream of conversational speech. Such a system requires no individualized speaker enrollment for effective use, unlike some speaker-dependent systems, which require speakers to be re-enrolled every four to six weeks or require users to carry a personalized plug-in cartridge to be understood by the system. With continuous speech recognition, no pauses between words are required, thus providing a more user-friendly approach for the casual user of a speech recognition system. The growing familiarity with and acceptance of speech has led to more demand for speech-aware applications. While applications have started to be designed for speech input, a large number of application programs were written before this user-friendly approach was available.
Consequently, it would be desirable to automatically enable existing application programs for operation in speech recognition environments without changing existing source code or recompiling the application programs.
This invention relates to a method and apparatus for enabling existing application programs for operation in speech recognition environments. Existing application programs that were written with a dynamically linked library or object library, and that have no speech recognition capability, can accept input from a speech recognition device without modification. This is accomplished by supplying an alternate dynamic library or object library that supports the same interface or objects used by the original program. The alternate library is written so that it is aware of, and receives input from, the speech recognition system. The alternate library then passes the input to the application program using the existing interfaces. The application program is unaware that the input comes from the speech recognition system instead of standard input devices such as keyboards or mice.
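The library substitution described above can be illustrated with a minimal sketch. It is not the implementation of the invention: every name in it (`KeyboardInputLibrary`, `SpeechInputLibrary`, `FakeRecognizer`, `read_line`, `run_application`) is hypothetical, and the speech recognizer is simulated by a queue of already-recognized phrases. The sketch only shows the shape of the idea, namely that an alternate library exposing the same interface can feed recognized speech to application code that is left unchanged.

```python
from collections import deque


class KeyboardInputLibrary:
    """Hypothetical stand-in for the original library: returns typed lines."""

    def __init__(self, typed_lines):
        self._lines = deque(typed_lines)

    def read_line(self):
        # Return the next typed line, or None when input is exhausted.
        return self._lines.popleft() if self._lines else None


class FakeRecognizer:
    """Hypothetical recognizer, simulated by a queue of recognized phrases."""

    def __init__(self, phrases):
        self._phrases = deque(phrases)

    def next_phrase(self):
        return self._phrases.popleft() if self._phrases else None


class SpeechInputLibrary:
    """Alternate library: same read_line() interface, fed by the recognizer."""

    def __init__(self, recognizer):
        self._recognizer = recognizer

    def read_line(self):
        # Pass recognized speech to the caller through the existing interface.
        return self._recognizer.next_phrase()


def run_application(input_library):
    """The unmodified application: it only knows the read_line() interface."""
    commands = []
    while True:
        line = input_library.read_line()
        if line is None:
            break
        commands.append(line.strip().upper())
    return commands


# The same application code runs against either library.
typed = run_application(KeyboardInputLibrary(["open file", "close file"]))
spoken = run_application(
    SpeechInputLibrary(FakeRecognizer(["open file", "close file"]))
)
print(typed)
print(spoken)
```

In this sketch the substitution happens by passing a different object to `run_application`. With a real dynamically linked library, the analogous step would be arranging for the alternate library to be loaded in place of the original, so that neither source changes nor recompilation of the application are needed.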