The invention relates generally to speech recognition, and more specifically, to the selection of one of multiple speech recognition engines based on a media type.
With the growth of speech recognition engine capabilities, there is a corresponding increase in the number of applications and uses for speech recognition. Different types of speech recognition applications and systems have been developed, based upon the location of the speech recognition engine with respect to the user. One such example is an embedded speech recognition engine, otherwise known as a local speech recognition engine, such as the SpeechToGo speech recognition engine, available from SpeechWorks International, Inc., 695 Atlantic Avenue, Boston, Mass. 02111. Another type of speech recognition engine is a network-based speech recognition engine, such as SpeechWorks 6, as sold by SpeechWorks International, Inc., 695 Atlantic Avenue, Boston, Mass. 02111.
Embedded or local speech recognition engines provide the added benefit of speed in recognizing a speech input, wherein a speech input includes any type of audible or audio-based input. A drawback of embedded or local speech recognition engines is that these engines typically contain a limited vocabulary. Due to memory limitations and system processing requirements, in conjunction with power consumption limitations, embedded or local speech recognition engines can recognize only a fraction of the audio inputs recognizable by a network-based speech recognition engine.
Network-based speech recognition engines provide the added benefit of an increased vocabulary, based on the elimination of memory and processing restrictions. A downside, however, is the added latency between when a user provides a speech input and when the speech input is recognized and provided back to the user for confirmation of recognition. In a typical speech recognition system, the user provides the audio input and the audio input is thereupon provided to a server across a communication path, whereupon it may then be recognized.
A problem arises when multiple speech recognition engines are available for recognizing the speech input. While each speech recognition engine provides advantages and disadvantages, it is more efficient to be able to select one of the particular speech recognition engines. It is currently possible to choose between multiple speech recognition engines using a variety of factors, such as a user-based selection. Another selection may be made by the recognition of a particular term, which thereupon indicates that a secondary type of specific entry may be inputted. For example, if the initial speech input is the word “dial”, a second speech recognition engine may be selected based on its ability to recognize specific names or telephone book entries.
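By way of illustration only, the term-based selection described above might be sketched as follows. The engine identifiers, the trigger table, and the function name are hypothetical and do not come from any particular product; the sketch assumes the previously recognized term is available as text.

```python
from typing import Optional

# Hypothetical engine identifiers, used only for this illustration.
DEFAULT_ENGINE = "embedded"

# Hypothetical table mapping a recognized trigger term to the engine
# suited to the kind of entry expected next (e.g. names after "dial").
TRIGGER_ENGINES = {"dial": "name_directory"}


def select_engine(previous_term: Optional[str]) -> str:
    """Return the engine to use for the next speech input."""
    if previous_term is None:
        # No prior term has been recognized, so use the default engine.
        return DEFAULT_ENGINE
    # Normalize the term, then fall back to the default if it is not a trigger.
    key = previous_term.strip().lower()
    return TRIGGER_ENGINES.get(key, DEFAULT_ENGINE)
```

In this sketch, a recognized term of "dial" routes the following input to the hypothetical name-directory engine, while any other term leaves the default engine in place.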