The invention relates generally to the field of speech recognition and in particular to the recognition of speech elements in continuous speech.
The need for speech recognition equipment which is reliable and reasonably priced is well documented in the technical literature. Speech recognition equipment generally falls into two main categories. One category is speaker independent equipment, wherein the speech recognition apparatus is designed to recognize elements of speech from any person. However, speaker independent systems can be quite limited with regard to features other than the "speaker independence", for example, the number of words in the recognition vocabulary. Also, typically, five to ten percent of the population will not be recognized by such systems. The other category, speaker dependent speech recognition, relates to speech recognition apparatus which are substantially trained to recognize speech elements of a limited class, and in particular the class consisting of one person. Within each category, the speech recognition apparatus can be directed to the recognition of either continuous speech, that is, speech wherein the boundaries of the speech elements are not defined, or isolated speech, that is, speech in which the boundaries of the speech elements are known a priori. An important difference between continuous and isolated speech recognition is that in continuous speech, the equipment must make complex "decisions" regarding the beginnings and ends of the speech elements being received. For isolated speech, as noted above, the incoming audio signal is isolated or bounded by either a given protocol or other external means, which makes the boundary decisions relatively simple.
There exist today many commercial systems for recognizing speech. These systems operate in either a speaker independent environment (as exemplified by U.S. Pat. Nos. 4,038,503; 4,227,176; 4,228,498; and 4,241,329, assigned to the assignee of this invention) or a speaker dependent environment. In addition, the commercially available equipment variously operates in either an isolated speech or a continuous speech environment.
The commercially available equipment, however, is expensive when high recognition performance is required. This is often a result of the best equipment being built for the most difficult problem, that is, speaker independent, continuous speech recognition. Consequently, many of the otherwise available applications to which speech recognition equipment could be adapted have not been considered because of the price/performance relationship of the equipment. Furthermore, the commercially available equipment cannot easily be expanded to provide added capability at a later date, and/or does not have the required accuracy or speed when operating outside of the laboratory environment.
Primary objects of the present invention are therefore an accurate, reliable, reasonably priced, continuous speech recognition method and apparatus which can operate outside of the laboratory environment and which enable the user to quickly and easily establish an operating relationship therewith. Other objects of the invention are a method and apparatus generally directed to speaker dependent, continuous speech recognition, and which have a low false alarm rate, high structural uniformity, easy training to a speaker, and real time operation.