There has long been a desire to have machines capable of responding to human speech, such as machines capable of obeying human commands and machines capable of transcribing human dictation. Such machines would greatly increase the speed and ease with which people communicate with computers and with which they record and organize their words and thoughts.
Due to recent advances in computer technology and speech recognition algorithms, speech recognition machines have begun to appear in the past several decades, and have become increasingly powerful and less expensive. For example, the assignee of the present application has publicly demonstrated speech recognition software which runs on popular personal computers and which requires little extra hardware. This system is capable of providing speaker-dependent, discrete-word recognition for vocabularies of up to two thousand words at any one time, and many of its features are described in U.S. patent application Ser. No. 797,249, entitled "Speech Recognition Apparatus and Method", which is assigned to the assignee of the present application, and which is incorporated herein by reference.
Advances of the type currently being made by the assignee of the present application and other leaders in the speech recognition field will make it possible to bring large vocabulary speech recognition systems to the market. Such systems will be able to recognize a large majority of the words which are used in normal dictation, and thus they will be well suited for the automatic transcription of such dictation. But to achieve their full usefulness, such voice recognition dictation machines should be able to perform editing functions as well as the straightforward entry of text, and they should be able to perform such functions in a way that is easy and natural for the user.
Voice recognition has been used as a way of controlling computer programs in the past. But current voice recognition systems are usually far from foolproof, and the likelihood of their misrecognizing a word tends to increase with the size of the vocabulary against which utterances to be recognized are compared. For this reason, and to reduce the amount of computation required for recognition, many speech recognition systems operate with precompiled artificial grammars. Such an artificial grammar associates a separate sub-vocabulary with each of a plurality of grammar states, provides rules for determining which grammar state the system is currently in, and allows only words from the sub-vocabulary associated with the current grammar state to be recognized.
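The operation of such an artificial grammar can be illustrated with a minimal sketch. It is not taken from any particular recognition system: the state names, the words, the transition rules, and the scored candidate list standing in for the output of an acoustic matcher are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class GrammarState:
    # Sub-vocabulary that may be recognized while this state is current.
    vocabulary: Set[str]
    # Rule for choosing the next state: recognized word -> next state name.
    transitions: Dict[str, str] = field(default_factory=dict)


class ArtificialGrammar:
    def __init__(self, states: Dict[str, GrammarState], start: str) -> None:
        self.states = states
        self.current = start

    def recognize(self, candidates: List[Tuple[str, float]]) -> Optional[str]:
        """Return the best-scoring candidate allowed in the current state.

        `candidates` is a hypothetical list of (word, score) pairs, with a
        higher score meaning a closer acoustic match. Words outside the
        current sub-vocabulary are ignored, however well they score.
        """
        state = self.states[self.current]
        allowed = [(w, s) for w, s in candidates if w in state.vocabulary]
        if not allowed:
            return None  # nothing in the active sub-vocabulary matched
        word = max(allowed, key=lambda pair: pair[1])[0]
        # Apply the transition rule; stay in place if none is defined.
        self.current = state.transitions.get(word, self.current)
        return word


def make_grammar() -> ArtificialGrammar:
    # Hypothetical two-state command grammar: a verb, then an object.
    states = {
        "verb": GrammarState({"open", "delete"},
                             {"open": "object", "delete": "object"}),
        "object": GrammarState({"file", "folder"},
                               {"file": "verb", "folder": "verb"}),
    }
    return ArtificialGrammar(states, start="verb")
```

In this sketch, a candidate such as "file" is rejected while the grammar is in the "verb" state even if it scores best acoustically, because only the two words of that state's sub-vocabulary are compared. This is the mechanism by which a small active vocabulary reduces both computation and the opportunity for misrecognition.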
Such precompiled artificial grammars are not suitable for normal dictation, because they do not allow users the freedom of word choice required for normal dictation. But such artificial grammars can be used for commanding many computer programs which only allow the user to enter a limited number of previously known commands at any one time. There are, however, many computer commands for which such precompiled artificial grammars are not applicable, because those commands allow the user to enter words which are not limited to a small predefined vocabulary. For example, computer systems commonly refer to, or perform functions on, data contained in changeable data structures of various types, such as text files, database files, file directories, tables of data in memory, or menus of choices currently available to a user. Artificial grammars are often insufficient for computer commands which name an element contained in such a data structure, because the vocabulary required to name the elements in such data structures is often not known in advance.
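The limitation can be shown with a short sketch. The precompiled vocabulary and the element names below are hypothetical; the element names stand for entries in a changeable data structure, such as a file directory, whose contents were not known when the grammar was compiled.

```python
# Hypothetical sub-vocabulary fixed when the grammar was compiled.
PRECOMPILED_VOCABULARY = frozenset({"open", "delete", "file"})


def uncovered_names(element_names, vocabulary=PRECOMPILED_VOCABULARY):
    """Return the element names that the fixed vocabulary cannot recognize."""
    return sorted(name for name in element_names if name not in vocabulary)
```

Given element names such as "budget" and "letter" that were created after compilation, the function reports both as uncovered, so a spoken command naming either element could not be recognized under the fixed grammar.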