This invention relates to a distributed processing apparatus and a method for transferring data among a plurality of processors within the apparatus. More particularly, the invention relates to the use of the apparatus and method in continuous speech recognition in real time using a vocabulary of substantial size.
Automatic speech recognition systems provide a means for man to interface with computers and other machines in a human's most natural and convenient mode of communication. Where required, this will enable operators of such computers and machines to enter data, request information and control systems when their hands and eyes are busy, when they are in the dark, or when they are unable to be stationary at a terminal. Also, machines using normal voice input require much less user training than do systems relying on complex keyboards, switches, push buttons and other mechanical devices.
One known approach to automatic speech recognition of isolated words involves the following: periodically sampling a bandpass filtered (BPF) audio speech input signal to create frames of data and then preprocessing the data to convert them to processed frames of parametric values which are more suitable for speech processing; storing a plurality of templates (each template is a plurality of previously created processed frames of parametric values representing a word, which when taken together form the reference vocabulary of the automatic speech recognizer); and comparing the processed frames of speech with the templates in accordance with a predetermined algorithm, such as the dynamic programming algorithm (DAP) described in an article by F. Itakura, entitled "Minimum prediction residual principle applied to speech recognition", IEEE Trans. Acoustics, Speech and Signal Processing, Vol. ASSP-23, pp. 67-72, February 1975, to find the best time alignment path or match between a given template and the spoken word.
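The template comparison step described above can be illustrated with a minimal sketch of dynamic time alignment. This is a generic dynamic programming formulation and not the specific procedure of the Itakura article: the Euclidean frame distance, the unconstrained step pattern, and the names `frame_distance`, `dtw_cost` and `recognize` are all assumptions made for illustration only.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two frames of parametric values (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(utterance, template):
    """Accumulated cost of the best time alignment path between two frame sequences."""
    n, m = len(utterance), len(template)
    inf = float("inf")
    # cost[i][j] holds the lowest accumulated cost of aligning the first i
    # utterance frames with the first j template frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(utterance[i - 1], template[j - 1])
            # Extend the cheapest of the three predecessor paths
            # (assumed step pattern: no slope constraints).
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the vocabulary word whose template aligns at the lowest cost."""
    return min(templates, key=lambda word: dtw_cost(utterance, templates[word]))


# Hypothetical one-parameter frames; real frames would hold several parametric values.
templates = {"yes": [[1.0], [2.0], [3.0]], "no": [[3.0], [2.0], [1.0]]}
spoken = [[1.0], [1.0], [2.0], [3.0]]  # "yes" spoken with a lengthened first frame
best_word = recognize(spoken, templates)
```

In this sketch the lengthened first frame of the spoken word is absorbed by the alignment path, so the word still matches its template although the two sequences differ in length.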
Isolated word recognizers such as those outlined above require the user to artificially pause between every input word or phrase. This requirement is often too restrictive in a high workload and often stressful environment. Such an environment demands the very natural mode of continuous speech input. However, the problems of identifying word boundaries in continuous speech recognition, along with larger vocabulary demands and the requirement of syntax control processing to identify only predefined meaningful phrases and sentences, require added and more complex processing.
It is desirable, therefore, to meet the additional processing requirements with a small, low cost apparatus and a method which are readily adaptable to growth to accommodate increased vocabulary and syntax demands while at the same time providing reliable and near real time processing.