This invention relates to a system for automatically recognizing continuous speech composed of continuously spoken words.
Voice recognition systems have been much in demand as input devices for entering data and programs into electronic computers, and practical systems for automatically recognizing speech are awaited.
As is known in the art, it is possible to represent a voice pattern by a sequence of Q-dimensional feature vectors. In a conventional voice recognition system, such as the one described in the P. Denes and M. V. Mathews article entitled "Spoken Digit Recognition Using Time-frequency Pattern Matching" (The Journal of the Acoustical Society of America, Vol. 32, No. 11, November 1960) and another article by H. A. Elder entitled "On the Feasibility of Voice Input to an On Line Computer Processing System" (Communications of the ACM, Vol. 13, No. 6, June 1970), pattern matching is applied to the corresponding feature vectors of a reference pattern and of a pattern to be recognized. More particularly, the similarity measure between these patterns is calculated as the total sum of the quantities representative of the similarity between the feature vectors appearing at corresponding positions in the respective sequences. It is, therefore, impossible to achieve a reliable recognition result in those cases where the positions of the feature vectors in one sequence vary relative to the positions of the corresponding feature vectors in the other sequence. For example, the speed of utterance of a word often varies by as much as 30 percent in practice. Such speed variation results in a poor similarity measure even between voice patterns for the same word spoken by the same person. Furthermore, with a conventional voice recognition system, a series of words must be uttered word by word, which inconveniences the speaker and reduces the speed of utterance.
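By way of illustration only, the position-by-position matching described above may be sketched as follows. The sketch is not taken from the cited articles: the function names, the use of a negative Euclidean distance as the per-vector similarity, the one-dimensional (Q = 1) feature vectors, and the nearest-neighbor stretching that stands in for a slower utterance are all assumptions made for this example.

```python
import math

def frame_similarity(a, b):
    # Assumed similarity between two feature vectors:
    # negative Euclidean distance (0 means identical).
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def linear_match(ref, test):
    # Sum the similarities of the vectors found at the SAME position
    # in both sequences; any surplus frames are ignored.
    n = min(len(ref), len(test))
    return sum(frame_similarity(ref[i], test[i]) for i in range(n))

def stretch(pattern, factor):
    # Hypothetical helper: simulate a slower (factor > 1) utterance
    # by repeating frames with nearest-neighbor resampling.
    m = max(1, round(len(pattern) * factor))
    last = len(pattern) - 1
    return [pattern[min(last, int(j / factor))] for j in range(m)]

# A toy reference pattern: ten one-dimensional feature vectors.
ref = [(float(i),) for i in range(10)]

same_speed = linear_match(ref, ref)                # identical timing
slower = linear_match(ref, stretch(ref, 1.3))      # same word, 30 percent slower

print(same_speed, slower)
```

In this toy case the pattern matched against itself attains the best possible score of zero, whereas the same pattern uttered 30 percent more slowly scores markedly worse, although both represent the same word.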
In order to recognize continuous speech composed of continuously spoken words, the voice pattern for each word must be recognized separately. A proposal to meet this demand has been made in U.S. Pat. No. 3,816,722 entitled "COMPUTER FOR CALCULATING THE SIMILARITY BETWEEN PATTERNS AND PATTERN RECOGNITION SYSTEM COMPRISING THE SIMILARITY COMPUTER" filed jointly by the inventor of this case. In that system, continuous speech is separated word by word by using dynamic programming. However, the separation of continuous speech into words (segmentation) is not yet well established.
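For orientation only, the general idea of aligning two feature-vector sequences by dynamic programming may be sketched as follows. This is a generic time-alignment recurrence for a single pair of patterns, given under stated assumptions; it is not the computer of the cited patent and it does not perform word segmentation. The function names, the Euclidean frame distance, and the toy patterns are assumptions made for this example.

```python
import math

def frame_distance(a, b):
    # Assumed distance between two feature vectors (Euclidean).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dp_alignment_distance(ref, test):
    # cost[i][j] holds the smallest accumulated distance obtainable
    # when the first i frames of ref are aligned with the first j
    # frames of test, each frame being allowed to repeat.
    inf = float("inf")
    n, m = len(ref), len(test)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(ref[i - 1], test[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # ref frame advances alone
                                 cost[i][j - 1],      # test frame advances alone
                                 cost[i - 1][j - 1])  # both advance together
    return cost[n][m]

# Toy reference pattern and a slower utterance of the same pattern,
# in which some frames are repeated.
ref = [(float(i),) for i in range(10)]
slow = [(v,) for v in (0.0, 0.0, 1.0, 2.0, 3.0, 3.0, 4.0,
                       5.0, 6.0, 6.0, 7.0, 8.0, 9.0)]
other = [(5.0,)] * 10  # an unrelated, constant pattern

print(dp_alignment_distance(ref, slow), dp_alignment_distance(ref, other))
```

In this toy case the slower utterance aligns with the reference at zero accumulated distance because the repeated frames are absorbed by the alignment, while the unrelated pattern retains a positive distance.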