As a probabilistic technique for recognizing speech, a technique using Markov models is known. Speech recognition using Markov models employs probabilistic models, each having a plurality of states, transitions between the states, a probability of each transition occurring, and a probability of each label being output in each transition. For example, such a probabilistic model is provided for each word, and its probability parameters are established by training. At the time of speech recognition, a label string obtained from unknown input speech is matched against each probabilistic model, and the word whose probabilistic model gives the highest probability of the label string occurring is determined as the recognition result. Such a technique is described, for example, in an article by F. Jelinek, "Continuous Speech Recognition by Statistical Methods," Proceedings of the IEEE, Vol. 64, 1976, pp. 532-556.
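The matching step described above can be illustrated with a minimal sketch. It assumes a model whose labels are output on transitions, a single start state, a single final state, and no null transitions. The names `Transition`, `label_string_probability`, and `recognize`, and all probability values, are hypothetical and chosen only for illustration; they are not taken from the cited article.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    src: int            # state the transition leaves
    dst: int            # state the transition enters
    prob: float         # probability of this transition occurring
    label_probs: dict   # label -> probability of that label being output here


def label_string_probability(transitions, n_states, labels, start=0, final=None):
    """Forward computation: probability that the model outputs `labels`
    while moving from `start` to `final`, one label per transition."""
    if final is None:
        final = n_states - 1
    alpha = [0.0] * n_states
    alpha[start] = 1.0
    for label in labels:
        nxt = [0.0] * n_states
        for t in transitions:
            # Sum over every transition that could have output this label.
            nxt[t.dst] += alpha[t.src] * t.prob * t.label_probs.get(label, 0.0)
        alpha = nxt
    return alpha[final]


def recognize(models, labels):
    """Return the word whose model gives the label string the highest probability."""
    return max(
        models,
        key=lambda w: label_string_probability(models[w][0], models[w][1], labels),
    )


# Toy two-state word models (illustrative values, not trained parameters).
models = {
    "yes": ([
        Transition(0, 0, 0.5, {"a": 0.9, "b": 0.1}),
        Transition(0, 1, 0.5, {"a": 0.2, "b": 0.8}),
        Transition(1, 1, 1.0, {"b": 1.0}),
    ], 2),
    "no": ([
        Transition(0, 0, 0.5, {"a": 0.1, "b": 0.9}),
        Transition(0, 1, 0.5, {"a": 0.8, "b": 0.2}),
        Transition(1, 1, 1.0, {"a": 1.0}),
    ], 2),
}

result = recognize(models, ["a", "a", "b"])
print(result)
```

In this sketch each word model is scored independently, so the cost of recognition grows with the vocabulary size, the number of transitions per model, and the length of the label string.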
Speech recognition using Markov models, however, requires a large amount of training data to establish the probability parameters, and also a significant amount of computation time for the training.
Other techniques in the prior art include the following:
(1) Article by T. Kaneko, et al., "Large Vocabulary Isolated Word Recognition with Linear and DP Matching," Proc. 1983 Spring Conference of Acoustical Society of Japan, March 1983, pp. 151-152.
(2) Article by T. Kaneko, et al., "A Hierarchical Decision Approach to Large Vocabulary Discrete Utterance Recognition," IEEE Trans. on ASSP, Vol. ASSP-31, No. 5, October 1983.
(3) Article by H. Fujisaki, et al., "High-Speed Processing and Speaker Adaptation in Automatic Recognition of Spoken Words," Trans. of the Committee on Speech Recognition, The Acoustical Society of Japan, S80-19, June 1980, pp. 148-155.
(4) Article by D. K. Burton, et al., "A Generalization of Isolated Word Recognition Using Vector Quantization," ICASSP 83, pp. 1021-1024.
These articles disclose dividing a word into blocks along a time axis. However, they describe nothing about obtaining label output probabilities in each of the blocks, or about performing probabilistic speech recognition based on the label output probabilities in each of the blocks.