The present invention relates to a speech recognition system for recognizing, with high precision, a word having a plurality of syllables (a polysyllable) as well as a single-syllable word (a monosyllable), using a combination of a pattern matching method and a structure analysis method.
Pattern matching has conventionally been used frequently for speech recognition. A time series of the spectrum envelope of an input speech is extracted as a feature parameter, and the extracted time series forms an input speech pattern. The input speech pattern is compared with reference speech patterns prestored in a dictionary memory so as to compute the degree of matching, that is, the similarity, between the input speech pattern and each reference speech pattern. Data of the category which includes the reference speech pattern having the maximum similarity to the input speech pattern is outputted as the recognized result of the input speech. When the absolute value of the maximum similarity exceeds a predetermined threshold value, the input speech is regarded as recognized. However, when the absolute value is less than the predetermined threshold value, the input speech is regarded as unrecognized. When two or more similarity measures exceed the predetermined threshold value and differ from one another only slightly, the category corresponding to the largest of these similarity measures is still determined to correspond to the input speech, which often results in erroneous recognition. In other words, it is very difficult to perform speech recognition with high precision when monosyllables and polysyllables having similar speech patterns are involved.
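The conventional decision rule described above may be illustrated by the following minimal sketch. It is given for explanation only and makes several assumptions not stated in the text: each speech pattern is flattened into a single feature vector, the similarity is taken to be the cosine of the angle between two vectors, and the names `similarity`, `recognize`, `dictionary` and `threshold` are hypothetical. The gap between the best and second-best similarity is returned only to show how small it can be when two reference patterns resemble each other.

```python
import math


def similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the source does not specify the similarity measure;
    cosine similarity is used here purely for illustration.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(input_pattern, dictionary, threshold):
    """Apply the conventional maximum-similarity rule.

    Returns (category, best_similarity, gap), where category is None when
    the absolute value of the best similarity is below the threshold and
    gap is the difference between the best and second-best similarity.
    """
    scores = sorted(
        ((similarity(input_pattern, ref), cat) for cat, ref in dictionary.items()),
        reverse=True,
    )
    best_score, best_cat = scores[0]
    gap = best_score - scores[1][0] if len(scores) > 1 else float("inf")
    if abs(best_score) < threshold:
        return None, best_score, gap  # regarded as unrecognized
    return best_cat, best_score, gap  # largest similarity wins, however small the gap


# Hypothetical dictionary: categories "A" and "C" have similar reference patterns.
dictionary = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.0, 1.0, 0.0],
    "C": [0.9, 0.1, 0.0],
}

category, best, gap = recognize([1.0, 0.05, 0.0], dictionary, threshold=0.8)
print(category, round(best, 4), round(gap, 4))
```

In this example both "A" and "C" exceed the threshold and their similarities differ only slightly, so a small disturbance of the input pattern could reverse the decision. This is the situation in which the conventional rule tends to produce erroneous recognition.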