This invention relates to a voice recognition system applicable, for example, to a Japanese language voice input device adapted to recognize input voice signals in units of syllables.
Voice signals can be recognized relatively accurately if each sound is pronounced separately. In a continuous voice signal, however, syllables are strongly influenced by their neighbors and vary greatly in both strength and pitch, depending on their position in a word, a phrase or a sentence. As a result, it is difficult to analyze continuously delivered voice signals accurately, because the characteristic pattern of each syllable varies significantly with the context and other factors.
In view of this problem caused by the phonological variation of syllables, attempts have been made, in voice recognition systems applicable, for example, to a Japanese language voice input device, not only to provide each syllable with a plurality of standard characteristic patterns but also to replace patterns having inferior recognition records with new patterns.
With such a system, the average accuracy of recognition can be improved because the standard characteristic patterns that are registered depend strongly on the frequency with which the corresponding syllable appears. For a phrase or a sentence which seldom appears, however, accuracy usually drops with such a system. This is because the number of standard patterns for each syllable is not specified and, as the system keeps "learning" according to the frequency with which each pattern appears, the numbers of patterns held for the individual syllables become unevenly distributed.
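The uneven distribution described above can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not the implementation of any actual prior art system: the shared fixed-size pool, the hit counter used as a "recognition record", the rule of replacing one pattern per utterance, and the name simulate_learning are all hypothetical and introduced here purely for illustration.

```python
from collections import Counter


def simulate_learning(utterances, pool):
    """Toy model (assumption): frequency-driven replacement in a shared pattern pool.

    pool is a list of {"syllable": str, "hits": int} entries and is modified
    in place. Returns the number of stored patterns per syllable.
    """
    for spoken in utterances:
        # Assumed stand-in for a successful match: credit every stored
        # pattern that belongs to the spoken syllable.
        for pattern in pool:
            if pattern["syllable"] == spoken:
                pattern["hits"] += 1
        # Replace the pattern with the poorest record (first one on ties)
        # with a new pattern for the syllable that was just spoken.
        worst = min(range(len(pool)), key=lambda i: pool[i]["hits"])
        pool[worst] = {"syllable": spoken, "hits": 1}
    return Counter(p["syllable"] for p in pool)


# Hypothetical starting point: two patterns each for three syllables.
pool = [{"syllable": s, "hits": 0} for s in ["ka", "ka", "ki", "ki", "ku", "ku"]]
print(simulate_learning(["ka", "ka", "ka", "ki"], pool))
```

With this toy input, in which "ka" is spoken three times and "ki" once, the pool ends with five patterns for "ka", one for "ki" and none for "ku". Because no number of patterns is reserved for each syllable, the seldom-spoken syllables lose their standard patterns to the frequent one.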
Another disadvantage of prior art voice recognition systems has been that they could not handle the situation where characteristic patterns belonging to the same category cease to match due to a change in voice signal waveform caused by a change in the speaker's sound quality or in the sound pickup system.