1. Field of the Invention
The present invention relates to an improvement in dynamic programming (DP) matching.
2. Description of the Prior Art
Generally, even when the same person utters the same word, its duration varies from utterance to utterance, and the word is expanded or contracted non-linearly along the time axis. In brief, the duration of an uttered word involves irregular but allowable distortions with respect to the time axis. The time axis must therefore be expanded or contracted so that like phonemes correspond to each other between the standard pattern and the characteristic pattern of the input voice. Dynamic programming (DP) can be used as a specific method for this purpose. DP matching is a method of matching a characteristic pattern against a standard pattern while expanding and contracting the time axis, and it is an important method in speech recognition.
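The time-axis matching described above can be illustrated with a minimal sketch. It assumes a textbook symmetric DP recurrence (horizontal, vertical, and diagonal steps) and a city-block distance between frames. The function name dp_match, the frame representation, and the distance measure are illustrative assumptions and are not taken from the prior art described here.

```python
def dp_match(input_pattern, standard_pattern):
    """Align two sequences of feature frames by DP matching.

    Returns (total_distance, path), where path is a list of
    (input_index, standard_index) pairs from first to last frame.
    """
    n, m = len(input_pattern), len(standard_pattern)
    inf = float("inf")

    def frame_distance(a, b):
        # Assumed local distance: city-block distance between two frames.
        return sum(abs(x - y) for x, y in zip(a, b))

    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]

    for i in range(n):
        for j in range(m):
            d = frame_distance(input_pattern[i], standard_pattern[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            # Allowed predecessors: stretch the input, stretch the
            # standard, or advance both patterns together.
            candidates = [
                (cost[pi][pj], (pi, pj))
                for pi, pj in ((i - 1, j), (i, j - 1), (i - 1, j - 1))
                if pi >= 0 and pj >= 0
            ]
            best, prev = min(candidates)
            cost[i][j] = best + d
            back[i][j] = prev

    # Trace the optimal warping path back from the final frame pair.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        path.append(back[path[-1][0]][path[-1][1]])
    path.reverse()
    return cost[n - 1][m - 1], path


# The same "word" uttered with different durations for each segment.
distance, path = dp_match([[1], [1], [2], [3]], [[1], [2], [2], [3]])
print(distance, path)
```

In this toy case the two patterns differ only in how long each segment lasts, so the warping path absorbs the difference and the accumulated distance is zero.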
Recently, the inventor and others proposed a speaker adaptation method in which DP matching is applied to handle the variation in the characteristic patterns of voice signals caused by differences among individuals (Nakagawa, Kamiya, and Sakai "Recognition of word voices by unspecified speakers depending on simultaneous non-linear expansion and contraction of time-, frequency-, and strength-axes of audio spectrum," Journal of the Institute of Electronics and Communication Engineers of Japan, Vol. J64-D No. 2, Feb., 1981), and its effectiveness was confirmed experimentally.
The above-mentioned speaker adaptation method uses DP for expansion/contraction matching on the frequency axis, based on the fact that the variation in characteristic patterns due to differences among individuals is primarily an irregular but allowable distortion with respect to the frequency axis. More specifically, when the simple vowel /a/ is uttered as a key word, the spectrum of the steady-state portion of the vowel /a/ is compared, by means of DP matching on the frequency axis, with the corresponding spectrum of the same vowel /a/ uttered by a standard speaker. The direction in which the spectrum of the vowel /a/ of the input speaker is shifted on the frequency axis relative to that of the standard speaker is then detected, and this detected direction of shift is utilized for speaker adaptation in word recognition.
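The frequency-axis comparison described above can be sketched in the same way. The text does not state how the direction of shift is derived from the DP matching result, so the rule used below (the sign of the summed channel offset along the optimal warping path) is an assumption made purely for illustration. The function name frequency_shift_direction and the example spectra are likewise hypothetical.

```python
def frequency_shift_direction(input_spectrum, standard_spectrum):
    """Estimate the direction of spectral shift by DP matching on the
    frequency axis.

    Each spectrum is a list of channel magnitudes, lowest frequency first.
    Returns +1 if the input appears shifted toward higher channels,
    -1 if toward lower channels, and 0 if no net shift is found.
    """
    n, m = len(input_spectrum), len(standard_spectrum)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]

    for i in range(n):
        for j in range(m):
            d = abs(input_spectrum[i] - standard_spectrum[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            candidates = [
                (cost[pi][pj], (pi, pj))
                for pi, pj in ((i - 1, j), (i, j - 1), (i - 1, j - 1))
                if pi >= 0 and pj >= 0
            ]
            best, prev = min(candidates)
            cost[i][j] = best + d
            back[i][j] = prev

    # Assumed decision rule: sum (input channel - standard channel)
    # along the optimal path and keep only its sign.
    offset = 0
    i, j = n - 1, m - 1
    while True:
        offset += i - j
        if back[i][j] is None:
            break
        i, j = back[i][j]
    return (offset > 0) - (offset < 0)


# Hypothetical spectra: the input peak lies two channels above the standard.
standard = [0, 0, 1, 5, 1, 0, 0, 0]
shifted_up = [0, 0, 0, 0, 1, 5, 1, 0]
print(frequency_shift_direction(shifted_up, standard))
```

The sketch deliberately returns only a direction and not a magnitude, in line with the method described above, which uses only the direction of shift.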
However, the above speaker adaptation method has the following problem. When an attempt is made to normalize not only the direction but also the degree of the shift on the frequency axis of the spectrum of the simple vowel /a/, differences between phonemes are unintentionally normalized along with the differences among individuals. As a result, there are cases in which a word cannot be recognized even though the differences among individuals have been eliminated.