1. Field of the Invention
The invention relates to a method and an apparatus for voice recognition in which a similar voice pattern is obtained by comparing an input voice pattern with a standard voice pattern.
2. Related Background Art
As a method of voice recognition, there has been used a word spotting method, for example, the continuous DP method, in which matching is executed while a standard pattern is sequentially slid along an input voice, and the detection of a voice duration and the recognizing process are executed simultaneously on the basis of a distance obtained as a result of the matching.
In the word spotting method, the matching process is performed while word standard patterns, each consisting of characteristics of a voice such as a spectrum, are slid along the input voice on a frame unit basis. For each standard pattern, the duration in which the word is presumed to exist is detected from the point at which the score obtained as the calculation result of the matching process has its minimum value. After that, the minimum values of the scores of all of the standard patterns are compared, thereby obtaining a recognition result.
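By way of illustration only, the procedure described above may be sketched as follows. The sketch is a simplification made under stated assumptions: each frame is a short list of numbers standing in for a spectrum, the frame distance is Euclidean, and the standard pattern is slid without any time warping, so it is not the continuous DP method itself. The names `frame_distance`, `spot`, and `recognize` are hypothetical and do not appear in the source.

```python
def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def spot(template, frames):
    """Slide one standard pattern along the input, one frame at a time.

    Returns (minimum score, start frame, end frame). The score is the
    mean frame distance, so a lower score means a closer match.
    """
    n, m = len(frames), len(template)
    best = (float("inf"), -1, -1)
    for start in range(n - m + 1):
        total = sum(
            frame_distance(template[k], frames[start + k]) for k in range(m)
        )
        score = total / m
        if score < best[0]:
            best = (score, start, start + m)
    return best


def recognize(templates, frames):
    """Compare the minimum scores of all standard patterns."""
    results = {word: spot(t, frames) for word, t in templates.items()}
    winner = min(results, key=lambda w: results[w][0])
    return winner, results


# Toy data (invented for illustration): two standard patterns and an
# input in which pattern "a" is embedded between silent frames.
templates = {
    "a": [[1, 0], [1, 0], [0, 1]],
    "b": [[0, 1], [0, 1], [1, 1]],
}
frames = [[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [0, 0]]
winner, results = recognize(templates, frames)
```

In this toy input, the minimum score of pattern "a" occurs at frames 2 to 5, which is taken as the duration of the word, and comparing the minima of both patterns selects "a" as the recognition result.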
According to the conventional word spotting method, however, there is a drawback in that, in the case where the phoneme train constituting a certain standard pattern perfectly coincides with a part of the phoneme train of another standard pattern having a larger number of phonemes, an erroneous recognition cannot be avoided in principle. For example, in the case where both /ku/ and /roku/ exist as standard patterns and the input voice is /roku/, the standard pattern /ku/ perfectly coincides with a part of the input voice and the standard pattern /roku/ perfectly coincides with all of it. Therefore, although both of these patterns match better than the remaining standard patterns as a result of the matching process, there is no clear difference between the scores of the two patterns so long as the input voice is normally pronounced. Either /roku/ or /ku/ is ranked first depending on a slight fluctuation of the input voice. That is, according to the conventional example, even when the input voice is normally pronounced, an erroneous recognition cannot be avoided because of a defect inherent in the word spotting method.
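The /ku/ and /roku/ example above may be reproduced with a minimal sketch. Everything in it is an assumption made for illustration: each phoneme is represented by a single one-hot frame (the hypothetical `PHONEME` table), the score is the mean Euclidean frame distance of a pattern slid without time warping, and the "slight fluctuation" is modeled as a small offset `eps` added to one frame.

```python
# Hypothetical one-hot features, one frame per phoneme (illustration only).
PHONEME = {
    "r": [1, 0, 0, 0],
    "o": [0, 1, 0, 0],
    "k": [0, 0, 1, 0],
    "u": [0, 0, 0, 1],
}


def dist(a, b):
    """Euclidean distance between two frames (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def best_score(template, frames):
    """Minimum mean frame distance over all sliding positions."""
    m = len(template)
    return min(
        sum(dist(template[k], frames[s + k]) for k in range(m)) / m
        for s in range(len(frames) - m + 1)
    )


def rank(frames):
    """Return the first-ranked word and the score of every pattern."""
    templates = {w: [PHONEME[p] for p in w] for w in ("ku", "roku")}
    scores = {w: best_score(t, frames) for w, t in templates.items()}
    return min(scores, key=scores.get), scores


def utter(word, perturb=None, eps=0.1):
    """Build input frames for a word, optionally disturbing one frame."""
    frames = [list(PHONEME[p]) for p in word]
    if perturb is not None:
        frames[perturb][0] += eps  # slight fluctuation of the input voice
    return frames


_, clean_scores = rank(utter("roku"))       # both patterns match perfectly
first_a, _ = rank(utter("roku", perturb=0))  # fluctuation in the /r/ frame
first_b, _ = rank(utter("roku", perturb=3))  # fluctuation in the /u/ frame
```

Under these assumptions, a cleanly pronounced /roku/ gives both patterns the same score, so the scores alone cannot separate them. A fluctuation in the /r/ frame penalizes only /roku/, so /ku/ is ranked first, whereas a fluctuation in the /u/ frame is averaged over fewer frames for /ku/, so /roku/ is ranked first.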