The present invention relates to a speech recognition method that uses pattern matching.
In speech recognition based on pattern matching, speech is generally recognized by matching spectral information obtained from the input speech against standard patterns. In addition, attempts have been made to make active use of speech power information to improve recognition accuracy. Recently, satisfactory results have been reported for the recognition of speech patterns produced by unspecified talkers (Aikawa, K., et al.: An Isolated Word Recognition Method Using Power-Weighted Spectral Matching Measure; Transactions of the Committee on Speech Research, Acoust. Soc. Jpn., S81-59 (1981)).
A problem encountered when speech power information is used to recognize a speech pattern is that speech power is difficult to compare by its absolute values. To solve this problem, it has been proposed to normalize the speech power by using the maximum and minimum values of the speech power in the input speech interval; this normalization is also used in the method described in the above-mentioned report. In this case, the relevant processing cannot be initiated until the end of the speech interval, because the maximum and minimum values of the speech power are needed; that is, the processing cannot in principle be started at the same time as a speech pattern is inputted. This delays the output of the recognition results and, furthermore, requires a buffer memory to store the information to be processed afterward; thus, the size and cost of the speech recognition equipment are increased.
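The dependence on the end of the speech interval can be illustrated with a minimal sketch. It assumes that the normalization is a linear scaling between the minimum and maximum power of the interval; the cited report may use a different formula, and the function name `normalize_power` is hypothetical.

```python
def normalize_power(power):
    """Scale per-frame speech power to the range [0, 1].

    Assumption: linear min-max scaling over the whole speech interval.
    The entire interval must be buffered before any frame can be
    normalized, because min() and max() need every frame.
    """
    p_min, p_max = min(power), max(power)
    span = p_max - p_min
    if span == 0:
        # Constant power: no meaningful scale, so return zeros.
        return [0.0] * len(power)
    return [(p - p_min) / span for p in power]


# Illustrative frame powers (arbitrary units, not measured data).
frames = [10.0, 30.0, 20.0]
print(normalize_power(frames))
```

Because the first frame cannot be scaled until the last frame has been observed, a sketch of this kind makes the buffering requirement and the output delay explicit.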
On the other hand, a pattern matching method based on dynamic programming (abbreviated as DP hereinafter) has been proposed. In particular, a continuous DP matching method has been disclosed as a real-time matching method suitable for continuous speech. (Refer to Japan Patent Laid-open No. 55-2205 for details.) A feature of this method is that the results obtained by matching the input speech with the relevant standard pattern are outputted continuously. However, since the matching results reflect only the average degree of similarity between the input speech and the standard patterns, recognition errors inherently increase for input words that include a similar portion. To overcome this difficulty, the inventors of the present invention have proposed a method in which each of the standard patterns is subdivided into a plurality of partial standard patterns and each of these partial standard patterns is compared independently. (See, for example, Japan Patent Laid-open No. 58-58598 dated Apr. 7, 1983, for reference only.) According to this method, if an input speech matches a standard pattern and the partial patterns thereof under a predetermined condition, the input speech is assumed to fall into the same category as the standard pattern. In this method, however, since each of the standard patterns is matched independently of the partial standard patterns thereof, the required standard pattern memory and the load imposed on the matching block are increased.
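The continuous output of matching results can be illustrated with a simplified sketch. It is not the recurrence of the cited laid-open application: it assumes scalar features, an absolute-difference local distance, a basic three-way DP recurrence with a free start point, and normalization by the length of the standard pattern. The function name `continuous_dp_scores` is hypothetical.

```python
import math


def continuous_dp_scores(input_frames, reference):
    """Yield one matching score per input frame (lower is better).

    Assumptions: scalar features, |a - b| as the local distance, and a
    simple recurrence in which a match may start at any input frame.
    """
    n_ref = len(reference)
    prev = [math.inf] * n_ref  # accumulated distances at the previous frame
    for frame in input_frames:
        cur = [0.0] * n_ref
        for j, ref in enumerate(reference):
            local = abs(frame - ref)
            if j == 0:
                best = 0.0  # free start: a new match may begin here
            else:
                best = min(prev[j], prev[j - 1], cur[j - 1])
            cur[j] = local + best
        prev = cur
        # The score is available as soon as the frame arrives.
        yield cur[-1] / n_ref


# Illustrative values only: the pattern 1, 2, 3 embedded in a longer input.
scores = list(continuous_dp_scores([0, 0, 1, 2, 3, 0], [1, 2, 3]))
best_frame = min(range(len(scores)), key=scores.__getitem__)
print(best_frame, scores[best_frame])
```

The score is a single accumulated distance averaged over the whole standard pattern, so two patterns that differ only in a short portion yield nearly equal scores; this is the averaging effect described above.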