The present invention relates to an improvement in a pattern matching apparatus which constitutes an essential part of a character or voice recognition system.
Although the pattern matching system of the invention can be broadly applied to time sequence pattern matching, the following description of the invention will focus on speech patterns by way of example.
Usually, a speech pattern can be expressed as a time sequence pattern of features. Speech recognition, therefore, is performed by comparing this time sequence pattern with a reference pattern, i.e., through pattern matching. To perform speech recognition with high accuracy, it is necessary to adopt a pattern matching method which is stable against pattern fluctuation, i.e., has a high capacity for adjusting to pattern fluctuation.
To overcome expansion and compression distortion in the time direction, a method called the DP matching method, based on the dynamic programming (DP) process and proposed by the inventors, has been adopted with satisfactory results. The DP matching method is described in detail in, for example, U.S. Pat. No. 3,816,722 and a paper by HIROAKI SAKOE and SEIBI CHIBA entitled "Dynamic Programming Algorithm Optimization for Spoken Word Recognition", IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-26, NO. 1, FEBRUARY 1978, pp. 43 to 49.
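By way of illustration only, the time normalization performed by DP matching can be sketched as follows. The sketch assumes the basic symmetric recurrence without slope constraint or adjustment window, a city-block distance between feature vectors, and normalization by the sum of the two pattern lengths. The function name `dp_matching_distance` and the feature values are hypothetical and are not taken from the cited references.

```python
from math import inf


def dp_matching_distance(a, b):
    """Time-normalized DP matching distance between two feature sequences.

    a, b: non-empty sequences of equal-dimension feature vectors.
    Assumed recurrence (symmetric form, no slope constraint, no window):
        g(i, j) = min(g(i-1, j)   + d(i, j),
                      g(i-1, j-1) + 2 * d(i, j),
                      g(i, j-1)   + d(i, j))
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both patterns must contain at least one frame")

    def d(x, y):
        # City-block distance between two feature vectors (an assumption).
        return sum(abs(p - q) for p, q in zip(x, y))

    g = [[inf] * m for _ in range(n)]
    g[0][0] = 2 * d(a[0], b[0])
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            cost = d(a[i], b[j])
            best = inf
            if i > 0:
                best = min(best, g[i - 1][j] + cost)
            if j > 0:
                best = min(best, g[i][j - 1] + cost)
            if i > 0 and j > 0:
                best = min(best, g[i - 1][j - 1] + 2 * cost)
            g[i][j] = best
    # Normalize by the sum of the weights along any admissible path.
    return g[n - 1][m - 1] / (n + m)


# A reference pattern and the same pattern uttered at half speed.
reference = [(0,), (1,), (2,)]
stretched = [(0,), (0,), (1,), (1,), (2,), (2,)]
print(dp_matching_distance(reference, stretched))  # 0.0
```

In this sketch the stretched pattern matches the reference with zero distance, which is the sense in which the method absorbs expansion and compression in the time direction. It does nothing, however, for deformations of the feature values themselves.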
In practice, however, a speech pattern includes various deformations in addition to the time expansion and compression distortion. For instance, partial deformation of vowels is often found in continuously uttered speech. Examples of such deformations are: nasalization in which, for example, the term "neck" [nek] is pronounced as [nek]; unvoicing in which, for example, the term "six" [siks] is pronounced as [sik]; omission in which, for example, the term "piston" [pistan] is pronounced as [pistn]; and elongation of a vowel. These deformations do not appear consistently but occur in quite an unpredictable manner depending on the speed of pronunciation and the phonemes preceding or following the vowel.
Hitherto, in order to cope with these deformations of pronunciation, it has been necessary to prepare a plurality of reference patterns corresponding to the various possible deformations. For unselected speakers in particular, a great number of reference patterns must be prepared. Therefore, it is necessary that the reference pattern itself have a large flexibility and that the matching process for comparing the input pattern with the reference pattern be carried out with high efficiency. These properties are necessary also from the viewpoint of reducing the capacity of the reference pattern memory.