This invention relates to a formant pattern matching vocoder for analyzing and synthesizing an input speech signal by pattern matching that makes use of formant information.
A pattern matching vocoder is well known as an effective means for compressing speech information to be transmitted. In a pattern matching vocoder, the spectrum envelope of the input speech is matched (compared) against those of previously registered reference patterns, the most similar reference pattern is selected, and a label indicating that reference pattern is transmitted from the analysis side to the synthesis side. Usually, an α parameter or a K parameter of an LPC (Linear Prediction Coding) coefficient, or various coefficients derived therefrom, are utilized as the information representing the spectrum envelope.
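The selection and labeling step described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the toy reference patterns, and the use of a squared Euclidean distance are hypothetical, since the text does not specify a distance measure or a data layout.

```python
# Hypothetical sketch of reference pattern selection; not the method of any
# particular vocoder. The distance measure (squared Euclidean) is an assumption.

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_label(frame_params, reference_patterns):
    """Analysis side: return the index (label) of the most similar reference pattern."""
    return min(
        range(len(reference_patterns)),
        key=lambda i: squared_distance(frame_params, reference_patterns[i]),
    )


def lookup_pattern(label, reference_patterns):
    """Synthesis side: recover the spectrum envelope parameters from the label."""
    return reference_patterns[label]


# Toy set of three registered reference patterns (made-up values).
REFERENCE_PATTERNS = [
    [0.9, -0.4, 0.1],
    [0.2, 0.5, -0.3],
    [-0.6, 0.1, 0.7],
]

# Only this integer label is transmitted, instead of the parameter vector itself.
label = select_label([0.25, 0.45, -0.2], REFERENCE_PATTERNS)
decoded = lookup_pattern(label, REFERENCE_PATTERNS)
```

The compression comes from sending one small integer per frame; the quality of the result then depends entirely on how well the registered reference patterns cover the speaker at hand.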
This pattern matching vocoder, however, is disadvantageous in that the number of speakers available for the training or registration used to make the reference patterns is limited for economic and other reasons. It is very difficult to make reference patterns suitable for any person by clustering the training data obtained from a limited number of speakers. Even data spoken by several tens of speakers is not a sufficient basis for making reference patterns applicable to all unspecified speakers. The difference in spectral distribution among speakers is attributed to the fact that each speaker has his own vocal tract characteristics and vocal cord sound source characteristics. The difference in the vocal tract characteristics, which is caused by the difference in the length of the vocal tract, causes a change in the formant frequency, which is a point of resonance in the vocal tract. On the other hand, the difference in the vocal cord sound source characteristics causes a change in the gradient of the spectrum envelope. In order to perform a pattern matching suitable for any person, therefore, it is necessary either to normalize the vocal tract characteristics and the vocal cord sound source characteristics by suitable measures or to eliminate their influences.
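As a rough illustration of the dependence of the formant frequency on vocal tract length, the textbook idealization of the vocal tract as a uniform lossless tube, closed at the vocal cords and open at the lips, can be used. This tube model, the symbols c, L and F_n, and the numerical values below are assumptions introduced here for illustration only.

```latex
F_n = \frac{(2n - 1)\,c}{4L}, \qquad n = 1, 2, 3, \dots
% c: speed of sound (about 350 m/s assumed), L: vocal tract length
% L = 17.5 cm:  F_1 = 350 / (4 \times 0.175) = 500 Hz
% L = 14.0 cm:  F_1 = 350 / (4 \times 0.140) = 625 Hz
```

Under this idealization every formant frequency scales inversely with the vocal tract length, so the same sound uttered by speakers with different tract lengths yields shifted spectrum envelopes.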
In a conventional pattern matching vocoder, however, the pattern matching is conducted using spectral envelope parameters extracted by LPC analysis. LPC analysis extracts the spectral envelope parameter on the assumption that the vocal cord sound source characteristics, which are not actually flat, are flat. Namely, the spectral envelope parameter is extracted as a convolution of the vocal tract characteristics, which vary depending on the speaker, and the vocal cord sound source characteristics, which are regarded as being flat.
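For reference, LPC analysis of one frame can be sketched with the standard autocorrelation method and the Levinson-Durbin recursion. This is a minimal sketch, not the analysis used by any specific vocoder: the function names are hypothetical, windowing and pre-emphasis are omitted, and the sign convention of the K parameters varies between texts. Because the all-pole model is fitted to the whole speech spectrum, the resulting coefficients describe the vocal tract and the sound source together, which is the limitation discussed above.

```python
# Hypothetical sketch of LPC analysis (autocorrelation method).
# Predictor convention assumed here: x[n] ~ sum_{j=1..p} alpha[j-1] * x[n-j].

def autocorrelation(frame, order):
    """Autocorrelation values r[0..order] of one speech frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(order + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (alpha, k, err): predictor coefficients, reflection (K)
    coefficients, and the final prediction error energy.
    Assumes r[0] > 0 (a non-silent frame).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        ki = acc / err
        new_a = a[:]
        new_a[i] = ki
        for j in range(1, i):
            new_a[j] = a[j] - ki * a[i - j]
        a = new_a
        err *= 1.0 - ki * ki
        k.append(ki)
    return a[1:], k, err


# Autocorrelation of an ideal first-order process with coefficient 0.5.
alpha, k_params, residual = levinson_durbin([1.0, 0.5, 0.25], 2)
# alpha == [0.5, 0.0], k_params == [0.5, 0.0], residual == 0.75
```

Either the α parameters or the K parameters returned here could serve as the vector that is matched against the reference patterns.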
In order to effect a pattern matching which can easily be adapted to any person, it is necessary to separate the vocal tract characteristics and the vocal cord sound source characteristics from each other and to form a spectral distribution either by normalizing both characteristics or by eliminating their speaker dependencies. Unfortunately, however, this has not been taken into consideration in the design of conventional pattern matching systems.