The present invention relates to a speech synthesizer which synthesizes speech by combining voice source to a filter having desired characteristics. The present invention relates to such a system which synthesizes high quality of speech even when speech length and/or speech rate is adjusted.
Conventionally, a speech synthesizer stores a train of feature vectors including a plurality of formant frequencies and formant bandwidthes relating to each phoneme, and feature vector coefficients indicating change of phoneme between adjacent phonemes for every short period, for instance, 5 msec. And, an interpolation calculation has been used for obtaining transient data which are not stored between two phonemes. In that prior art, a steady state portion of a feature vector is shortened and/or elongated according to duration of each phoneme defined by a phoneme and speech rate, by omitting a data and/or repeating the same data.
However, a prior speech synthesizer has the disadvantage that synthesized speech is unnatural, because a transient portion of a phoneme is not modified even when speech rate changes.
A prior speech synthesizer has another disadvantage that the storage capacity required for storing speech data is too large, since it must store the data for every 5 msec.