The present invention relates to a speech synthesizer and, in particular, to a pitch frequency control system for a speech synthesizer in which the accent and the intonation (or phrase) are arbitrarily adjustable, so that smooth and natural speech is synthesized.
Speech is synthesized by using speech parameters, including formant frequencies, formant bandwidths, voice source amplitude and pitch frequency.
In a conventional speech synthesis system, the pitch frequency in each syllable is defined by the pitch frequency at a particular time point in the syllable. The pitch frequency between those particular time points is then calculated by interpolation between the two adjacent defined pitch frequencies.
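The conventional scheme described above can be sketched as follows. This is only an illustration: the text does not state which kind of interpolation is used, so linear interpolation is assumed here, and the function name `interpolate_pitch` and the anchor values are hypothetical.

```python
from bisect import bisect_right


def interpolate_pitch(anchors, t):
    """Return the pitch frequency (Hz) at time t (seconds).

    anchors: list of (time_s, pitch_hz) pairs, one per syllable, sorted by
    strictly increasing time. Linear interpolation is an assumption.
    """
    times = [time for time, _ in anchors]
    # Outside the defined range, hold the nearest defined pitch.
    if t <= times[0]:
        return anchors[0][1]
    if t >= times[-1]:
        return anchors[-1][1]
    # Find the two adjacent anchors that bracket t.
    i = bisect_right(times, t)
    (t0, f0), (t1, f1) = anchors[i - 1], anchors[i]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)


# Hypothetical anchors: one defined pitch per syllable.
anchors = [(0.10, 120.0), (0.30, 150.0), (0.50, 110.0)]
pitch_mid = interpolate_pitch(anchors, 0.20)
```

Because each anchor is a single pitch value, the accent and phrase contributions are already mixed together in it, which is why neither can be adjusted separately in this scheme.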
However, the above prior art has the disadvantage that the accent of each word is not adjustable, because the accent component of each word is not separated from the phrase (or intonation) component.
Another prior art system, which overcomes the above disadvantage, is described in "Analysis of Voice Fundamental Frequency Contours for Declarative Sentences of Japanese" by Hiroya Fujisaki et al., J. Acoust. Soc. Jpn. (E) 5, 4 (1984), pages 233-242. That system can adjust a rapid accent component and a slow phrase component independently of each other, so that a desired accent level and a desired phrase level can be provided.
However, said system by Fujisaki has the disadvantage that the calculation of the pitch frequency is too complicated for hardware of practical size, since time-consuming exponential calculations must be performed to provide the pitch frequency at each particular instant.
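The computational burden can be seen in a minimal sketch of the commonly cited form of the Fujisaki model, in which the logarithm of the pitch frequency is a base value plus phrase and accent components. The formulas follow the form usually quoted in the literature rather than anything stated in this text, and the function names, the constants (alpha, beta, the 0.9 ceiling) and the command values are illustrative assumptions.

```python
import math


def phrase_response(t, alpha):
    """Slow phrase component: impulse response of the phrase control."""
    if t < 0.0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta, ceiling=0.9):
    """Rapid accent component: step response of the accent control."""
    if t < 0.0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), ceiling)


def fujisaki_f0(t, base_hz, phrases, accents, alpha=3.0, beta=20.0):
    """Pitch frequency (Hz) at time t (seconds).

    phrases: list of (onset_s, amplitude) phrase commands.
    accents: list of (onset_s, offset_s, amplitude) accent commands.
    alpha and beta are assumed, illustrative values.
    """
    log_f0 = math.log(base_hz)
    for onset, amp in phrases:
        log_f0 += amp * phrase_response(t - onset, alpha)
    for onset, offset, amp in accents:
        log_f0 += amp * (accent_response(t - onset, beta)
                         - accent_response(t - offset, beta))
    return math.exp(log_f0)


# Hypothetical commands: one phrase command and one accent command.
f0 = fujisaki_f0(0.5, 100.0, [(0.0, 0.5)], [(0.3, 0.7, 0.5)])
```

Every output sample requires one exponential per active phrase command, up to two per accent command, and a final exponential to leave the logarithmic domain, which illustrates the cost referred to above.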