Speech waveforms of natural speech can be expressed by connecting basic units which are made by continuously connecting phonemes, one vowel (V) and one consonant (C) in a form such as “CV”, “CVC” or “VCV”.
Accordingly, a conversation can be created by means of synthetic speech by processing and registering such phonemes as data (phoneme data) in advance, reading out phoneme data corresponding to a conversation from the registered phoneme data in sequence, and generating sounds corresponding to respective read-out phoneme data.
To create a database based on the above-mentioned phoneme data, firstly, a given document is read by a person, and his/her speech is recorded. Then, speech signals reproduced from the recorded speech are divided into the above-mentioned phonemes. Various data indicative of these phonemes are registered as phoneme data. Then, in order to synthesize the speech, respective speech data is connected and supplied as a serial speech.
However, respective connected phonemes are segmented from the separately recorded speeches. Hence, irregularities exist in the vocal power with which the phonemes are uttered. Therefore, a problem arises that synthesized speech is unnatural when the uttered phonemes are merely connected together.
An object of the present invention is to provide a speech synthesizing method for generating natural sounding synthetic speech.