The present invention relates to a speech synthesis apparatus, which has a database for managing phonemic piece data and performs speech synthesis by using the phonemic piece data managed by the database, a control method for the apparatus, and a computer-readable memory.
One conventional speech synthesis method is based on a waveform concatenation scheme. In the waveform concatenation synthesis method, the prosody is changed by the pitch-synchronous overlap-add (PSOLA) method, in which waveform element pieces, each corresponding to one or more pitch periods, are pasted at desired pitch intervals. The waveform concatenation synthesis method can obtain more natural synthetic speech than a synthesis method based on a parametric scheme, but it suffers from a narrow allowable range for changes in prosody.
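The pasting step of the overlap-add method can be illustrated with a minimal sketch. It is not taken from the specification: the function names `hann` and `overlap_add`, the choice of a Hann window, and the use of a single fixed pitch interval are all assumptions made for illustration, and the extraction of pitch-synchronous pieces from recorded speech is omitted.

```python
import math


def hann(n):
    """Return a symmetric Hann window of length n (assumed window shape)."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def overlap_add(pieces, pitch_interval):
    """Paste windowed waveform pieces every `pitch_interval` samples.

    `pieces` is a list of sample lists, each assumed to span about two
    pitch periods. Overlapping samples are summed.
    """
    if not pieces:
        return []
    # The output must hold the piece that ends last.
    length = max(i * pitch_interval + len(p) for i, p in enumerate(pieces))
    out = [0.0] * length
    for i, piece in enumerate(pieces):
        window = hann(len(piece))
        start = i * pitch_interval
        for j, (sample, weight) in enumerate(zip(piece, window)):
            out[start + j] += sample * weight
    return out


# Hypothetical input: three identical 8-sample pieces.
pieces = [[1.0] * 8 for _ in range(3)]
lower_pitch = overlap_add(pieces, pitch_interval=6)   # wider spacing
higher_pitch = overlap_add(pieces, pitch_interval=4)  # narrower spacing
```

In this sketch, a narrower pitch interval places the pieces closer together and so raises the fundamental frequency of the result. The further the interval departs from the spacing at which the pieces were recorded, the more the quality tends to degrade, which corresponds to the narrow allowable range noted above.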
Under these circumstances, attempts have been made to improve speech quality by preparing a variety of speech data and selecting appropriate data for use. Criteria for selecting speech data include the phonemic context (the phoneme to be synthesized, or a few phonemes on either side of it) and the fundamental frequency F0.
The conventional speech synthesis method described above, however, poses the following problems.
If, for example, no data satisfies the phonemic context of the synthesis target, the search for the necessary speech data is repeated with the condition on the phonemic context relaxed. Performing this re-search during speech synthesis complicates the processing and increases the processing time. In addition, when the fundamental frequency F0 is used as a criterion for selecting speech data, each piece of speech data must be evaluated against the fundamental frequency F0 in order to find the speech data that most closely matches the fundamental frequency F0 of the speech to be synthesized.
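The conventional selection procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the `Piece` record, the function `select_piece`, and the particular relaxation order (both neighbors, then either neighbor, then the phoneme alone) are hypothetical and are not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    """Hypothetical record for one piece of speech data."""
    left: str       # preceding phoneme
    phoneme: str    # phoneme this piece realizes
    right: str      # following phoneme
    f0: float       # fundamental frequency in Hz


def select_piece(db, left, phoneme, right, target_f0):
    """Search with progressively relaxed context, then pick the closest F0."""
    # Assumed relaxation order; each stage is a weaker context condition.
    stages = [
        lambda p: p.left == left and p.right == right,  # full context
        lambda p: p.left == left or p.right == right,   # one side only
        lambda p: True,                                 # phoneme only
    ]
    same_phoneme = [p for p in db if p.phoneme == phoneme]
    for matches in stages:
        # Each failed stage forces another pass over the data (the re-search).
        candidates = [p for p in same_phoneme if matches(p)]
        if candidates:
            # Every candidate is evaluated against the target F0.
            return min(candidates, key=lambda p: abs(p.f0 - target_f0))
    return None


# Hypothetical database contents.
db = [
    Piece("k", "a", "t", 120.0),
    Piece("k", "a", "s", 200.0),
    Piece("m", "a", "n", 150.0),
]
exact = select_piece(db, "k", "a", "t", 190.0)
relaxed = select_piece(db, "p", "a", "s", 100.0)
fallback = select_piece(db, "x", "a", "y", 140.0)
```

The sketch makes both costs visible: a target whose context is absent from the database triggers repeated passes at synthesis time, and every surviving candidate must be compared against the target F0 before one can be chosen.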