1. Field of the Invention
The present invention relates to a voice synthesizing system for synthesizing a voice by editing waveforms, a segment generation apparatus for generating information necessary for voice synthesis, a voice synthesizing method, and a storage medium storing a program for implementing the voice synthesizing method.
2. Description of the Related Art
A waveform concatenation system is known as a method for voice synthesis by rule.
In the waveform concatenation system, a large number of voice waveform segments, each of a pitch length, a syllable length or the like, are extracted from a natural voice and stored in a storage device together with information such as the phonemic environment, the pitch shape within phonemes, the amplitude and the continuing period. Upon synthesis, the optimal voice waveform segments are read out according to the rhythmic information or phonemic information set by the synthesizing rule, and the read-out voice waveform segments are connected to obtain a synthesized voice.
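The store, select and connect flow described above can be illustrated with a minimal sketch. It is not taken from any particular system: the names `Segment`, `select_segment` and `synthesize` are hypothetical, the stored information is reduced to a phoneme label and a pitch value, and "optimal" is assumed here to mean the stored segment of the requested phoneme whose pitch is closest to the target.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One stored voice waveform segment with part of its side information."""
    phoneme: str          # phonemic label (hypothetical simplification)
    pitch_hz: float       # pitch at which the segment was extracted
    samples: List[float]  # waveform samples of the segment


def select_segment(db: List[Segment], phoneme: str, target_pitch_hz: float) -> Segment:
    """Return the segment of the given phoneme whose pitch is closest to the target.

    Assumption: closeness in pitch stands in for the selection criteria of a
    real synthesizing rule.
    """
    candidates = [s for s in db if s.phoneme == phoneme]
    if not candidates:
        raise KeyError(f"no segment stored for phoneme {phoneme!r}")
    return min(candidates, key=lambda s: abs(s.pitch_hz - target_pitch_hz))


def synthesize(db: List[Segment], targets: List[Tuple[str, float]]) -> List[float]:
    """Connect the selected segments end to end to form the output waveform."""
    output: List[float] = []
    for phoneme, pitch_hz in targets:
        output.extend(select_segment(db, phoneme, pitch_hz).samples)
    return output
```

A practical system would typically smooth the joins between segments and weigh more of the stored information during selection; plain end-to-end connection is used here only to keep the sketch short.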
Although the waveform concatenation system readily yields a high quality synthesized voice, it has a problem in that a large number of voice waveform segments must be stored for generating the synthesized voice, so that the file size of the voice waveform segments becomes excessively large. This problem is particularly significant when the voice waveform segments are extracted per pitch unit from voiced sound (a voice waveform segment thus extracted will hereinafter be referred to as a “pitch segment”).
As an approach to this problem, attempts have been made to store the voice waveform segments in compressed form in a storage device and to decompress them when they are read out from the storage device. However, the amount of calculation required to decompress the compressed voice waveform segments becomes large.
In the conventional voice synthesizing system, each voice waveform segment to be used is decompressed individually upon voice synthesis, so that the amount of calculation becomes large. The increase in the amount of calculation becomes particularly significant at a higher pitch frequency of the synthesized voice.
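The dependence on pitch frequency can be made concrete with a rough estimate. The sketch below assumes that one pitch segment is consumed per pitch period of voiced output and that each segment costs a fixed number of operations to decompress. Both assumptions, and the names `pitch_segments_needed` and `decompression_cost`, are illustrative only and are not stated in the description above.

```python
def pitch_segments_needed(pitch_hz: float, duration_s: float) -> int:
    """Approximate number of pitch segments used for a voiced stretch.

    Assumption: one pitch segment per pitch period, so the count is the
    pitch frequency multiplied by the duration.
    """
    return round(pitch_hz * duration_s)


def decompression_cost(pitch_hz: float, duration_s: float, ops_per_segment: int) -> int:
    """Total decompression work when every segment is decompressed individually.

    Assumption: a fixed, hypothetical cost of ops_per_segment per segment.
    """
    return pitch_segments_needed(pitch_hz, duration_s) * ops_per_segment
```

Under these assumptions, doubling the pitch frequency of the synthesized voice doubles the number of segments decompressed per second, and hence the decompression work.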
On the other hand, while the conventional voice synthesizing system can make its voice segment database smaller than it would otherwise be by compressing the respective voice waveform segments, a still smaller database is required in some applications. The conventional voice synthesizing system cannot satisfy such a demand.