(a) Field of the Invention
The present invention relates to a voice rule-synthesizer and a compressed voice-element data generator and, more particularly, to techniques for synthesis of voice waveform by rule based on compressed voice-element and for generation of compressed voice-element data for use in the synthesis.
The present invention also relates to a method for synthesizing a voice waveform by using a plurality of original voice data.
(b) Description of the Related Art
A waveform edition scheme is generally used for synthesis of voice waveforms by rule, i.e., for voice rule-synthesis. In this scheme, although a high voice quality is obtained with relative ease compared to other techniques, there is a problem in that a storage capacity used for storing voice elements, called original waveforms, is large because a large amount of original waveforms should be stored for creating different synthesized voice waveforms therefrom. The large storage capacity raises the cost for the voice synthesis by rule.
In order to solve the problem of the large storage capacity, conventional techniques attempt to use a compression scheme for compressing the voice elements. Patent Publication JP-A-8-160991, for example, describes such a technique, wherein a difference between adjacent pitches is stored instead of the voice element in a memory for reducing the storage capacity.
Patent Publication JP-A-5-73100 describes a technique wherein a vector quantization is conducted only for spectrum information to create compressed parameter patterns, which are stored in a code book.
In the conventional techniques as described above, it is difficult to compress the voice element with a higher degree of compression factor while suppressing degradation of the voice quality. In particular, since the voice elements used for voice synthesis are generally collected from a plurality of separate voice data, there exist a large number of short voice data sections corresponding to the separate voice data. The short voice data section generally involves a large compression distortion especially in the vicinity of the start point of the voice data section if a large compression factor is used. This raises the overall distortion of the resultant synthesized voices including a large number of voice data sections, and degrades the voice quality of the synthesized voices.