1. Field of the Invention
The present invention relates to a compression apparatus for compressing waveform dictionary data composed of speech waveform data used for speech synthesis to create a compressed dictionary, and an expansion apparatus for expanding compressed data of the compressed dictionary.
2. Description of the Related Art
Owing to the recent rapid development of computer technology, speech synthesis technology, whose use was conventionally limited to particular fields, is becoming applicable to a variety of fields. Accordingly, various applications using speech synthesis are being actively developed.
To make applications that use speech synthesis easier to use, high-quality speech synthesis must be realized. This requires preparing a large amount of sound waveform data, which occupies a relatively large storage capacity. Efficient compression and expansion of large-capacity waveform data is therefore important from a technical point of view.
For example, various procedures for compressing sound waveform data have been considered, such as μ-law, ADPCM, and CELP (listed in order of increasing compression ratio). In general, sound quality tends to degrade as the compression ratio increases.
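As an illustration of the simplest of these procedures, the following is a minimal sketch of μ-law companding followed by 8-bit quantization. It uses the standard continuous μ-law formula with μ = 255. The function names and the simple uniform quantizer are assumptions made for illustration and are not taken from the apparatus described here.

```python
import math

MU = 255  # companding constant commonly used for 8-bit mu-law


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map a sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def quantize(y: float, bits: int = 8) -> int:
    """Uniformly quantize a companded value to a signed integer code (illustrative)."""
    levels = 2 ** (bits - 1) - 1
    return round(y * levels)


def dequantize(q: int, bits: int = 8) -> float:
    """Map a signed integer code back to a companded value in [-1, 1]."""
    levels = 2 ** (bits - 1) - 1
    return q / levels


def round_trip(x: float) -> float:
    """Compress, quantize to 8 bits, then expand again."""
    return mu_law_expand(dequantize(quantize(mu_law_compress(x))))
```

Because the logarithmic curve spends more quantization levels on small amplitudes, quiet samples survive the 8-bit round trip with a small absolute error, while the overall compression ratio stays modest compared with ADPCM or CELP.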
FIG. 1 is a diagram illustrating the principle of a conventionally used compression/expansion apparatus. In FIG. 1, reference numeral 11 denotes a waveform data input part, 12 denotes a waveform data compression/storage part, 13 denotes a waveform dictionary, 14 denotes a text data input part, 15 denotes a waveform dictionary reference/extraction part, 16 denotes a waveform data expansion part, and 17 denotes a synthesized speech output part.
In FIG. 1, only waveform data is a target for compression/expansion. Waveform data is input from the waveform data input part 11, compressed in the waveform data compression/storage part 12, and stored in the waveform dictionary 13 as compressed waveform data.
Text data is input from the text data input part 14. The waveform dictionary reference/extraction part 15 refers to the waveform dictionary 13 and extracts the compressed waveform data matching the text data. The extracted waveform data is expanded in the waveform data expansion part 16 during synthesis and reproduction of speech, and reproduced in the synthesized speech output part 17.
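The data flow of FIG. 1 can be sketched roughly as follows. This is only an illustrative model under stated assumptions: `zlib` stands in for the unspecified waveform compressor, whitespace splitting stands in for text analysis, and all function names are hypothetical labels for parts 12, 15, and 16.

```python
import zlib

# Waveform dictionary (13): unit label -> compressed waveform data.
waveform_dictionary: dict[str, bytes] = {}


def compress_and_store(label: str, pcm: bytes) -> None:
    """Compression/storage part (12): compress and store one waveform unit."""
    waveform_dictionary[label] = zlib.compress(pcm)  # zlib is a stand-in codec


def reference_and_extract(units: list[str]) -> list[bytes]:
    """Reference/extraction part (15): fetch compressed data for each unit."""
    return [waveform_dictionary[u] for u in units]


def expand(compressed_units: list[bytes]) -> bytes:
    """Expansion part (16): expand every extracted unit at synthesis time."""
    return b"".join(zlib.decompress(c) for c in compressed_units)


def synthesize(text: str) -> bytes:
    """Text input (14) to output (17); splitting on spaces is an assumption."""
    return expand(reference_and_extract(text.split()))
```

Note that in this arrangement every extracted unit is expanded at synthesis time, so the cost of the chosen codec is paid on each reproduction.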
However, with the above-mentioned compression/expansion method, waveform data of higher quality and higher compression ratio consumes a larger amount of computer resources during expansion, so that expansion alone takes a considerable amount of time. This makes it impossible to conduct speech synthesis in real time.
Furthermore, some compression apparatuses cannot compress speech on a phoneme basis and can generate compressed waveform data only on a syllable or sentence basis. Therefore, when the waveform data required for speech synthesis is smaller than the compression unit of the waveform data, portions not needed for the speech synthesis must also be expanded. As a result, expansion takes longer than necessary.
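The overhead of a coarse compression unit can be illustrated with a short sketch. Here `zlib` again stands in for the actual codec, and the byte offsets of the phoneme inside the syllable are hypothetical; the point is only that the whole unit must be expanded before the required portion can be sliced out.

```python
import zlib


def extract_phoneme(compressed_unit: bytes, start: int, end: int) -> tuple[bytes, int]:
    """Return the requested phoneme bytes and the number of bytes expanded in vain.

    The unit was compressed as a whole, so the whole unit is expanded first
    (zlib is an assumed stand-in for the waveform codec).
    """
    expanded = zlib.decompress(compressed_unit)  # entire syllable/sentence unit
    wanted = expanded[start:end]                 # only this part is needed
    wasted = len(expanded) - len(wanted)         # expanded but discarded
    return wanted, wasted


# Hypothetical syllable of 800 bytes in which the phoneme occupies bytes 100..300.
syllable = bytes(range(200)) * 4
unit = zlib.compress(syllable)
phoneme, wasted_bytes = extract_phoneme(unit, 100, 300)
```

In this example three quarters of the expanded data is discarded, which corresponds to the unnecessary expansion time described above.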