This invention relates to a partial auto correlation type speech synthesizer in which voice waveforms are analyzed to extract characteristic parameters, the characteristic parameters thus extracted are transferred to memory means at a given rate (hereinafter referred to as "a frame period"), and with the aid of a digital filter, voice waveforms are synthesized and outputted according to the characteristic parameters.
Most speech synthesizers in practical use are of the partial auto correlation type. Circuits for synthesizing the voice waveforms are integrated on one silicon chip. Such a speech synthesizer is, in general, obtained by integrating function circuits 100 on the synthesis side of an analysis and synthesis system as shown in FIG. 1.
In FIG. 1, reference numeral 300 designates a parameter file which is adapted to store characteristic parameters of voices which have been analyzed and extracted by an analyzer 200.
The speech synthesizer comprises essential components which are arranged as shown in the block diagram of FIG. 2. More specifically, the speech synthesizer comprises decoders 110, 120 and 130 for decoding the pitch, voiced/unvoiced discrimination code, the amplitude and the partial auto correlation coefficients (so-called K parameters) of the characteristic data D which is extracted from a voice waveform and is quantized by the analyzer 200 in FIG. 1; memories 111, 121 and 131 for temporarily storing the parameters thus decoded, respectively; a pulse generating circuit 112 for producing a train of pulses corresponding to the value of the pitch parameter output by the memory 111; a white noise generating circuit 113 for generating white noise which is used as an exciting signal for unvoiced sound; an exciting signal selecting circuit 114 for selecting either the pulse train or the white noise signal as the exciting signal according to the voiced or unvoiced discrimination code; an amplitude multiplication circuit 140 for multiplying the exciting signal by the content of the amplitude memory 121; a digital filter 150 for extracting a predetermined frequency spectrum component from the exciting signal using a filter coefficient corresponding to the content of the K parameter memory 131; and a D/A converter 160 for converting a digital value provided by the digital filter 150 into an analog signal.
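The excitation path described above (pulse generating circuit 112, white noise generating circuit 113, selecting circuit 114 and amplitude multiplication circuit 140) may be illustrated by the following minimal Python sketch. The function name `excitation`, its parameters, the unit-height pulses and the use of random +1/-1 samples as white noise are assumptions made for illustration only; the circuits themselves are not specified at this level of detail.

```python
import random

def excitation(n_samples, voiced, pitch_period, amplitude, seed=0):
    """Illustrative excitation source (hypothetical helper, not the circuit itself).

    voiced=True  -> pulse train, one unit pulse every `pitch_period` samples
    voiced=False -> white noise, modeled here as random +1/-1 samples
    The selected signal is then scaled by `amplitude`.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    out = []
    for n in range(n_samples):
        if voiced:
            e = 1.0 if n % pitch_period == 0 else 0.0
        else:
            e = rng.choice((-1.0, 1.0))
        out.append(amplitude * e)
    return out

# Example: a voiced frame of 8 samples with a pitch period of 4 samples.
voiced_frame = excitation(8, True, 4, 2.0)
```

In this sketch the voiced/unvoiced flag plays the role of the discrimination code, and the final multiplication corresponds to the amplitude multiplication stage.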
The speech synthesizer further comprises a timing signal generating circuit (not shown) for operating the various above-described circuit elements with suitable timing; and an interface circuit (not shown) for sequentially loading the time-series data, which are obtained by voice analysis and are stored in external memories, into the decoders 110, 120 and 130.
In such a speech synthesizer, the analysis data is subjected to compression, in order to more economically use the memory which stores the voice data. Even when a one second voice interval is compressed to the extent of about 2000 bits, the clarity is maintained substantially unchanged; that is, the method is practical. There is a variety of known voice compressing methods. In one example, the amplitude parameter is assigned 4 to 6 bits, the pitch parameter is assigned 5 to 6 bits, and in the case of the K parameters, K.sub.1 through K.sub.10 are assigned 5, 5, 4, 4, 4, 4, 4, 3, 3 and 3 bits, or 7, 5, 4, 4, 4, 3, 3, 3, 3 and 3 bits in the stated order, in what is called a "non-uniform bit distribution".
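The bit budget implied by these figures may be checked with the following short Python sketch. The helper names are hypothetical, the voiced/unvoiced discrimination code is not counted, and the frame rate derived from the figure of about 2000 bits per second is an inference made for illustration, not a value stated in the text.

```python
# Bit allocations quoted above for K1..K10 (two alternative schemes).
K_BITS_A = [5, 5, 4, 4, 4, 4, 4, 3, 3, 3]
K_BITS_B = [7, 5, 4, 4, 4, 3, 3, 3, 3, 3]

def frame_bits(k_bits, amplitude_bits, pitch_bits):
    """Bits for one frame of parameters (voiced/unvoiced code not counted)."""
    return sum(k_bits) + amplitude_bits + pitch_bits

def frames_per_second(bits_per_second, bits_per_frame):
    """Frame rate that a given bit rate could sustain (illustrative only)."""
    return bits_per_second / bits_per_frame

smallest = frame_bits(K_BITS_A, 4, 5)   # 4-bit amplitude, 5-bit pitch
largest = frame_bits(K_BITS_B, 6, 6)    # 6-bit amplitude, 6-bit pitch
approx_rate = frames_per_second(2000, largest)
```

Both K parameter schemes use 39 bits in total, so a frame occupies between 48 and 51 bits under these assumptions, which corresponds to roughly 40 frames per second at about 2000 bits per second.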
The decoders 110, 120 and 130 in FIG. 2 operate to decode these quantized parameter codes into the true values of analysis data, thus forming tables having the numbers of words corresponding to the respective numbers of bits. Generally, because of a limitation in the formation of circuits, the digital value to be decoded has an accuracy of 10 bits.
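The table structure described above, in which an n-bit code addresses one of 2^n words each holding a 10-bit value, may be sketched in Python as follows. The uniform spacing of the decoded values over the range (-1, 1) and the signed integer representation are assumptions made purely for illustration; the actual table contents are not given here and would in practice follow the quantization used by the analyzer.

```python
def build_decode_table(code_bits, value_bits=10):
    """Hypothetical decode table: 2**code_bits words of `value_bits` accuracy.

    Assumes uniform quantization of a parameter in (-1, 1), with each word
    stored as a signed integer (illustrative choice, not from the source).
    """
    n_words = 1 << code_bits
    full_scale = (1 << (value_bits - 1)) - 1  # 511 for 10 bits
    table = []
    for code in range(n_words):
        k = (2 * code + 1) / n_words - 1.0    # midpoint of the code's cell
        table.append(round(k * full_scale))
    return table

def decode(table, code):
    """Decoding is a plain table lookup addressed by the quantized code."""
    return table[code]

table_3bit = build_decode_table(3)  # 8 words, e.g. for a 3-bit K parameter
```

A parameter assigned 3 bits thus needs a table of 8 words, and one assigned 7 bits needs a table of 128 words.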
The above-described speech synthesizer can provide quite a natural synthesized voice using a small voice data memory. However, the speech synthesizer cannot provide a musical tone of high quality such as a sinusoidal wave because of the spectral distortion due to quantization, or because of a high modulation noise due to the unsatisfactory matching of the exciting signal frequency to the pole frequency of the digital filter.
The digital filter 150 is a multistage lattice-type filter which, as shown in FIG. 3, comprises an adder/subtractor 151, a multiplier 152 and a delay unit 153.
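A multistage lattice filter built from these three elements may be sketched in Python as follows. This is a common textbook form of the all-pole lattice synthesis recursion, given under stated assumptions: the sign convention, the stage ordering and the function name `lattice_synthesize` are illustrative and are not taken from FIG. 3.

```python
def lattice_synthesize(excitation, k):
    """All-pole lattice synthesis filter (one common sign convention).

    Per sample n, for stages i = M down to 1:
        f[i-1] = f[i] - k[i] * b[i-1](n-1)      # adder/subtractor, multiplier
        b[i]   = k[i] * f[i-1] + b[i-1](n-1)    # multiplier, adder
    with f[M] = input sample, output = f[0], and b[0] = f[0] (delay unit).
    """
    m = len(k)
    b = [0.0] * (m + 1)  # b[i] holds the delayed backward signal of stage i
    out = []
    for e in excitation:
        f = e
        for i in range(m, 0, -1):
            f = f - k[i - 1] * b[i - 1]
            b[i] = k[i - 1] * f + b[i - 1]
        b[0] = f
        out.append(f)
    return out

# Example: impulse response of a single stage with k = 0.5.
impulse_response = lattice_synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
```

With all coefficients of magnitude less than one the recursion is stable, and with all coefficients equal to zero the filter passes the exciting signal through unchanged.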