The present invention relates to a sound synthesizer with which it is possible to reconstruct a sound of substantially the same quality as an original sound from its features, which are transmitted or stored in a memory using a small amount of information.
For example, in the prior art, speech is reconstructed from feature parameters of the original speech as follows. The output of a pulse generator simulating the vibration of the vocal cords and the output of a noise generator simulating turbulence are switched or mixed together depending on whether the speech is voiced or unvoiced. The resulting output is amplitude-modulated in accordance with the speech amplitude to produce an excitation source signal, which is applied to a filter simulating the resonance characteristics of the vocal tract to obtain synthesized speech.
A synthesis system using partial autocorrelation (PARCOR) coefficients and a formant synthesis system are examples of such speech synthesis systems employing feature parameters. The former is set forth, for example, in J. D. Markel et al., "Linear Prediction of Speech", pages 92-128, Springer-Verlag, 1976, in which the partial autocorrelation coefficients, the so-called PARCOR coefficients, of a speech waveform are used as the feature parameters. If the absolute values of the PARCOR coefficients are all smaller than unity, the speech synthesizing filter is stable. The PARCOR coefficients require a relatively small amount of information for speech synthesis and are relatively easy to extract automatically, but the individual parameters differ widely in spectral sensitivity. Accordingly, if all the parameters are quantized using the same number of bits, the spectral distortions caused by the quantization errors of the respective parameters differ greatly from one another. Further, the PARCOR coefficients have poor interpolation characteristics: interpolating the parameters produces noise, resulting in indistinct speech. Especially at a low bit rate, the speech quality is degraded by the spectral distortion and no satisfactory synthesized speech quality is obtainable.
In addition, the PARCOR coefficients do not directly correspond to spectral properties such as formant frequencies, and hence the PARCOR coefficients are not suitable for speech synthesis by rule.
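By way of illustration only, the prior-art arrangement described above (a pulse or noise excitation source, amplitude scaling, and a vocal tract filter controlled by PARCOR coefficients) may be sketched as follows. This is a minimal sketch and not the implementation of any cited reference: the function names, the use of a uniform noise source, and the sign convention of the lattice recursion are assumptions made for the example. The stability check reflects the condition stated above, namely that every PARCOR coefficient has an absolute value smaller than unity.

```python
import random


def make_excitation(n_samples, voiced, pitch_period, amplitude, seed=0):
    """Excitation source: pulse train if voiced, noise if unvoiced (hypothetical helper)."""
    rng = random.Random(seed)
    if voiced:
        # One pulse every pitch_period samples simulates vocal cord vibration.
        return [amplitude if n % pitch_period == 0 else 0.0
                for n in range(n_samples)]
    # Uniform noise stands in for turbulence (an assumption of this sketch).
    return [amplitude * rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def is_stable(parcor):
    """The synthesis filter is stable if every |k_i| is smaller than unity."""
    return all(abs(k) < 1.0 for k in parcor)


def lattice_synthesize(excitation, parcor):
    """All-pole lattice filter driven by PARCOR (reflection) coefficients.

    Assumed sign convention, applied for i = p, ..., 1 at each sample:
        f_{i-1}[n] = f_i[n] - k_i * b_{i-1}[n-1]
        b_i[n]     = b_{i-1}[n-1] + k_i * f_{i-1}[n]
    with f_p[n] the excitation and f_0[n] = b_0[n] the output.
    """
    p = len(parcor)
    b = [0.0] * (p + 1)  # backward errors carried over from the previous sample
    out = []
    for e in excitation:
        f = e
        for i in range(p, 0, -1):
            k = parcor[i - 1]
            f = f - k * b[i - 1]     # forward error of the next lower stage
            b[i] = b[i - 1] + k * f  # backward error of stage i, current sample
        b[0] = f
        out.append(f)
    return out
```

For example, `lattice_synthesize(make_excitation(160, True, 80, 1.0), [0.5, -0.3])` would produce one 160-sample voiced frame; the frame length, pitch period, and coefficient values here are arbitrary illustrative choices.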
The formant synthesis system is disclosed, for example, in J. L. Flanagan, "Speech Analysis, Synthesis and Perception", pages 339-347, Springer-Verlag, 1972. This system synthesizes speech using the formant frequencies and their intensities as parameters. It is advantageous in that the amount of information for the parameters may be small and in that the correspondence of the parameters to spectral quantities is easy to obtain. Extracting the formant frequencies and their intensities, however, requires the use of general dynamic characteristics and statistical properties of the parameters, and fully automatic extraction is difficult. Accordingly, it is difficult to obtain synthesized speech of high quality automatically, and an error in the extraction of the parameters is likely to markedly degrade the quality of the synthesized speech.
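As a rough illustration of how formant frequencies and intensities can serve as synthesis parameters, the following sketch passes an excitation signal through one second-order resonator per formant and sums the outputs weighted by intensity. It is a simplified parallel arrangement assumed for this example and is not taken from the cited reference; the function names and the fixed 80 Hz bandwidth are likewise assumptions, since the description above names only frequency and intensity as parameters.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, fs_hz):
    """Second-order digital resonator with a pole pair at the formant frequency."""
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)  # pole radius set by bandwidth
    c1 = 2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs_hz)
    c2 = -r * r
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in signal:
        y = x + c1 * y1 + c2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def formant_synthesize(excitation, formants, fs_hz, bandwidth_hz=80.0):
    """Sum resonator outputs, each scaled by its formant intensity.

    formants is a list of (frequency in Hz, intensity) pairs; the common
    bandwidth_hz default is an assumption of this sketch.
    """
    out = [0.0] * len(excitation)
    for freq_hz, intensity in formants:
        band = resonator(excitation, freq_hz, bandwidth_hz, fs_hz)
        out = [acc + intensity * s for acc, s in zip(out, band)]
    return out
```

In this sketch an error in a single extracted formant frequency moves the corresponding spectral peak directly, which is consistent with the sensitivity to extraction errors noted above.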
It is an object of the present invention to provide a sound synthesizer which is able to synthesize a sound of high quality using a small amount of information.
Another object of the present invention is to provide a sound synthesizer which permits relatively easy extraction of the feature parameters, which operates stably, and in which differences in spectral sensitivity among the parameters are small, so that quantizing the parameters with the same number of bits yields the same quantization accuracy for each.
Another object of the present invention is to provide a sound synthesizer whose parameters have excellent interpolation characteristics and which is therefore able to obtain a synthesized sound of high quality with a small amount of information.
Yet another object of the present invention is to provide a sound synthesizer which can be produced with a relatively simple structure.