1. Field of the Invention
The present invention relates to a speech coding device advantageously applicable to a CELP (Code Excited Linear Prediction) coding system or an MPE (Multi-Pulse Excitation) linear prediction coding system.
2. Description of the Background Art
Today, AbS (Analysis by Synthesis) systems, e.g., the CELP coding system and the MPE linear prediction coding system, are available for the low bit rate coding and decoding of speech and are predominant over other systems. Generally, the problem with models for the study of speech is that, with many of them, it is difficult to determine the value of a parameter for a given input speech by an analytical approach. The AbS system is one solution to this problem: it varies the parameter over a certain range, actually synthesizes speech for each value, and then selects the synthetic speech having the smallest distance to the input speech. This kind of coding and decoding scheme is taught in, e.g., B. S. Atal, "HIGH-QUALITY SPEECH AT LOW BIT RATES: MULTI-PULSE AND STOCHASTICALLY EXCITED LINEAR PREDICTIVE CODERS", Proc. ICASSP, pp. 1681-1684, 1986.
Briefly, the AbS system synthesizes a plurality of speech signals in response to an input speech signal, and generates error signals representative of the differences between the synthetic speech signals and the input speech signal. Subsequently, the system computes the square sum of each error signal, and then selects the synthetic speech signal having the smallest square sum. To generate the synthetic speech signals, a plurality of excitation signals prepared beforehand are used. For the excitation, the CELP system uses random Gaussian noise, and the MPE system uses a pulse sequence.
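The search described above can be sketched as follows. This is a minimal illustration only, not the coder of the cited reference: the function names, the all-pole synthesis filter, and its coefficients are assumptions introduced for the example, and practical coders add elements such as gain terms that are omitted here.

```python
import random


def synthesize(excitation, lpc):
    """All-pole synthesis filter (assumed form): s[n] = e[n] + sum_k lpc[k] * s[n-1-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc):
            if n - 1 - k >= 0:
                acc += a * out[n - 1 - k]
        out.append(acc)
    return out


def square_sum_error(target, synthetic):
    """Square sum of the error signal between input speech and synthetic speech."""
    return sum((t - s) ** 2 for t, s in zip(target, synthetic))


def abs_search(target, codebook, lpc):
    """Synthesize speech from every candidate excitation and keep the one
    whose error signal has the smallest square sum."""
    best_index, best_error = -1, float("inf")
    for index, excitation in enumerate(codebook):
        err = square_sum_error(target, synthesize(excitation, lpc))
        if err < best_error:
            best_index, best_error = index, err
    return best_index, best_error


def gaussian_codebook(size, length, seed=0):
    """CELP-style excitations: vectors of random Gaussian noise prepared beforehand."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(size)]


def pulse_excitation(length, positions, amplitudes):
    """MPE-style excitation: a sparse pulse sequence."""
    seq = [0.0] * length
    for pos, amp in zip(positions, amplitudes):
        seq[pos] = amp
    return seq


# Hypothetical usage: the target is built from codebook entry 5,
# so the search should recover that entry with zero error.
lpc = [0.9, -0.2]                      # illustrative coefficients, not from the source
codebook = gaussian_codebook(8, 16)
target = synthesize(codebook[5], lpc)
best_index, best_error = abs_search(target, codebook, lpc)
```

The selection criterion in this sketch is the plain square sum of the error signal, i.e., the evaluation measure that the background discussion is concerned with.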
The problem with the AbS system is that evaluating the excitation signals by the square sums of the error signals alone cannot render the synthetic speech signal sufficiently natural to human auditory perception. For example, an unnatural waveform absent from the original speech signal is apt to appear in the synthetic speech signal. Under these circumstances, there is an increasing demand for a speech coding device capable of producing, without deteriorating perceptual naturalness, a synthetic speech signal faithfully representing an input speech signal.