Conventional devices for speech-signal coding, usually known in the art as "Vocoders", use a speech synthesis method in which a synthesis filter, whose transfer function simulates the frequency behavior of the vocal tract, is excited by pulse trains at pitch frequency for voiced sounds or by white noise for unvoiced sounds.
This excitation technique is not very accurate. In fact, the binary choice between pitch pulses and white noise is too rigid and considerably degrades the quality of the reproduced sound.
Moreover, both the voiced-unvoiced decision and the pitch value are difficult to determine.
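By way of illustration only, the two-state excitation described above can be sketched as follows. The function names, the pitch period of 80 samples, the frame length of 160 samples and the single filter coefficient 0.9 are assumptions chosen for the example and are not taken from any particular vocoder.

```python
import random

def two_state_excitation(n_samples, voiced, pitch_period=80, gain=1.0, seed=0):
    """Pulse train at the pitch period for voiced frames, white noise otherwise."""
    if voiced:
        return [gain if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)  # seeded only to make the example repeatable
    return [gain * rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def all_pole_filter(excitation, lpc):
    """Synthesis filter: s[n] = e[n] + sum over k of lpc[k-1] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * out[n - k]
        out.append(acc)
    return out

# Hypothetical first-order filter; a real vocoder would use about ten coefficients.
voiced_excitation = two_state_excitation(160, voiced=True)
voiced_frame = all_pole_filter(voiced_excitation, lpc=[0.9])
unvoiced_frame = all_pole_filter(two_state_excitation(160, voiced=False), lpc=[0.9])
```

The sketch makes the rigidity visible: each frame is driven entirely by one excitation type or the other, with no mixed or intermediate form.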
A known method for exciting the synthesis filter, intended to overcome the above disadvantages, is described in the paper by B. S. Atal and J. R. Remde, "A New Model of LPC Excitation for Producing Natural-Sounding Speech at Low Bit Rates", International Conference on ASSP, pp. 614-617, Paris 1982.
This method uses a multi-pulse excitation, i.e. an excitation consisting of a train of pulses whose amplitudes and positions in time are determined so as to minimize a perceptually meaningful distortion measure. The distortion measure is obtained by comparing the synthesis filter output samples with the speech samples and weighting the difference by a function which takes into account how human auditory perception evaluates the introduced distortion.
Nevertheless, this method cannot offer good reproduction quality at bit rates lower than 10 kbit/s. In addition, the algorithms for computing the excitation pulses require an unsatisfactorily high number of computations.
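By way of illustration only, a multi-pulse search of the kind described above can be sketched as a greedy procedure which places one pulse at a time. The sketch rests on several assumptions not stated in the text: the target is taken to be the already weighted speech frame, the weighting is folded into the impulse response by scaling the k-th filter coefficient by gamma to the power k, and the names impulse_response and multipulse_search, together with all numerical values, are hypothetical. It is not presented as the exact procedure of the cited paper.

```python
def impulse_response(lpc, length, gamma=1.0):
    """Impulse response of the all-pole filter with coefficients lpc[k-1] * gamma**k."""
    h = []
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc += a * (gamma ** k) * h[n - k]
        h.append(acc)
    return h

def multipulse_search(target, h, n_pulses):
    """Greedily pick pulse positions and amplitudes that reduce the squared error."""
    n = len(target)
    residual = list(target)
    pulses = []
    for _ in range(n_pulses):
        best = None
        for m in range(n):
            span = min(n - m, len(h))
            corr = sum(residual[m + i] * h[i] for i in range(span))
            energy = sum(h[i] * h[i] for i in range(span))
            if energy == 0.0:
                continue
            reduction = corr * corr / energy  # error removed by a pulse at m
            if best is None or reduction > best[0]:
                best = (reduction, m, corr / energy)
        _, pos, amp = best
        pulses.append((pos, amp))
        # Remove the contribution of the chosen pulse from the residual.
        for i in range(min(n - pos, len(h))):
            residual[pos + i] -= amp * h[i]
    return pulses, residual

# Synthetic target built from two known pulses, so that the result can be checked.
h = impulse_response([0.9], 40, gamma=0.8)
target = [0.0] * 40
for pos, amp in [(5, 2.0), (20, -1.0)]:
    for i in range(40 - pos):
        target[pos + i] += amp * h[i]
pulses, residual = multipulse_search(target, h, n_pulses=2)
```

Every candidate position requires a correlation over the remainder of the frame, and the whole scan is repeated for each pulse, which gives an indication of why the number of computations grows quickly with frame length and pulse count.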