Code-excited linear predictive coding (CELP) is a well-known speech coding technique. It synthesizes speech by using encoded excitation information to excite a linear predictive coding (LPC) filter. The excitation is found by searching through a table of candidate excitation vectors on a frame-by-frame basis.
LPC analysis is performed on the input speech to determine the LPC filter. The analysis proceeds by comparing the outputs of the LPC filter when it is excited by the various candidate vectors from the table, or codebook. The best candidate is chosen based on how well its corresponding synthesized output matches the input speech. After the best match has been found, information specifying the best codebook entry and the filter is transmitted to the synthesizer. The synthesizer has a similar codebook and accesses the appropriate entry in that codebook, using it to excite the same LPC filter.
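The search described above can be illustrated with a minimal sketch. It assumes an all-pole LPC filter with zero initial state and uses the plain sum of squared differences as the match measure; the names `synthesize`, `squared_error`, and `search_codebook` are hypothetical and chosen only for illustration, and the sketch omits any gain term or filter memory carried between frames.

```python
def synthesize(excitation, lpc_coeffs):
    """All-pole synthesis: s[n] = e[n] - sum_k a[k] * s[n-1-k], zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc_coeffs):
            if n - 1 - k >= 0:
                acc -= a * out[n - 1 - k]
        out.append(acc)
    return out


def squared_error(original, synthesized):
    """Sum of squared differences between two equal-length sample sequences."""
    return sum((o - s) ** 2 for o, s in zip(original, synthesized))


def search_codebook(speech_frame, codebook, lpc_coeffs):
    """Excite the filter with every candidate and keep the closest match."""
    best_index, best_error = None, float("inf")
    for index, candidate in enumerate(codebook):
        error = squared_error(speech_frame, synthesize(candidate, lpc_coeffs))
        if error < best_error:
            best_index, best_error = index, error
    return best_index, best_error
```

In this sketch only the returned index and the filter coefficients would need to reach the synthesizer, which would then call `synthesize` on its own copy of the selected codebook entry.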
The codebook is made up of vectors whose components are consecutive excitation samples. Each vector contains the same number of excitation samples as there are speech samples in a frame. The vectors can be constructed in one of two ways. In the first method, disjoint sets of samples are used to define the vectors. In the second method, the overlapping codebook, the vectors are defined by shifting a window along a linear array of excitation samples.
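The two constructions can be sketched as follows. The function names are hypothetical, and the `shift` parameter (the number of samples the window advances between consecutive vectors) is an assumption, since the amount of shift is not specified above.

```python
def disjoint_codebook(samples, frame_len):
    """Split the sample array into non-overlapping vectors of frame_len samples."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def overlapping_codebook(samples, frame_len, shift=1):
    """Slide a frame_len window along the array, advancing `shift` samples per vector."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, shift)]
```

For the same linear array, the overlapping construction yields more vectors than the disjoint one: with 8 stored samples and 4-sample frames, the disjoint method gives 2 vectors, while a one-sample shift gives 5.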
The excitation samples used in the vectors in the CELP codebook can come from a number of possible sources. One particular example is the Stochastically Excited Linear Prediction (SELP) method, which uses white noise, or random numbers, as the samples. Another method is to use an adaptive codebook. In such a scheme, the synthetic excitation determined for the present frame is used to update the codebook for future frames. This procedure allows the excitation codebook to adapt to the speech.
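The two sample sources can be sketched as below. This is an illustration under stated assumptions, not a description of any particular coder: the stochastic array is assumed to hold Gaussian random numbers, and the adaptive update is assumed to discard the oldest samples of a fixed-length linear array and append the excitation chosen for the present frame. Both function names are hypothetical.

```python
import random


def stochastic_array(length, seed=0):
    """White-noise samples (assumed Gaussian); a fixed seed lets both ends build the same array."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def update_adaptive_array(excitation_array, frame_excitation):
    """Drop the oldest samples and append the excitation chosen for the present frame.

    Assumes the frame is no longer than the array, so the array length is preserved.
    """
    n = len(frame_excitation)
    return list(excitation_array[n:]) + list(frame_excitation)
```

Codebook vectors for the next frame would then be drawn from the updated array, so the candidates follow the recent excitation history.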
A problem with CELP techniques for coding speech is that every set of excitation information in the codebook must be used to excite the LPC filter, and the results must then be compared using an error criterion. Normally, the error criterion is the sum of the squared differences between the original and the synthesized speech samples produced by each set of excitation information. These calculations involve convolving each set of excitation information stored in the codebook with the LPC filter, and they are performed using vector and matrix operations on the excitation information and the LPC filter. The problem is the large number of calculations that must be performed: approximately 500 million multiply-add operations per second for a 4.8 kbps vocoder.
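How the operation count grows can be seen from a rough estimate. The sketch below assumes the convolution is carried out as a product with a lower-triangular matrix built from the filter's impulse response, plus one multiply-add per sample for the squared error. The function name and the parameter values in the usage line are illustrative assumptions, not the configuration behind the figure quoted above.

```python
def multiply_adds_per_second(codebook_size, frame_len, sample_rate_hz):
    """Estimate multiply-adds per second for an exhaustive codebook search."""
    # Lower-triangular matrix times a frame_len vector: 1 + 2 + ... + frame_len products.
    convolution = frame_len * (frame_len + 1) // 2
    # One multiply-add per sample to accumulate the squared error.
    error = frame_len
    frames_per_second = sample_rate_hz / frame_len
    return codebook_size * (convolution + error) * frames_per_second


# Hypothetical values: 1024 candidate vectors, 40-sample frames, 8 kHz sampling.
estimate = multiply_adds_per_second(1024, 40, 8000)
```

The cost is proportional to the codebook size and grows roughly linearly with the frame length at a fixed sampling rate, which is why an exhaustive search quickly reaches hundreds of millions of operations per second.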