One coding model widely applied in the speech coding field is Code Excited Linear Prediction (CELP). The CELP model uses a nearly white excitation signal to excite two time-varying linear recursive filters; the excitation signal is generally selected from a codebook composed of Gaussian white noise sequences. The feedback loop of each filter includes a predictor. One predictor is a long-term predictor (or pitch predictor), denoted P(z), which generates the tonal structure of voiced speech (that is, the fine structure of the spectrum). The other is a short-term predictor, denoted F(z), which recovers the short-term spectral envelope of the speech. This model derives from the reverse process at the encoder: F(z) removes the redundancy between nearby samples of the speech signal, and P(z) removes the redundancy between distant samples. Two levels of prediction yield a normalized residual signal that approximately follows a standard normal distribution.
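The synthesis side of this model can be sketched as follows: a white excitation passes through the pitch-predictor feedback loop (a single-tap P(z) here, for simplicity) and then through the short-term all-pole filter built from F(z). The pitch lag, pitch gain, and LPC coefficients below are illustrative values, not parameters from any particular codec.

```python
import numpy as np

def synthesize(excitation, pitch_lag, pitch_gain, lpc_coeffs):
    """Pass excitation through the long-term (pitch) synthesis filter,
    then through the short-term (LPC) synthesis filter."""
    n = len(excitation)
    # Long-term synthesis: y[i] = e[i] + g * y[i - T], i.e. P(z) = g * z^-T
    y = np.zeros(n)
    for i in range(n):
        past = y[i - pitch_lag] if i >= pitch_lag else 0.0
        y[i] = excitation[i] + pitch_gain * past
    # Short-term synthesis: s[i] = y[i] + sum_k a[k] * s[i - 1 - k]
    s = np.zeros(n)
    for i in range(n):
        acc = y[i]
        for k, a in enumerate(lpc_coeffs):
            if i - 1 - k >= 0:
                acc += a * s[i - 1 - k]
        s[i] = acc
    return s

rng = np.random.default_rng(0)
excitation = rng.standard_normal(160)            # one subframe of "codebook" noise
speech = synthesize(excitation, pitch_lag=40, pitch_gain=0.8,
                    lpc_coeffs=[1.2, -0.5])      # assumed, stable illustrative filter
```

The pitch loop imposes a periodicity at the chosen lag, and the all-pole filter shapes the spectral envelope, mirroring the two-stage structure described above.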
When the CELP model is applied to lossy compression, the speech signal x(i) first undergoes Linear Predictive Coding (LPC) analysis to obtain the LPC residual signal res(i). After res(i) is split into frames, each subframe undergoes Long-Term Prediction (LTP) analysis to obtain the corresponding adaptive codebook and adaptive codebook gain; the adaptive codebook may be searched by various methods, such as autocorrelation. Removing the long-term dependence of res(i) yields the LTP residual signal x2(i), which is then characterized (fitted) by an algebraic codebook, completing the coding process. Finally, the adaptive codebook and the fixed codebook are coded and written into the bit stream, and joint vector quantization or scalar quantization is performed on the adaptive codebook gain and the fixed codebook gain: the best gain is selected from the gain codebook, and the index corresponding to the best gain is transmitted to the decoder. The whole coding process takes place in the Pulse Code Modulation (PCM) domain.
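The autocorrelation-based adaptive codebook search mentioned above can be sketched as a one-tap open-loop pitch search: pick the lag that maximizes the normalized correlation of the LPC residual with its own past, then compute the optimal gain at that lag. The lag bounds and the test signal are illustrative assumptions.

```python
import numpy as np

def search_adaptive_codebook(res, min_lag=20, max_lag=120):
    """Find the pitch lag maximizing the normalized correlation
    of res(i) with res(i - lag), and the corresponding gain."""
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        cur, past = res[lag:], res[:-lag]
        num, denom = np.dot(cur, past), np.dot(past, past)
        if num <= 0 or denom == 0:
            continue                          # keep only positive correlations
        score = num * num / denom             # normalized correlation criterion
        if score > best_score:
            best_lag, best_score = lag, score
    past = res[:-best_lag]
    gain = np.dot(res[best_lag:], past) / np.dot(past, past)
    return best_lag, gain

# Illustrative residual with period 40: the search should recover lag 40.
t = np.arange(400)
res = np.sin(2 * np.pi * t / 40) + 0.01 * np.random.default_rng(1).standard_normal(400)
lag, gain = search_adaptive_codebook(res)
ltp_residual = res[lag:] - gain * res[:-lag]  # x2(i): the LTP residual
```

Subtracting the gain-scaled past at the chosen lag removes the long-term dependence, leaving the nearly flat residual x2(i) that the algebraic codebook then fits.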
In the lossless compression field, a Moving Pictures Experts Group Audio Lossless Coding (MPEG ALS) apparatus likewise uses the short-term and long-term dependence of speech signals for prediction. Its prediction process is as follows: first, LPC is performed on the speech signal, and the LPC coefficients undergo entropy coding and are written into the bit stream; LTP is then performed on the LPC residual signal to obtain the LTP pitch and pitch gain, which are also written into the bit stream; the LTP yields the LTP residual signal, which undergoes entropy coding and is written into the bit stream, ending the whole coding process.
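The entropy-coding step for the integer LTP residual can be illustrated with Rice (Golomb power-of-two) coding, which is the kind of residual coder used in lossless audio schemes. This is a minimal sketch, not the actual MPEG ALS bitstream syntax; the parameter k and the sample values are assumptions (k must be at least 1 in this simplified version).

```python
def rice_encode(values, k):
    """Rice-code signed integers: zigzag-map to unsigned,
    then emit a unary quotient and a k-bit binary remainder."""
    out = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1      # zigzag: signed -> unsigned
        q, r = u >> k, u & ((1 << k) - 1)
        out.append("1" * q + "0" + format(r, f"0{k}b"))
    return "".join(out)

def rice_decode(bits, count, k):
    """Invert rice_encode: read unary quotient, k remainder bits, un-zigzag."""
    vals, i = [], 0
    for _ in range(count):
        q = 0
        while bits[i] == "1":
            q += 1
            i += 1
        i += 1                                    # skip the terminating '0'
        r = int(bits[i:i + k], 2)
        i += k
        u = (q << k) | r
        vals.append(u // 2 if u % 2 == 0 else -(u + 1) // 2)
    return vals

residual_samples = [0, -1, 3, 5, -4, 2]           # assumed LTP residual values
bitstream = rice_encode(residual_samples, k=2)
decoded = rice_decode(bitstream, len(residual_samples), k=2)
```

Because the LTP residual is small and roughly Laplacian when prediction works well, such a code spends few bits per sample; this is exactly what breaks down when LTP contributes nothing, as discussed next.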
In the prior art described above, when the speech signal is only weakly periodic, the LTP processing contributes almost nothing, yet the LTP residual signal is still written into the bit stream. Consequently, quantizing the pitch gain consumes too many bits, and the compression performance of the coder is reduced.
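The deficiency can be made concrete by measuring the LTP prediction gain (the energy ratio before and after one-tap LTP at the best lag): for a strongly periodic residual the gain is large, while for a noise-like residual it is near 0 dB, so the bits spent on the pitch gain buy essentially nothing. The lag range and test signals below are illustrative assumptions.

```python
import numpy as np

def ltp_prediction_gain(res, min_lag=20, max_lag=120):
    """Best-case energy ratio (in dB) before vs. after one-tap LTP."""
    best = 1.0
    for lag in range(min_lag, max_lag + 1):
        cur, past = res[lag:], res[:-lag]
        denom = np.dot(past, past)
        if denom == 0:
            continue
        g = np.dot(cur, past) / denom             # optimal gain at this lag
        err = cur - g * past
        e_before, e_after = np.dot(cur, cur), np.dot(err, err)
        if e_after > 0:
            best = max(best, e_before / e_after)
    return 10 * np.log10(best)

rng = np.random.default_rng(2)
t = np.arange(800)
voiced = np.sin(2 * np.pi * t / 50) + 0.05 * rng.standard_normal(800)  # periodic
noise = rng.standard_normal(800)                                       # aperiodic
```

Here `ltp_prediction_gain(voiced)` is large while `ltp_prediction_gain(noise)` is close to zero, illustrating why unconditionally coding the pitch gain and LTP residual wastes bits on weakly periodic frames.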