Code excited linear prediction (CELP) encoding technology may be understood to be a medium-to-low-rate speech compression coding technology that takes a codebook as its excitation source. It has advantages such as a low rate, high quality of synthesized speech, and strong noise immunity, so that it is widely applied as a mainstream coding technology at coding rates of 4.8-16 kb/s. FIG. 1 is a systematic block diagram of a CELP speech encoding terminal, and FIG. 2 is a systematic block diagram of a CELP speech decoding terminal. As shown in FIG. 1, an input speech signal may be preprocessed, and then a linear prediction coding (LPC) analysis may be performed on the signal to obtain spectrum parameters, which correspond to the coefficients of a synthesis filter. A fixed codebook contribution and an adaptive codebook contribution may be mixed to serve as the excitation of the synthesis filter. The synthesis filter outputs a reconstructed signal, which is consistent with the output of the synthesis filter of the decoding terminal in FIG. 2. Perceptual weighting is performed on the residual between the reconstructed signal and the preprocessed signal, and an analysis-by-synthesis search is performed to find, respectively, the adaptive codebook parameters and the fixed codebook parameters to be used for the excitation of the filter.
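As a rough illustration of the excitation path described above, the following Python sketch mixes an adaptive codebook contribution and a fixed codebook contribution and passes the result through an all-pole synthesis filter. It is a minimal sketch under stated assumptions, not the procedure of any particular codec: the function names, the gain parameters, and the sign convention A(z) = 1 + sum of a_k z^-k are all chosen here for illustration.

```python
def mix_excitation(adaptive, fixed, gain_pitch, gain_code):
    """Combine the two codebook contributions into one excitation frame.

    gain_pitch and gain_code are hypothetical names for the adaptive and
    fixed codebook gains.
    """
    return [gain_pitch * a + gain_code * c for a, c in zip(adaptive, fixed)]


def synthesis_filter(excitation, lpc, state):
    """All-pole filter 1/A(z), assuming A(z) = 1 + sum_k lpc[k-1] * z^-k.

    `state` holds the most recent past outputs, newest first, and has the
    same length as `lpc`. Returns the output frame and the updated state.
    """
    memory = list(state)
    output = []
    for e in excitation:
        y = e - sum(a * m for a, m in zip(lpc, memory))
        output.append(y)
        memory = [y] + memory[:-1]  # shift the newest output into the state
    return output, memory


# Example: first-order filter driven by a unit impulse.
frame, new_state = synthesis_filter([1.0, 0.0, 0.0], lpc=[-0.5], state=[0.0])
```

The returned state is what the text calls the status of the synthesis filter: the next frame can only be reconstructed identically at both terminals if both start from the same state.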
G.729.1 is a new-generation speech encoding/decoding standard. This embedded speech encoding/decoding standard may be characterized by layered coding that is capable of providing audio quality from narrowband to wideband in a bit rate range of 8 kb/s-32 kb/s. As such, it adapts well to the channel, because outer-layer code streams may be discarded during transmission according to the channel condition. FIG. 3 is a systematic block diagram of a G.729.1 encoder, and FIGS. 4A and 4B are a systematic block diagram of a G.729.1 decoder. Referring to FIGS. 3, 4A, and 4B, the encoding/decoding of the core layer of G.729.1 can be based on a CELP model. It can be seen from FIG. 3 that, when the encoding rate is higher than 14 kb/s, a time-domain aliasing cancellation (TDAC) coder may be activated to encode, respectively, the residual signal between the low sub-band input signal and the signal locally synthesized by the CELP encoder at a bit rate of 12 kb/s, and the high sub-band signal. It can be seen from FIGS. 4A and 4B that, when the decoding rate is higher than 14 kb/s, the decoding terminal decodes the signal components of the high sub-band and the low sub-band separately: a TDAC decoder decodes the residual signal component of the low sub-band, and this residual signal component is added to the low-band signal component reconstructed by a CELP decoder to obtain the final reconstructed low-band signal component. Because the TDAC encoding algorithm uses the reconstructed signal component of the CELP encoder at the encoding terminal, while the TDAC decoding algorithm uses the reconstructed signal component of the CELP decoder at the decoding terminal, the correctness of the TDAC encoding/decoding algorithm depends on the synchronization between the reconstructed signal of the CELP encoding terminal and the reconstructed signal of the CELP decoding terminal.
In order to ensure the synchronization between the reconstructed signals of the encoding and decoding terminals, the synchronization between the status of the CELP encoder and the status of the CELP decoder should be ensured.
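The dependence of the low-band reconstruction on this synchronization can be sketched as follows. This is a simplified illustration only: the TDAC transform and quantization are ignored, so the residual is carried losslessly, and the function names are hypothetical.

```python
def encode_low_band_residual(low_band, celp_encoder_synthesis):
    """Residual that the encoding terminal hands to the TDAC coder."""
    return [s - c for s, c in zip(low_band, celp_encoder_synthesis)]


def decode_low_band(residual, celp_decoder_synthesis):
    """Decoding terminal: add the decoded residual to the CELP reconstruction."""
    return [d + c for d, c in zip(residual, celp_decoder_synthesis)]


low_band = [1.0, 2.0, 3.0]
celp_at_encoder = [0.5, 1.5, 2.5]
residual = encode_low_band_residual(low_band, celp_at_encoder)

# Synchronized terminals: the decoder's CELP output equals the encoder's.
in_sync = decode_low_band(residual, celp_at_encoder)

# Desynchronized terminals: the decoder's CELP output has drifted.
celp_at_decoder = [0.75, 1.75, 2.75]
out_of_sync = decode_low_band(residual, celp_at_decoder)
```

In this simplified model, any difference between the two CELP reconstructions appears sample for sample as an error in the final low-band signal, since the residual was computed against the encoder's reconstruction.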
FIG. 5 is a schematic structural view of a CELP encoder in G.729.1 in the prior art, and FIG. 6 is a schematic structural view of a CELP decoder in G.729.1 in the prior art. Referring to FIG. 5, the CELP model used for the narrowband portion in G.729.1 can support two rates, namely, 8 kb/s and 12 kb/s, and the synthesis filter for reconstructing the narrowband signal component in the encoding terminal maintains two statuses, one for each of the two rates. In the encoding terminal, if the current encoding rate is 8 kb/s, a core-layer excitation signal calculated by a core-layer G.729 encoder is used to excite the synthesis filter corresponding to 8 kb/s, and the status of that synthesis filter is updated. If the current encoding rate is equal to or higher than 12 kb/s, an enhancement-layer excitation signal is used to excite the synthesis filter corresponding to 12 kb/s, and the status of that synthesis filter is updated. Referring to FIG. 6, the decoding terminal uses one synthesis filter, calculates the corresponding excitation according to the code stream actually received, performs synthesis filtering, and updates the status of the filter. The synthesis filters for the two encoding rates at the encoding terminal and the synthesis filter at the decoding terminal use the same quantized LPC filter coefficients.
As for the two encoding rates, namely, 8 kb/s and 12 kb/s, the encoding terminal adopts two independent excitation synthesis modules to generate the corresponding excitations, performs synthesis filtering on the corresponding synthesis filters, and updates the statuses of those synthesis filters. The decoding terminal adopts one synthesis filter, calculates the excitation signal according to the received parameters, performs synthesis filtering, and updates the status of the synthesis filter. If the encoding rate is not switched between 8 kb/s and 12 kb/s, the reconstructed signals of the encoding and decoding terminals are fully synchronous. However, if switching between the two encoding rates occurs, the synthesis filter selected at the encoding terminal after the switch holds a status that differs from the status of the single synthesis filter at the decoding terminal, so the synchronization between the reconstructed signals of the encoding and decoding terminals cannot be ensured. This affects the correctness of the encoding/decoding algorithm and eventually degrades the quality of the reconstructed signal at the decoding terminal.
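The loss of synchronization on a rate switch can be reproduced with a toy model of the arrangement described above: an encoding terminal that keeps one filter status per rate and a decoding terminal that keeps a single status. The sketch below is illustrative only. It assumes a first-order synthesis filter, feeds the same excitation at both rates for simplicity, and uses hypothetical class and function names.

```python
def synthesize(excitation, lpc, state):
    """All-pole synthesis, assuming A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    memory = list(state)
    output = []
    for e in excitation:
        y = e - sum(a * m for a, m in zip(lpc, memory))
        output.append(y)
        memory = [y] + memory[:-1]
    return output, memory


class ToyEncoder:
    """Keeps a separate filter status for 8 kb/s and for 12 kb/s."""

    def __init__(self, order):
        self.state = {8: [0.0] * order, 12: [0.0] * order}

    def local_synthesis(self, rate, excitation, lpc):
        # Only the filter for the current rate is excited and updated.
        output, self.state[rate] = synthesize(excitation, lpc, self.state[rate])
        return output


class ToyDecoder:
    """Keeps a single filter status regardless of the received rate."""

    def __init__(self, order):
        self.state = [0.0] * order

    def decode(self, excitation, lpc):
        output, self.state = synthesize(excitation, lpc, self.state)
        return output


def frame_mismatch(rates, lpc=(-0.9,), excitation=(1.0, 0.0)):
    """Largest encoder/decoder output difference in each frame."""
    encoder, decoder = ToyEncoder(len(lpc)), ToyDecoder(len(lpc))
    result = []
    for rate in rates:
        enc = encoder.local_synthesis(rate, excitation, lpc)
        dec = decoder.decode(excitation, lpc)
        result.append(max(abs(a - b) for a, b in zip(enc, dec)))
    return result


steady = frame_mismatch([8, 8, 8])     # no switch: every frame matches
switched = frame_mismatch([8, 8, 12])  # switch at the third frame
```

In this toy model the mismatch is zero for every frame while the rate stays at 8 kb/s, and becomes nonzero in the first frame after the switch to 12 kb/s, because the encoder's 12 kb/s filter still holds its initial status while the decoder's single filter has been updated by the two preceding frames.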