With the continuous progress of communications technologies, users are imposing increasingly high requirements on voice quality. Generally, voice quality is improved by increasing the bandwidth of the voice signal. If a signal with a wider bandwidth is encoded in a traditional encoding manner, the bit rate increases greatly, and as a result the encoding is difficult to implement because of the limits of current network bandwidth. Therefore, a signal with a wider bandwidth needs to be encoded while the bit rate remains unchanged or changes only slightly, and a solution proposed for this issue is to use a bandwidth extension technology. Bandwidth extension may be performed in a time domain or a frequency domain; in the present invention, bandwidth extension is performed in the time domain.
A basic principle of performing bandwidth extension in a time domain is that two different processing methods are used for a low band signal and a high band signal. For the low band signal in an original signal, encoding is performed at an encoder side using any of various encoders according to a requirement; at a decoder side, a decoder corresponding to the encoder of the encoder side is used to decode and restore the low band signal. For the high band signal, at the encoder side, a low frequency encoding parameter obtained by the encoder used for the low band signal is used to predict a high band excitation signal; an analysis, for example a linear predictive coding (LPC) analysis, is performed on the high band signal of the original signal to obtain a high frequency LPC coefficient. The high band excitation signal is filtered using a synthesis filter determined according to the LPC coefficient so as to obtain a predicted high band signal; the predicted high band signal is compared with the high band signal in the original signal so as to obtain a high frequency gain adjustment parameter; the high frequency gain adjustment parameter and the LPC coefficient are transferred to the decoder side to restore the high band signal. At the decoder side, the low frequency encoding parameter extracted during decoding of the low band signal is used to restore the high band excitation signal; the LPC coefficient is used to generate the synthesis filter; the high band excitation signal is filtered using the synthesis filter so as to restore the predicted high band signal; the predicted high band signal is adjusted using the high frequency gain adjustment parameter so as to obtain a final high band signal; and the final high band signal and the low band signal are combined to obtain a final output signal.
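For illustration only, the high band path described above may be sketched as follows. The sketch is not the implementation of any particular codec and rests on several assumptions: the function names, the LPC order of 4, the frame length of 160 samples, and the use of a single energy-matching gain per frame are all hypothetical. In particular, the high band excitation signal is replaced here by a pseudo-random stand-in, whereas in the foregoing technology it is predicted from the low frequency encoding parameter.

```python
import math
import random


def autocorrelation(x, order):
    """Autocorrelation r[0..order] of one frame (autocorrelation method)."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for LPC coefficients a[0..order], a[0] = 1, A(z) = 1 + sum a[k] z^-k."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate frame: stop refining the predictor
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a


def synthesis_filter(excitation, a):
    """All-pole filter 1/A(z): y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def energy(x):
    return sum(v * v for v in x)


def encode_high_band(high_band, excitation, order=4):
    """Encoder side: LPC analysis, prediction, and gain adjustment parameter."""
    lpc = levinson_durbin(autocorrelation(high_band, order), order)
    predicted = synthesis_filter(excitation, lpc)
    # Assumed gain rule: match the frame energy of the original high band.
    gain = math.sqrt(energy(high_band) / max(energy(predicted), 1e-12))
    return lpc, gain


def decode_high_band(excitation, lpc, gain):
    """Decoder side: rebuild the predicted high band and apply the gain."""
    return [gain * v for v in synthesis_filter(excitation, lpc)]


# Toy frame: two sinusoids plus a little noise stand in for the high band.
noise = random.Random(0)
high_band = [math.sin(0.9 * n) + 0.3 * math.sin(2.3 * n + 1.0)
             + 0.05 * noise.gauss(0.0, 1.0) for n in range(160)]

# Stand-in excitation; both sides are assumed to derive the same signal.
source = random.Random(1)
excitation = [source.uniform(-1.0, 1.0) for _ in range(160)]

lpc, gain = encode_high_band(high_band, excitation)
restored = decode_high_band(excitation, lpc, gain)
```

In this sketch only `lpc` and `gain` would be transferred to the decoder side, which is why the bit rate grows only slightly. The restored frame matches the original high band in spectral envelope and energy but not in waveform.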
In the foregoing technology of performing bandwidth extension in a time domain, a high band signal is restored at a specific bit rate; however, the performance is deficient. It can be learned by comparing a frequency spectrum of a restored output signal with a frequency spectrum of an original signal that, for a voiced sound with an ordinary pitch period, there is always an extremely strong harmonic component in the restored high band signal. However, a high band signal in an authentic voice signal does not have such an extremely strong harmonic characteristic. Therefore, this difference produces an obvious mechanical sound when the restored signal is played back.
An objective of embodiments of the present invention is to improve the foregoing technology of performing bandwidth extension in the time domain, so as to reduce or even remove the mechanical sound in the restored signal.