In the field of digital communications, there are extremely widespread application requirements for voice, picture, audio, and video transmission, such as phone calls, audio and video conferences, broadcast television, and multimedia entertainment. To reduce the resources occupied when storing or transmitting an audio or video signal, audio and video compression and encoding technologies have been developed. Many different technical branches have emerged in the development of these technologies. One of them, in which a signal is transformed from a time domain to a frequency domain and then encoded, is widely applied because of its good compression characteristic, and is also referred to as a domain transformation encoding technology.
An increasing emphasis is placed on audio quality in communication transmission. Therefore, there is a need to improve the quality of a music signal as much as possible on the premise that voice quality is ensured. Meanwhile, an audio signal carries an extremely rich amount of information, so the code excited linear prediction (CELP) encoding mode conventionally used for voice cannot be adopted. Instead, to process the audio signal, a time domain signal is generally transformed into a frequency domain signal using an audio encoding technology based on domain transformation encoding, thereby enhancing the encoding quality of the audio signal.
In an existing audio encoding technology, a high frequency band signal in an audio signal is generally transformed from a time domain signal to a frequency domain signal using a transformation technology such as fast Fourier transform (FFT), modified discrete cosine transform (MDCT), or discrete cosine transform (DCT), and the frequency domain signal is then encoded.
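As an illustration of the time domain to frequency domain step, the following minimal sketch computes an unnormalized DCT-II of one frame. It assumes a direct O(N^2) evaluation and a hypothetical function name `dct_ii`; a practical encoder would typically use a fast, windowed transform such as the MDCT.

```python
import math


def dct_ii(frame):
    """Return the unnormalized DCT-II coefficients of one time domain frame.

    Illustrative only: direct O(N^2) evaluation, no windowing or overlap.
    """
    n = len(frame)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(frame))
        for k in range(n)
    ]


if __name__ == "__main__":
    # A constant frame has no variation over time, so (up to rounding)
    # only the lowest frequency bin is non-zero.
    print(dct_ii([1.0] * 8))
```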
In the case of a low bit rate, the limited quantization bits are not sufficient to quantize all to-be-quantized audio signals. Therefore, an encoding device uses most bits to finely quantize the relatively important low frequency band signals in the audio signals, that is, quantization parameters of the low frequency band signals occupy most bits, and uses only a few bits to coarsely quantize and encode the high frequency band signals in the audio signals to obtain frequency envelopes of the high frequency band signals. Then, the frequency envelopes of the high frequency band signals and the quantization parameters of the low frequency band signals are sent to a decoding device in the form of a bitstream. The quantization parameters of the low frequency band signals may include excitation signals and frequency envelopes. When being quantized, the low frequency band signals may also first be transformed from time domain signals to frequency domain signals, and the frequency domain signals are then quantized and encoded into excitation signals.
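The coarse description of the high frequency band can be sketched as follows. This is a minimal example that assumes the frequency envelope of a subband is its root-mean-square (RMS) amplitude and that subbands have a uniform width; the text above does not fix either choice, and the name `band_envelopes` is hypothetical.

```python
import math


def band_envelopes(coeffs, band_width):
    """Return one envelope value (RMS amplitude, by assumption) per subband."""
    envelopes = []
    for start in range(0, len(coeffs), band_width):
        band = coeffs[start:start + band_width]
        envelopes.append(math.sqrt(sum(c * c for c in band) / len(band)))
    return envelopes


if __name__ == "__main__":
    high_band = [3.0, -3.0, 0.0, 0.0, 1.0, 1.0]
    # Six coefficients are summarized by three envelope values.
    print(band_envelopes(high_band, 2))  # prints [3.0, 0.0, 1.0]
```

Sending only one value per subband, instead of every coefficient, is what allows the high frequency band to be encoded with only a few bits.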
Generally, the decoding device may restore the low frequency band signals according to the quantization parameters of the low frequency band signals carried in the received bitstream, and then acquire the excitation signals of the low frequency band signals from the restored low frequency band signals. The decoding device may next predict excitation signals of the high frequency band signals from the excitation signals of the low frequency band signals using a bandwidth extension (BWE) technology and a spectrum filling technology, and modify the predicted excitation signals according to the frequency envelopes of the high frequency band signals carried in the bitstream, to obtain predicted high frequency band signals. The high frequency band signals obtained in this way are frequency domain signals.
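The modification of a predicted excitation signal according to a decoded frequency envelope can be sketched as below. The sketch assumes that the envelope of each subband is an RMS amplitude and that modifying means rescaling each subband of the predicted excitation so that its RMS matches the decoded value; the function name `apply_envelope` and the uniform subband width are illustrative assumptions.

```python
import math


def apply_envelope(excitation, envelopes, band_width):
    """Rescale each subband of a predicted excitation to its target envelope."""
    shaped = []
    for b, target in enumerate(envelopes):
        band = excitation[b * band_width:(b + 1) * band_width]
        rms = math.sqrt(sum(c * c for c in band) / len(band)) if band else 0.0
        # A silent subband cannot be rescaled, so it is left at zero.
        gain = target / rms if rms > 0.0 else 0.0
        shaped.extend(c * gain for c in band)
    return shaped


if __name__ == "__main__":
    predicted = [1.0, -1.0, 2.0, 2.0]
    decoded_envelopes = [3.0, 0.5]
    print(apply_envelope(predicted, decoded_envelopes, 2))  # prints [3.0, -3.0, 0.5, 0.5]
```

The fine structure of each subband comes from the predicted excitation, while its overall level comes from the transmitted envelope.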
In the BWE technology, the highest frequency bin to which a bit is allocated may be the highest frequency bin for which an excitation signal is decoded; that is, no excitation signal is decoded for any frequency bin above it. A frequency band above the highest frequency bin to which a bit is allocated may be referred to as a high frequency band, and a frequency band below it may be referred to as a low frequency band. An excitation signal of a high frequency band signal may be predicted from an excitation signal of a low frequency band signal as follows: the highest frequency bin to which a bit is allocated is taken as a center; the excitation signal of a low frequency band signal below that bin is copied into a high frequency band signal that lies above that bin and whose bandwidth is equal to the bandwidth of the low frequency band signal; and the copied excitation signal is used as the excitation signal of the high frequency band signal.
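The copy-based prediction described above can be sketched as follows. The sketch assumes that the low band excitation is translated upward without being mirrored, and that `highest_bin` is an exclusive boundary (the index just past the last bin that received bits); the text does not state either detail, and the function name is hypothetical.

```python
def predict_high_band_excitation(low_excitation, highest_bin, bandwidth):
    """Predict a high band excitation by copying the adjacent low band.

    Bins [highest_bin - bandwidth, highest_bin) of the decoded excitation are
    reused as the excitation for bins [highest_bin, highest_bin + bandwidth).
    """
    start = highest_bin - bandwidth
    if start < 0 or highest_bin > len(low_excitation):
        raise ValueError("source band lies outside the decoded excitation")
    return list(low_excitation[start:highest_bin])


if __name__ == "__main__":
    # Bits were allocated only to bins 0..4; bins 5..7 carry no excitation.
    decoded = [1.0, 2.0, 3.0, 4.0, 5.0, 0.0, 0.0, 0.0]
    print(predict_high_band_excitation(decoded, 5, 3))  # prints [3.0, 4.0, 5.0]
```

Because the source band is chosen only by its position relative to the highest bin with allocated bits, the copied excitation is not guaranteed to resemble the true high band content.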
The foregoing approaches have the following disadvantage: when they are used to predict a high frequency band signal, the quality of the predicted high frequency band signal is relatively poor, which reduces the auditory quality of the audio signal.