In many broadband or ultra-broadband audio codecs, the spectra in the broadband or ultra-broadband portion are encoded in one of two ways, depending on the code rate. When the code rate is low, Bandwidth Extension (BWE) parameter encoding is used; it requires only a few bits, ensures the bandwidth, and provides acceptable quality. When the code rate is high, quantization encoding is performed on the spectra; it requires many bits, achieves high precision, and provides good quality.
For structure diagrams of an audio encoding/decoding system supporting broadband or ultra-broadband in the prior art, reference may be made to FIG. 1 and FIG. 2. FIG. 1 is a structure diagram of an audio encoding system supporting broadband or ultra-broadband in the prior art. As shown in FIG. 1, the encoding system adopts a layered structure. A core encoder encodes low-frequency information, so as to output a first layer code stream. A BWE encoder encodes a high-frequency band spectrum by using a few bits, so as to output a second layer code stream. A quantization encoder quantizes and encodes the high-frequency band spectrum by using remaining bits, so as to output a third layer code stream.
FIG. 2 is a structure diagram of an audio decoding system supporting broadband or ultra-broadband in the prior art. As shown in FIG. 2, the decoding system also adopts a layered structure. A core decoder is configured to decode the low-frequency information of the first layer code stream. A BWE decoder is configured to decode BWE information of the second layer code stream. A dequantization decoder is configured to decode and dequantize the high-frequency band information of the third layer code stream, which was encoded by using the remaining bits. Finally, the decoding system synthesizes the frequency bands of the three layers of code streams to output a band-synthesized audio signal. Generally, the signal output by the core decoder is a time-domain signal, whereas the signals output by the BWE decoder and the dequantization decoder are frequency-domain signals. Therefore, the frequency-domain signals of the second and third layer code streams are converted into time-domain signals when the frequency bands are synthesized, so as to output a band-synthesized time-domain audio signal.
In the process of decoding, for a high-frequency band spectral signal, when the code rate is low, the decoding system can only decode the second layer code stream, so as to obtain BWE-encoded information, thereby ensuring basic high-frequency band quality; and when the code rate is high, the decoding system can further decode the third layer code stream to obtain better high-frequency band quality.
In this layered structure, the bits reserved in the third layer code stream for spectral quantization encoding are in many cases insufficient, so the quantizer performs bit allocation. The quantizer allocates many bits to some important frequency bands to perform high-precision quantization, allocates a few bits to some less important frequency bands to perform low-precision quantization, and allocates no bits at all to some of the least important frequency bands. That is, the quantizer does not quantize the least important frequency bands.
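The bit allocation behavior described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the allocation rule of any particular codec: the function name allocate_bits, the use of band energy as the importance measure, and the rule that each allocated bit reduces a band's remaining need by a factor of four in energy (about 6 dB) are hypothetical choices made for this example.

```python
import math


def allocate_bits(band_energies, total_bits):
    """Greedily hand out bits, one at a time, to the band with the largest need.

    Assumption for illustration: importance is the log2 of the band energy,
    and every bit given to a band lowers its remaining need by 2 (a factor
    of four in energy, roughly 6 dB of quantization noise).
    """
    bits = [0] * len(band_energies)
    need = [math.log2(e) if e > 0 else float("-inf") for e in band_energies]

    for _ in range(total_bits):
        # Pick the band that currently needs bits the most (first one on ties).
        k = max(range(len(need)), key=lambda i: need[i])
        if need[k] == float("-inf"):
            break  # every band is silent; nothing worth quantizing
        bits[k] += 1
        need[k] -= 2.0

    return bits


def unquantized_bands(bits):
    """Indices of the bands that received no bits and are therefore not quantized."""
    return [i for i, b in enumerate(bits) if b == 0]
```

For example, with band energies 1024, 256, 16 and 1 and a budget of 6 bits, this sketch returns the allocation [4, 2, 0, 0], so the two weakest frequency bands are left unquantized.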
In the prior art, the spectra of the unquantized frequency bands are processed in one of several ways: 1. The BWE spectrum is retained; 2. A part of the spectra obtained through dequantization is copied, its energy is adjusted, and the unquantized frequency bands are then filled with it; and 3. The unquantized spectra are set to 0 or are directly filled with noise.
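The prior-art processing methods listed above can be sketched as follows. This is an illustrative sketch only: the function names, the representation of each frequency band as a list of coefficients of equal width, the choice of the nearest quantized band as the copy source, and the use of the BWE band energy as the target energy for the copied or noise-filled spectra are all assumptions made for the example and are not taken from any specific codec.

```python
import math
import random


def energy(band):
    """Sum of squared spectral coefficients of one band."""
    return sum(c * c for c in band)


def scale_to_energy(band, target):
    """Rescale a band so that its energy equals target (zeros stay zeros)."""
    e = energy(band)
    if e == 0.0:
        return [0.0] * len(band)
    gain = math.sqrt(target / e)
    return [gain * c for c in band]


def fill_unquantized(dequantized, bwe, bits, method, rng=None):
    """Fill every band that received no bits, using one prior-art method.

    method: "bwe"   keep the BWE spectrum
            "copy"  copy the nearest quantized band and adjust its energy
            "zero"  set the band to 0
            "noise" fill the band with energy-adjusted random noise
    """
    rng = rng or random.Random(0)
    quantized = [i for i, b in enumerate(bits) if b > 0]
    out = []
    for i, band in enumerate(dequantized):
        if bits[i] > 0:
            out.append(list(band))  # quantized bands are kept as decoded
            continue
        target = energy(bwe[i])  # assumed energy reference for this band
        if method == "bwe":
            out.append(list(bwe[i]))
        elif method == "copy" and quantized:
            src = min(quantized, key=lambda q: abs(q - i))
            out.append(scale_to_energy(dequantized[src][:len(band)], target))
        elif method in ("zero", "copy"):
            out.append([0.0] * len(band))  # "copy" falls back to 0 if no source
        elif method == "noise":
            noise = [rng.uniform(-1.0, 1.0) for _ in band]
            out.append(scale_to_energy(noise, target))
        else:
            raise ValueError("unknown method: " + method)
    return out
```

In this sketch, the quantized bands always come from the dequantized spectrum, while the filled bands come from a different source, which is where a mismatch between neighboring bands can arise.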
During implementation of the present invention, the inventors found that the prior art causes obvious noise and a poor acoustic effect for one or more of the following reasons.
1. If the BWE spectra are retained in the unquantized frequency bands, the quantized spectra and the retained BWE spectra are mismatched in position information and/or energy information, thereby introducing noise. 2. If many spectra are unquantized and are set to 0 or filled with noise, noise is introduced directly into the spectra of the unquantized frequency bands. In either case, noise is introduced during frequency band synthesis after decoding, because of the mismatch or because of the zero setting and noise filling, thereby deteriorating the acoustic effect of the audio signal.