A typical encoder/decoder system based on transform coding is illustrated in FIG. 1.
Major steps in transform coding are:
A. Transform a short audio frame (20-40 milliseconds) to the frequency domain, e.g., through the Modified Discrete Cosine Transform (MDCT).
B. Split the MDCT vector X(k) into multiple bands (sub-vectors SV1, SV2, . . . ), as illustrated in FIG. 2. Typically, the width of the bands increases towards higher frequencies [1].
C. Calculate the energy in each band. This gives an approximation of the spectrum envelope, as illustrated in FIG. 3.
D. The spectrum envelope is quantized, and the quantization indices are transmitted to the decoder.
E. A residual vector is obtained by scaling the MDCT vector with the envelope gains; for example, the residual vector is formed by the MDCT sub-vectors (SV1, SV2, . . . ) scaled to unit Root-Mean-Square (RMS) energy.
F. Bits for quantization of different residual sub-vectors are assigned based on envelope energies. Due to a limited bit budget, some of the sub-vectors are not assigned any bits. This is illustrated in FIG. 4, where sub-vectors corresponding to envelope gains below a threshold TH are not assigned any bits.
G. Residual sub-vectors are quantized according to the assigned bits, and quantization indices are transmitted to the decoder. Residual quantization can, for example, be performed with the Factorial Pulse Coding (FPC) scheme [2].
H. Residual sub-vectors with zero bits assigned are not coded, but are instead noise-filled at the decoder. This is achieved by creating a Virtual Codebook (VC) from the coded sub-vectors by concatenating the perceptually relevant coefficients of the decoded spectrum. The VC is then used to generate content in the non-coded residual sub-vectors.
I. At the decoder, the MDCT vector is reconstructed by up-scaling residual sub-vectors with corresponding envelope gains, and the inverse MDCT is used to reconstruct the time-domain audio frame.
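The envelope and noise-fill steps above (B, C, E, F, H and I) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheme of [1] or [2]: all function names are hypothetical, the envelope and residual are left unquantized (no FPC), a simple gain threshold stands in for the bit allocation of step F, the VC is built from all coded coefficients rather than only the perceptually relevant ones, and the VC is read cyclically, which the text above does not specify.

```python
import math

def split_bands(x, widths):
    """Step B: split the coefficient vector into consecutive sub-vectors."""
    bands, start = [], 0
    for w in widths:
        bands.append(x[start:start + w])
        start += w
    return bands

def band_gain(sv):
    """Step C: RMS of one band, used here as the envelope gain."""
    return math.sqrt(sum(c * c for c in sv) / len(sv))

def encode(x, widths, threshold):
    """Steps B to F, with no envelope or residual quantization (assumption)."""
    bands = split_bands(x, widths)
    gains = [band_gain(sv) for sv in bands]
    # Step E: scale each band to unit RMS (all-zero bands are left unchanged).
    residual = [[c / g for c in sv] if g > 0 else list(sv)
                for sv, g in zip(bands, gains)]
    # Step F, simplified: a band is coded only if its gain reaches the threshold.
    coded = [g >= threshold for g in gains]
    return gains, residual, coded

def decode(gains, residual, coded):
    """Steps H and I: fill non-coded bands from a virtual codebook, then rescale."""
    # Virtual codebook: concatenation of the coded residual coefficients.
    vc = [c for sv, is_coded in zip(residual, coded) if is_coded for c in sv]
    out, pos = [], 0
    for sv, g, is_coded in zip(residual, gains, coded):
        if is_coded:
            filled = sv
        elif vc:
            # Non-coded band: only its length is used; content comes from the VC.
            filled = [vc[(pos + i) % len(vc)] for i in range(len(sv))]
            pos += len(sv)
        else:
            filled = [0.0] * len(sv)
        # Step I: up-scale with the corresponding envelope gain.
        out.extend(g * c for c in filled)
    return out

# Toy frame: two strong bands and one weak band that falls below the threshold.
x = [4.0, -4.0, 3.0, -3.0, 0.1, -0.1, 0.1, -0.1]
widths = [2, 2, 4]
gains, residual, coded = encode(x, widths, threshold=1.0)
y = decode(gains, residual, coded)
print(coded)  # [True, True, False]
```

In this toy frame the third band receives no bits, so its decoded content is copied from the VC and scaled by its own envelope gain. The filled band therefore has the right energy, but its fine structure comes from other bands, which is where the quality of the noise-fill scheme is decided.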
A drawback of the conventional noise-fill scheme of step H, e.g., as in [1], is that it creates audible distortion in the reconstructed audio signal when used with the FPC scheme.