The present invention is concerned with linear prediction based audio coding and, in particular, linear prediction based audio coding using spectrum coding.
The classical approach for quantization and coding in the frequency domain is to take (overlapping) windows of the signal, perform a time-frequency transform, apply a perceptual model, quantize the individual frequencies and code the quantized values with an entropy coder, such as an arithmetic coder [1]. The perceptual model is basically a weighting function which is multiplied onto the spectral lines such that errors in each weighted spectral line have an equal perceptual impact. All weighted lines can thus be quantized with the same accuracy, and the overall accuracy determines the compromise between perceptual quality and bit-consumption.
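By way of illustration only, the weighting and uniform quantization described above may be sketched as follows. The function names, the example weights and the step size are hypothetical and are not taken from any codec; the sketch merely assumes one multiplicative weight per spectral line and a single common quantizer step.

```python
def quantize_weighted(spectrum, weights, step):
    """Weight each spectral line, then quantize all lines with one common step."""
    return [round(x * w / step) for x, w in zip(spectrum, weights)]


def dequantize_weighted(indices, weights, step):
    """Undo the quantization and remove the perceptual weighting again."""
    return [q * step / w for q, w in zip(indices, weights)]


# Hypothetical example values: four spectral lines and their perceptual weights.
spectrum = [10.3, -4.2, 0.55, 0.2]
weights = [0.5, 1.0, 4.0, 8.0]
step = 1.0  # the overall accuracy; a larger step spends fewer bits

indices = quantize_weighted(spectrum, weights, step)
decoded = dequantize_weighted(indices, weights, step)
```

In the weighted domain the quantization error of every line is bounded by half the step size, so a single parameter, here `step`, trades perceptual quality against bit-consumption.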
In AAC and the frequency domain mode of USAC (non-TCX), the perceptual model was defined band-wise such that a group of spectral lines (the spectral band) would have the same weight. These weights are known as scale factors, since they define by what factor the band is scaled. Further, the scale factors were differentially encoded.
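A minimal sketch of band-wise weights with differential encoding is given below. It is a simplification under stated assumptions: the first value is transmitted as-is, the differences are not entropy coded, and the names `diff_encode`, `diff_decode` and `expand_band_weights` as well as the band layout are hypothetical rather than taken from AAC or USAC.

```python
def diff_encode(scale_factors):
    """Replace each scale factor (except the first) by its difference to the previous one."""
    if not scale_factors:
        return []
    deltas = [scale_factors[0]]
    for prev, cur in zip(scale_factors, scale_factors[1:]):
        deltas.append(cur - prev)
    return deltas


def diff_decode(deltas):
    """Recover the scale factors by accumulating the differences."""
    out = []
    acc = 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out


def expand_band_weights(band_weights, band_offsets):
    """Repeat each band weight for every spectral line in its band.

    band_offsets holds len(band_weights) + 1 line indices marking the band borders.
    """
    per_line = []
    for b, w in enumerate(band_weights):
        per_line.extend([w] * (band_offsets[b + 1] - band_offsets[b]))
    return per_line


# Hypothetical scale factors of five neighbouring bands.
scale_factors = [60, 62, 61, 61, 58]
deltas = diff_encode(scale_factors)
```

Because neighbouring bands tend to have similar scale factors, the differences cluster around zero and are cheaper to entropy code than the absolute values.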
In the TCX domain, the weights are not encoded using scale factors, but by an LPC model [2] which defines the spectral envelope, that is, the overall shape of the spectrum. The LPC is used because it allows smooth switching between TCX and ACELP. However, the LPC does not correspond well to the perceptual model, which should be much smoother. A process known as weighting is therefore applied to the LPC such that the weighted LPC approximately corresponds to the desired perceptual model.
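The smoothing effect of such a weighting may be illustrated by the following sketch. It assumes one commonly used form of weighting, namely replacing the LPC polynomial A(z) by A(z/γ), that is, scaling the k-th coefficient by γ^k; the text above does not state which weighting is used, and the value γ = 0.92 as well as the second-order example filter are purely illustrative.

```python
import cmath
import math


def weight_lpc(coeffs, gamma):
    """Scale the k-th LPC coefficient by gamma**k, i.e. evaluate A(z / gamma)."""
    return [a * gamma ** k for k, a in enumerate(coeffs)]


def envelope(coeffs, num_points=256):
    """Sample the spectral envelope 1 / |A(e^{jw})| on a grid from 0 to pi."""
    env = []
    for n in range(num_points):
        w = math.pi * n / (num_points - 1)
        a = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(coeffs))
        env.append(1.0 / abs(a))
    return env


def dynamic_range(env):
    """Ratio between the largest and the smallest envelope value."""
    return max(env) / min(env)


# Hypothetical second-order LPC polynomial with one pronounced resonance.
lpc = [1.0, -1.6, 0.9]
weighted = weight_lpc(lpc, 0.92)  # 0.92 is an illustrative choice of gamma
```

With γ smaller than one, the poles of 1/A(z) move towards the origin, so the envelope of the weighted model has a smaller dynamic range than that of the original model; this can be checked by comparing `dynamic_range(envelope(weighted))` with `dynamic_range(envelope(lpc))`.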
In the TCX domain of USAC, spectral lines are encoded by an arithmetic coder. An arithmetic coder is based on assigning probabilities to all possible configurations of the signal, such that high-probability values can be encoded with a small number of bits, whereby bit-consumption is minimized. To estimate the probability distribution of spectral lines, the codec employs a probability model that predicts the signal distribution based on prior, already coded lines in the time-frequency space. The prior lines are known as the context of the current line to encode [3].
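The benefit of a context may be illustrated with the following sketch, which does not implement an arithmetic coder but only accumulates the ideal code length, -log2(p), that an arithmetic coder would approach. The adaptive counting model, the class `ContextModel` and the two context functions are hypothetical simplifications; in particular, the context here is merely the previously coded symbol, not the time-frequency neighbourhood used in USAC.

```python
import math
from collections import defaultdict


class ContextModel:
    """Adaptive symbol counts, kept separately for every context."""

    def __init__(self, alphabet_size):
        # Every symbol starts with a count of one so no probability is zero.
        self.counts = defaultdict(lambda: [1] * alphabet_size)

    def probability(self, context, symbol):
        c = self.counts[context]
        return c[symbol] / sum(c)

    def update(self, context, symbol):
        self.counts[context][symbol] += 1


def ideal_code_length(symbols, alphabet_size, context_of):
    """Sum of -log2(p) over all symbols, with p taken from the adaptive model."""
    model = ContextModel(alphabet_size)
    bits = 0.0
    for i, s in enumerate(symbols):
        ctx = context_of(symbols, i)
        bits += -math.log2(model.probability(ctx, s))
        model.update(ctx, s)  # the decoder can mirror this update
    return bits


def previous_symbol_context(symbols, i):
    """Context given by the symbol coded immediately before (0 at the start)."""
    return symbols[i - 1] if i > 0 else 0


def no_context(symbols, i):
    """Single shared context, i.e. no use of prior symbols."""
    return 0


# Hypothetical, strongly structured symbol sequence.
symbols = [0, 1] * 50
bits_with_context = ideal_code_length(symbols, 2, previous_symbol_context)
bits_without_context = ideal_code_length(symbols, 2, no_context)
```

For this sequence the context makes every symbol almost certain once the model has adapted, so `bits_with_context` is far smaller than `bits_without_context`, which stays near one bit per symbol.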
Recently, NTT proposed a method for improving the context of the arithmetic coder (compare [4]). It is based on using the long-term prediction (LTP) to determine the approximate positions of harmonic lines (comb filter) and on rearranging the spectral lines such that magnitude prediction from the context is more efficient.
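The general idea of such a rearrangement may be sketched as follows. This is not the method of [4], whose details are not reproduced here, but an illustration under stated assumptions: the LTP lag is given in samples, the transform yields `num_bins` lines covering the range from zero to half the sampling rate, so that harmonics are expected roughly every 2 · num_bins / lag lines, and the lines nearest to the expected harmonics are simply moved to the front. All function names are hypothetical.

```python
def harmonic_bins(num_bins, pitch_lag):
    """Approximate line indices of the harmonics implied by an LTP lag in samples."""
    spacing = 2.0 * num_bins / pitch_lag  # assumed harmonic distance in lines
    bins = []
    k = 1
    while True:
        b = int(round(k * spacing))
        if b >= num_bins:
            break
        if not bins or b != bins[-1]:  # skip duplicates caused by rounding
            bins.append(b)
        k += 1
    return bins


def harmonic_permutation(num_bins, pitch_lag):
    """Order of lines: expected harmonic lines first, all remaining lines after."""
    peaks = harmonic_bins(num_bins, pitch_lag)
    peak_set = set(peaks)
    rest = [i for i in range(num_bins) if i not in peak_set]
    return peaks + rest


def rearrange(spectrum, perm):
    """Apply the permutation before context-based coding."""
    return [spectrum[i] for i in perm]


def restore(rearranged, perm):
    """Invert the permutation at the decoder, which knows the same lag."""
    out = [0] * len(perm)
    for pos, i in enumerate(perm):
        out[i] = rearranged[pos]
    return out


# Hypothetical frame: 32 lines and an LTP lag of 16 samples.
perm = harmonic_permutation(32, 16)
```

Since the permutation depends only on the lag, which is available at the decoder as well, no additional side information is needed in this sketch to undo the rearrangement.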
Generally speaking, the better the estimation of the probability distribution, the more efficient the compression achieved by the entropy coding. It would be favorable to have a concept at hand which would enable a probability distribution estimation of similar quality to that obtainable using any of the above-outlined techniques, but at reduced complexity.