The present application is concerned with context-based entropy coding of sample values of a spectral envelope and the usage thereof in audio coding/compression.
Many modern state-of-the-art lossy audio coders, such as those described in [1] and [2], are based on the MDCT and use both irrelevancy reduction and redundancy reduction to minimize the bitrate required for a given perceptual quality. Irrelevancy reduction typically exploits the perceptual limitations of the human hearing system in order to reduce the representation precision or to remove frequency information that is not perceptually relevant. Redundancy reduction exploits the statistical structure or correlation of the remaining data in order to achieve the most compact representation of that data, typically by using statistical modeling in conjunction with entropy coding.
Among others, parametric coding concepts are used to efficiently code audio content. Using parametric coding, portions of the audio signal such as, for example, portions of the spectrogram thereof, are described using parameters rather than using actual time domain audio samples or the like. For example, portions of the spectrogram of an audio signal may be synthesized at the decoder side, with the data stream merely comprising parameters such as the spectral envelope and optional further parameters controlling the synthesis, in order to adapt the synthesized spectrogram portion to the transmitted spectral envelope. A new technique of this kind is Spectral Band Replication (SBR), according to which a core codec is used to code and transmit the low frequency band component of an audio signal. The transmitted spectral envelope is then used at the decoding side to spectrally shape replications of a reconstruction of the low frequency band component, so as to synthesize the high frequency band component of the audio signal at the decoding side.
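The replicate-and-shape idea behind SBR can be illustrated with a deliberately simplified sketch. It assumes that the decoded low band is available as a list of spectral coefficients, that the high band is obtained by copying those coefficients upward, and that the transmitted envelope holds one target mean energy per band. The names `replicate_and_shape`, `band_edges` and `envelope` are hypothetical and chosen for illustration only; an actual SBR decoder is considerably more involved.

```python
import math

def band_energy(spec, start, stop):
    """Mean energy of the coefficients spec[start:stop]."""
    return sum(c * c for c in spec[start:stop]) / (stop - start)

def replicate_and_shape(low, band_edges, envelope):
    """Copy the low band upward and scale each band to its target energy.

    low        -- decoded low-band coefficients (assumed input)
    band_edges -- band boundaries inside the replicated high band
    envelope   -- target mean energy per band (assumed parameterisation)
    """
    high = list(low[:band_edges[-1]])           # spectral replication
    for b, target in enumerate(envelope):
        start, stop = band_edges[b], band_edges[b + 1]
        current = band_energy(high, start, stop)
        gain = math.sqrt(target / current) if current > 0 else 0.0
        for i in range(start, stop):            # spectral shaping
            high[i] *= gain
    return high

low = [1.0, -2.0, 0.5, 0.5, 3.0, -1.0, 2.0, 0.0]
high = replicate_and_shape(low, band_edges=[0, 4, 8], envelope=[0.25, 0.01])
```

After shaping, each band of `high` has the mean energy given by the envelope, while the fine structure within a band is inherited from the low band.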
A spectral envelope, within the framework of the coding techniques outlined above, is transmitted within a data stream at some suitable spectrotemporal resolution. Similarly to the transmission of spectral envelope sample values, scale factors for scaling spectral line coefficients or frequency domain coefficients, such as MDCT coefficients, are likewise transmitted at some suitable spectrotemporal resolution that is coarser than the original spectral line resolution, for example coarser in a spectral sense.
A fixed Huffman coding table could be used in order to convey information on the samples describing a spectral envelope or scale factors or frequency domain coefficients. An improved approach is to use context coding such as, for example, described in [2] and [3], where the context used to select the probability distribution for encoding a value extends both across time and frequency. An individual spectral line, such as an MDCT coefficient value, is the real projection of a complex spectral line, and it may appear somewhat random in nature when the phase varies from one frame to the next, even if the magnitude of the complex spectral line is constant across time. Good results therefore require a rather complex scheme of context selection, quantization, and mapping, as described in [3].
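The effect of the phase on an individual MDCT coefficient can be made concrete with a small numerical sketch. It assumes an unwindowed MDCT of a frame of 2N samples and a unit-amplitude sinusoid whose frequency lies exactly on bin `K0`; the helper names and the chosen values of `N_HALF` and `K0` are illustrative assumptions. Under these assumptions the coefficient equals N times the cosine of the start phase, so a tone of constant magnitude yields very different coefficient values from one frame to the next.

```python
import math

def mdct_coefficient(frame, k):
    """Unwindowed MDCT coefficient k of a frame holding 2N samples."""
    n_half = len(frame) // 2
    return sum(
        x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
        for n, x in enumerate(frame)
    )

def stationary_tone(n_half, k0, phase):
    """Unit-amplitude sinusoid centred on bin k0 with the given start phase."""
    return [
        math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k0 + 0.5) + phase)
        for n in range(2 * n_half)
    ]

N_HALF, K0 = 16, 3
phases = (0.0, math.pi / 3, math.pi / 2, math.pi)

# Same magnitude in every frame, only the phase differs.
coeffs = [mdct_coefficient(stationary_tone(N_HALF, K0, p), K0) for p in phases]

for p, c in zip(phases, coeffs):
    print(f"phase {p:.3f} rad -> coefficient {c:+.3f}")
```

For these four phases the coefficient moves from N through N/2 and zero to -N, which is why contexts built directly on spectral line values have to cope with large frame-to-frame fluctuations.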
In image coding, the contexts used are typically two-dimensional, extending across the x and y axes of an image such as, for example, in [4]. In image coding, the values are in the linear domain or in a power-law domain, for example as a result of gamma adjustment. Additionally, a single fixed linear prediction may be used in each context as a plane fitting and rudimentary edge detection mechanism, and the prediction error may be coded. Parametric Golomb or Golomb-Rice coding may be used for coding the prediction errors. Run-length coding is additionally used to compensate for the difficulty of directly encoding very low entropy signals, for example below 1 bit per sample, using a bit-based coder.
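The combination of a fixed linear predictor and Golomb-Rice coding of the prediction error can be sketched as follows. The sketch assumes a planar predictor of the form left + up - up_left, which is exact on any plane, and a zigzag mapping of signed errors onto non-negative integers before Rice coding with parameter k. Both choices, as well as all function names, are illustrative assumptions and are not taken from [4].

```python
def plane_predict(left, up, up_left):
    """Fixed linear predictor that reproduces any planar surface exactly."""
    return left + up - up_left

def zigzag(e):
    """Map signed errors 0, -1, 1, -2, 2, ... onto 0, 1, 2, 3, 4, ..."""
    return 2 * e if e >= 0 else -2 * e - 1

def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

def rice_encode(e, k):
    """Golomb-Rice code: unary quotient, '0' terminator, k-bit remainder."""
    u = zigzag(e)
    q, r = u >> k, u & ((1 << k) - 1)
    bits = "1" * q + "0"
    if k:
        bits += format(r, f"0{k}b")
    return bits

def rice_decode(bits, k):
    """Decode a single Golomb-Rice codeword produced by rice_encode."""
    q = bits.index("0")
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return unzigzag((q << k) | r)

# Code the error of one sample given its three causal neighbours.
actual = 10
error = actual - plane_predict(left=10, up=12, up_left=9)   # 10 - 13 = -3
codeword = rice_encode(error, k=2)                          # "1001"
```

Every codeword contains at least one bit, so a bit-based coder of this kind cannot go below 1 bit per sample on its own; this is the limitation that run-length coding addresses for very low entropy signals.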
However, despite the improvements in connection with the coding of scale factors and/or spectral envelopes, there is still a need for an improved concept for coding sample values of a spectral envelope. Accordingly, it is an object of the present invention to provide a concept for coding spectral values of a spectral envelope.