The present invention is related to processing of signals and, more particularly, to encoding and decoding of signals such as digital visual or auditory data.
Perceptual coding is a known technique for reducing the bit rate of a digital signal by utilizing an advantageous model of the destination, e.g., by specifying the removal of portions of the signal that are unlikely to be perceived by a human user. FIG. 1 illustrates the basic structure of a transform coding system. Applying perceptual coding to such a system typically amounts to applying different levels of distortion to different transform coefficients, according to the impact those coefficients have on human perception. More distortion can be applied to less-perceptible coefficients, while less distortion must be applied to more-perceptible coefficients. A fundamental problem with applying an arbitrary perceptual model to such a system is that most lossy compression schemes rely on the decoder having knowledge of how the source data was distorted. This knowledge is usually necessary for the inverse quantization step (set forth as 150 in FIG. 1), in which values decoded 140 from the entropy code are scaled according to the quantization 120 applied during compression. If the encoder is to apply a sophisticated perceptual model to determine how to quantize each coefficient, the decoder must somehow obtain or recompute the resulting quantization intervals to perform inverse quantization.
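By way of illustration only, the dependence of inverse quantization on the encoder's quantization intervals can be sketched with uniform scalar quantization. The function names, coefficient values, and step sizes below are hypothetical and are not drawn from any standard or from FIG. 1; the sketch merely assumes that each coefficient is divided by its own step size and rounded.

```python
def quantize(coeffs, steps):
    """Encoder side: map each coefficient to an integer index using its own step."""
    return [int(round(c / s)) for c, s in zip(coeffs, steps)]


def dequantize(indices, steps):
    """Decoder side: scale each index back by the step used for that coefficient."""
    return [q * s for q, s in zip(indices, steps)]


# Hypothetical transform coefficients and per-coefficient step sizes.
# Larger steps (more distortion) are assumed for less-perceptible coefficients.
coeffs = [100.0, 37.0, -12.0, 5.0]
steps = [4.0, 8.0, 16.0, 16.0]

indices = quantize(coeffs, steps)

# With the correct steps, each reconstruction error is at most half a step.
recon = dequantize(indices, steps)

# With a wrong guess at the steps, the decoder scales the same indices
# incorrectly, which is why it must obtain or recompute the intervals.
wrong = dequantize(indices, [4.0] * len(indices))
```

In this sketch the integer indices alone are ambiguous; only the pairing of indices with the correct step sizes yields a reconstruction whose error is bounded by half of each quantization interval.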
The simplest approach to addressing this issue is to use predefined quantization intervals, based on a priori information known about the coefficients, such as the frequencies and orientations of the corresponding basis functions. The quantization of a coefficient, accordingly, depends only on the position of that coefficient in the transform and is independent of the surrounding context. See, e.g., ITU-T Rec. T.81, “Digital Compression and Coding of Continuous-Tone Still Images - Requirements and Guidelines,” International Telecommunication Union, CCITT (September 1992) (JPEG standard, ISO/IEC 10918-1). Although this approach is very efficient, it is very limited and cannot take advantage of any perceptual phenomena beyond those that are separated out by the transform 110. A more powerful approach is to define a perceptual model that can be applied in the decoder during decompression. During compression, the encoder dynamically computes a quantization interval for each coefficient based on information that will be available during decoding; the decoder uses the same model to recompute the quantization interval for each coefficient based on the values of the coefficients decoded so far. See, e.g., ISO/IEC 15444-1:2000, “JPEG2000 Part I: Image Coding System,” Final Committee Draft Version 1.0 (Mar. 16, 2000) (JPEG2000 standard); ISO/IEC JTC 15444-2:2000, “JPEG2000 Part II: Extensions,” Final Committee Draft (Dec. 7, 2000) (point-wise extended masking extension). While a well-designed system using such recomputed quantization can yield dramatic improvements over predefined quantization, it is still limited in that the perceptual model utilized cannot involve any information lost during quantization, and the quantization of a coefficient cannot depend on any information that is transmitted after that coefficient in the bitstream.
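The recomputed-quantization approach can be illustrated with a minimal sketch in which the step for each coefficient is derived only from previously reconstructed values, so that the encoder and the decoder arrive at the same step without any extra bits. The masking rule in `context_step`, along with its `base_step`, `gain`, and `window` parameters, is a hypothetical stand-in chosen for brevity; it is not the model used in the JPEG2000 extensions cited above.

```python
def context_step(decoded_so_far, base_step=4.0, gain=0.1, window=2):
    """Hypothetical masking rule: use a larger step when recent values are large."""
    recent = decoded_so_far[-window:]
    activity = sum(abs(v) for v in recent) / window if recent else 0.0
    return base_step * (1.0 + gain * activity)


def encode(coeffs):
    """Quantize each coefficient with a step computed from already-coded values."""
    indices, decoded = [], []
    for c in coeffs:
        step = context_step(decoded)
        q = int(round(c / step))
        indices.append(q)
        # Track the reconstructed value, not the original coefficient, because
        # the reconstructed value is all the decoder will have available.
        decoded.append(q * step)
    return indices, decoded


def decode(indices):
    """Recompute the same steps from the values decoded so far."""
    decoded = []
    for q in indices:
        step = context_step(decoded)
        decoded.append(q * step)
    return decoded


# Hypothetical coefficients in coding order.
coeffs = [90.0, -60.0, 22.0, 7.0, 3.0]
indices, encoder_view = encode(coeffs)
decoder_view = decode(indices)
```

Because `context_step` sees only reconstructed values that precede the current coefficient, the sketch also exhibits the two limitations noted above: the rule cannot use information lost during quantization, and it cannot use anything transmitted later in the bitstream.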
The most flexible approach in the prior art is to include some additional side information in the coded bitstream, thereby giving the decoder some hints about how the coefficient values were quantized. Unfortunately, side information adds bits to the bitstream and, thus, lowers the compression ratio.
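The cost of side information can be made concrete with simple arithmetic. The following sketch assumes, purely for illustration, a 512x512 image at 8 bits per pixel, coefficient data coded at one sixteenth of the original size, and 5 bits of side information for each 8x8 block; none of these figures comes from the cited standards.

```python
def compression_ratio(original_bits, coded_bits, side_info_bits=0):
    """Ratio of original size to total transmitted size, side information included."""
    return original_bits / (coded_bits + side_info_bits)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
original_bits = 512 * 512 * 8          # 8-bit image, 512x512 pixels
coded_bits = original_bits // 16       # coefficient data after entropy coding
num_blocks = (512 // 8) ** 2           # number of 8x8 blocks in the image
side_bits = num_blocks * 5             # assumed 5 bits of hints per block

without_side = compression_ratio(original_bits, coded_bits)
with_side = compression_ratio(original_bits, coded_bits, side_bits)
```

Under these assumed figures, the side information enlarges the bitstream and the ratio with side information is lower than the ratio without it, which is the penalty described above.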
Accordingly, there is a need for a new approach that can fully exploit perceptual modeling techniques while avoiding the need for side information.