The aim of rate allocation in an image compression codec is to find the coding parameters that maximize the image quality for a given target size, or that minimize the encoded size for a selected quality. While it is straightforward to define the size of an encoded image, defining “image quality”, and thus a quality metric, is much harder. The mean squared error (MSE) between the original and the reconstructed image is the most popular metric, but it is known to be a poor model of visual significance.
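As a minimal illustration of the MSE metric mentioned above, the following sketch computes the MSE between two small grayscale images stored as nested lists. The function name and the sample pixel values are hypothetical and chosen only for illustration. The two distorted images, one with a faint uniform shift and one with a single strongly altered pixel, yield the same MSE even though the two errors are distributed very differently across the image.

```python
def mean_squared_error(original, reconstructed):
    """Average of the squared per-pixel differences of two equally sized images."""
    total = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            total += (o - r) ** 2
            count += 1
    return total / count


# Hypothetical 4x4 test image with a constant gray level.
original = [[10] * 4 for _ in range(4)]

# Distortion A: every pixel is off by 1.
uniform_shift = [[11] * 4 for _ in range(4)]

# Distortion B: one pixel is off by 4, all others are exact.
single_spike = [row[:] for row in original]
single_spike[0][0] = 14

print(mean_squared_error(original, uniform_shift))
print(mean_squared_error(original, single_spike))
```

Both calls report the same value because the metric only sums squared differences and ignores where in the image they occur.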
One known technique for encoding images is the JPEG 2000 image encoding format described in ISO/IEC document 15444-1. The JPEG 2000 image compression algorithm uses a wavelet transform as a linear decorrelation filter and the EBCOT (“Embedded Block Coding with Optimized Truncation”) algorithm for rate allocation.
During JPEG 2000 encoding, the wavelet transform step generates a set of wavelet transformed coefficients describing the image. These wavelet transformed coefficients, in the form of values, are partitioned spatially into subsets called codeblocks. Each codeblock comprises a rectangular array of coefficient values. The wavelet transformed coefficients of a single codeblock all belong to a single contiguous subrectangle of the image and all belong to a single frequency sub-band generated by the applied wavelet transform. This frequency sub-band (and hence the corresponding codeblock) is one of four specific types, or “flavors” (LL, LH, HL, or HH), according to the formulae, e.g., filters, with which it is generated by the wavelet transform. Information indicating which of the four possible filters was used to generate a codeblock may be made available for use in later processing of the wavelet transformed coefficients corresponding to that codeblock as part of an encoding process. LL corresponds to a Low-pass horizontal, Low-pass vertical sub-band. LH corresponds to a Low-pass horizontal, High-pass vertical sub-band. HL corresponds to a High-pass horizontal, Low-pass vertical sub-band. HH corresponds to a High-pass horizontal, High-pass vertical sub-band.
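The sub-band and codeblock structure described above can be sketched with a single-level two-dimensional transform. The sketch below is an assumption-laden illustration: it substitutes a simple Haar filter pair for the wavelet filters actually specified by JPEG 2000, and the function names, image values, and block dimensions are hypothetical. It only shows how horizontal and vertical low-pass and high-pass filtering yield the LL, LH, HL, and HH sub-bands and how one sub-band is split into rectangular codeblocks.

```python
def haar_1d(signal):
    """Split a sequence of even length into low-pass and high-pass halves."""
    half = len(signal) // 2
    low = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    high = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return low, high


def filter_columns(rows):
    """Apply haar_1d to every column and return (low, high) as row-major arrays."""
    lows, highs = [], []
    for column in zip(*rows):
        lo, hi = haar_1d(list(column))
        lows.append(lo)
        highs.append(hi)
    # Transpose back so that the results are lists of rows again.
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]


def haar_2d(image):
    """One decomposition level: horizontal filtering first, then vertical."""
    row_low, row_high = [], []
    for row in image:
        lo, hi = haar_1d(row)
        row_low.append(lo)
        row_high.append(hi)
    ll, lh = filter_columns(row_low)    # low-pass horizontal
    hl, hh = filter_columns(row_high)   # high-pass horizontal
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}


def split_into_codeblocks(subband, block_h, block_w):
    """Partition one sub-band into rectangular arrays of coefficients."""
    blocks = []
    for top in range(0, len(subband), block_h):
        for left in range(0, len(subband[0]), block_w):
            rows = subband[top:top + block_h]
            blocks.append([row[left:left + block_w] for row in rows])
    return blocks


# Hypothetical 4x4 image of vertical stripes: it varies only horizontally.
image = [[2, 0, 2, 0] for _ in range(4)]
subbands = haar_2d(image)
codeblocks = split_into_codeblocks(subbands["HL"], 1, 2)
```

For this striped input only the LL and HL sub-bands contain nonzero coefficients, which matches the description of HL as the high-pass horizontal, low-pass vertical sub-band. Each resulting codeblock carries coefficients from exactly one sub-band.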
The next step in JPEG 2000 coding converts the real-valued wavelet transformed coefficients into integer values by a process called quantization. Quantization can be described as a process that maps each value in a subset of the real line to a particular value in that subset. In JPEG 2000, quantization is used to replace each real-valued wavelet coefficient by an integer-valued quantized wavelet transformed coefficient. The set of integer-valued quantized wavelet transformed coefficients is then input to the EBCOT algorithm for rate allocation and encoding. In JPEG 2000, the EBCOT algorithm measures the rate-distortion curve for all codeblocks. Distortion is usually defined as mean squared error (MSE), and rate is the number of bits required to encode the data. In such a system, the MSE measures the error that results when fewer than all of the bits of the quantized wavelet transformed coefficients of a codeblock are provided to a decoder, relative to the decoding result that would be achieved if all of those bits were made available to the decoder. Selecting for encoding all those bitplanes whose slope in the rate-distortion curve is steeper than a given threshold is equivalent to a (discrete) Lagrangian optimization process that selects the minimal mean squared error under the constraint of a user-selected output rate. Due to the discrepancy between the MSE used to control encoding and the perceived visual quality, reconstructed images sometimes show annoying artifacts.
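The quantization and slope-threshold selection described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the EBCOT algorithm itself: the step size, the function names, and the per-codeblock rate-distortion points are hypothetical, each list of points is assumed to start at zero rate and to have strictly decreasing slopes (a convex curve), and the threshold search simply tries every slope that occurs in the data.

```python
import math


def deadzone_quantize(coefficient, step):
    """Map a real-valued coefficient to a signed integer index (assumed step size)."""
    sign = -1 if coefficient < 0 else 1
    return sign * math.floor(abs(coefficient) / step)


def segment_slopes(rd_points):
    """Distortion reduction per bit between consecutive (rate, distortion) points."""
    return [
        (rd_points[i - 1][1] - rd_points[i][1]) / (rd_points[i][0] - rd_points[i - 1][0])
        for i in range(1, len(rd_points))
    ]


def truncation_point(rd_points, threshold):
    """Index of the last point reached through segments steeper than the threshold."""
    chosen = 0
    for index, slope in enumerate(segment_slopes(rd_points), start=1):
        if slope <= threshold:
            break
        chosen = index
    return chosen


def allocate(codeblocks, rate_budget):
    """Lower the slope threshold step by step while the total rate fits the budget."""
    candidates = {s for rd in codeblocks for s in segment_slopes(rd)} | {0.0}
    best = [0] * len(codeblocks)
    for threshold in sorted(candidates, reverse=True):
        cuts = [truncation_point(rd, threshold) for rd in codeblocks]
        total_rate = sum(rd[cut][0] for rd, cut in zip(codeblocks, cuts))
        if total_rate > rate_budget:
            break
        best = cuts
    return best


# Hypothetical (rate in bits, MSE-based distortion) points for two codeblocks.
block_a = [(0, 100.0), (10, 40.0), (20, 20.0), (30, 15.0)]
block_b = [(0, 50.0), (10, 20.0), (20, 12.0)]

indices = [deadzone_quantize(c, 2.0) for c in (-7.3, 1.9, 4.0)]
cuts = allocate([block_a, block_b], rate_budget=30)
```

With a budget of 30 bits, the sketch keeps two segments of the first codeblock and one segment of the second, because those are the three steepest segments overall. Applying one common slope threshold to all codeblocks is what makes the selection equivalent to the discrete Lagrangian optimization mentioned in the text.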
Various attempts at improving upon the MSE approach have been limited in their success for a variety of reasons. Some approaches have attempted to modify the encoding process using fractional moments, which can be computationally very complex to determine. One such fractional moment based approach, described in D. Taubman: “High performance scalable image compression with EBCOT”, IEEE Transactions on Image Processing, Vol. 9 No. 7, pp. 1151-1170, (2000), multiplies the MSE metric used in the EBCOT framework by a masking factor which is generated and used on a per-codeblock basis. However, the computation of the fractional moment is a complex and time-consuming operation, which significantly reduces the usefulness of such an approach.
More advanced approaches to improving coding have attempted to adjust the quantizer bucket size dynamically on a per-coefficient basis. Such approaches can be complex to implement and are incompatible with the JPEG 2000 baseline. Thus, even if one is willing to accept the additional complexity associated with such a technique, they are available only within Part 2 of the JPEG 2000 standard.
In view of the above, it should be appreciated that there is a need for improved methods of implementing visual masking and/or methods of controlling rate allocation as part of an image encoding process. It is desirable that at least some such methods be usable with JPEG 2000 encoding and/or other types of encoding which support the use of an error metric such as the MSE in controlling the encoding process to achieve a desired coding rate or to satisfy a coding target size constraint. It would be highly desirable if the methods could take advantage of variations in an image at the codeblock level and could be implemented without adding a significant amount of complexity to the encoding process.