The Joint Photographic Experts Group (JPEG) published a standard for compressing image data that became commonly known as the “JPEG standard.” The JPEG standard is based on a discrete cosine transform (DCT) compression algorithm that uses Huffman encoding. The compression is limited to 8 bits/pixel. In an effort to provide better compression quality for a broader range of applications, the group developed the “JPEG 2000 standard” (International Telecommunication Union (ITU) Recommendation T.800, August 2002). The JPEG 2000 standard is based on a discrete wavelet transform (DWT) and adaptive binary arithmetic coding.
The JPEG 2000 standard generally sets forth the following approach. Input image data is partitioned into rectangular, non-overlapping tiles of equal size. The sample values in each tile are level shifted and the color data is decorrelated. A DWT is then applied to these pre-processed image samples. The DWT applies a number of filter banks to the pre-processed image samples and generates a set of wavelet coefficients for each tile. The wavelet coefficients are then quantized and thereafter subjected to arithmetic coding. Each subband of coefficients is encoded independently of the other subbands using a block coding approach: the subband is partitioned into a set of rectangular blocks of coefficients called code-blocks. The code-blocks are independently encoded, and the encoded code-blocks are then formatted into a suitable bitstream.
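The partitioning and level-shifting steps described above can be illustrated with a minimal Python sketch. The function names `partition` and `level_shift` are hypothetical and chosen only for illustration. The sketch assumes unsigned input samples, so that the level shift subtracts half of the sample range, and it simply lets blocks on the right and bottom edges come out smaller when the dimensions do not divide evenly. The same `partition` helper may be applied both to an image (to obtain tiles) and to a subband of coefficients (to obtain code-blocks).

```python
def partition(samples, block_h, block_w):
    """Split a 2-D list into non-overlapping rectangular blocks, in raster order.

    Illustrative helper (name is hypothetical). Blocks on the right and
    bottom edges are smaller if the dimensions do not divide evenly.
    """
    rows, cols = len(samples), len(samples[0])
    blocks = []
    for top in range(0, rows, block_h):
        for left in range(0, cols, block_w):
            block = [row[left:left + block_w]
                     for row in samples[top:top + block_h]]
            blocks.append(block)
    return blocks


def level_shift(tile, bit_depth=8):
    """Center unsigned samples around zero by subtracting 2**(bit_depth - 1).

    Assumption: samples are unsigned integers in [0, 2**bit_depth - 1].
    """
    offset = 1 << (bit_depth - 1)
    return [[sample - offset for sample in row] for row in tile]


# Example: a 4x4 image with 4-bit samples 0..15, split into 2x2 tiles.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = partition(image, 2, 2)
shifted = level_shift(tiles[0], bit_depth=4)
```

In this example the first tile holds the samples 0, 1, 4 and 5, and shifting by 8 (half of the 4-bit range) maps them to signed values centered on zero. Color decorrelation, the DWT, and quantization are omitted from the sketch.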
The first part of encoding the code-blocks is referred to as coefficient bit modeling, and it is in many cases viewed as the computational bottleneck of JPEG 2000 encoding systems. Coefficient bit modeling includes two stages: context labeling and context word encoding. In context labeling, the coefficients in a code-block are processed bitplane by bitplane, commencing with the most significant bitplane in which any coefficient of the code-block has a non-zero bit. For each coefficient in the bitplane, a context label is generated in one of three encoding passes, and each context label is used in context word encoding to generate a code that describes the coefficient in that bitplane. A coefficient becomes significant when its first non-zero magnitude bit is encountered. A straightforward implementation of the coefficient bit modeler codes the bitplanes in a bit-serial manner. However, a bit-serial implementation is likely to be very slow and to consume hardware clock cycles on the order of 3×N², where the code-block is N×N. Parallel architectures may be used to alleviate the large computational requirements of a serial approach, but they consume a large quantity of chip resources. The present invention may address one or more of the above issues.
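The bitplane ordering and the notion of significance described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: it tracks when each coefficient first becomes significant, but it does not generate the context labels or assign coefficients to the three encoding passes defined by the standard. The function names are hypothetical, and `serial_cycle_estimate` merely evaluates the 3×N² figure quoted above.

```python
def top_bitplane(block):
    """Index of the highest bitplane containing a non-zero magnitude bit.

    Returns -1 when every coefficient in the block is zero.
    """
    max_magnitude = max(abs(coeff) for row in block for coeff in row)
    return max_magnitude.bit_length() - 1


def significance_by_bitplane(block):
    """Yield (bitplane, newly_significant) from the top bitplane down to 0.

    A coefficient becomes significant in the bitplane where its first
    non-zero magnitude bit is encountered. Positions are (row, column).
    """
    significant = set()
    for plane in range(top_bitplane(block), -1, -1):
        newly_significant = []
        for r, row in enumerate(block):
            for c, coeff in enumerate(row):
                bit = (abs(coeff) >> plane) & 1
                if bit and (r, c) not in significant:
                    significant.add((r, c))
                    newly_significant.append((r, c))
        yield plane, newly_significant


def serial_cycle_estimate(n):
    """Evaluate the 3 x N^2 order-of-magnitude figure for an N x N code-block."""
    return 3 * n * n


# Example: a 2x2 code-block of signed, quantized coefficients.
block = [[5, -2],
         [0, 1]]
schedule = list(significance_by_bitplane(block))
```

For this block, the coefficient 5 becomes significant in bitplane 2, the coefficient -2 in bitplane 1, and the coefficient 1 in bitplane 0; the zero coefficient never becomes significant. For a 64×64 code-block, `serial_cycle_estimate(64)` evaluates to 12288.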