1. Field of the Invention
The invention relates generally to data and image compression. More specifically, the invention relates to binary encoding for data and image compression, particularly where adaptive quantization and discrete wavelet transforms are utilized.
2. Description of the Related Art
In image and/or data compression, a set of values, such as text or numerical data that are obtained or input externally, can be converted into binary form (1s and 0s) through a process known as encoding. One way of encoding is to simply convert each decimal number or code for text (such as ASCII numerical designations) into a fixed number of bits (word size). For instance, with a word size of 8 bits, the numbers 23, 128, and 100 would be encoded into binary as the sequence: 00010111 10000000 01100100. This raw or pure binary code serves no further compression function, since it merely takes the data and represents it in binary. Such encoding is inefficient where the zero values greatly outnumber the non-zero values, and especially where such zero data values are consecutive, creating a large "run" of zeroes in binary. Several methods have been developed, particularly in the field of digital communications, to compress data during binary conversion. Two widely used methods of binary encoding for image or data compression are Huffman Coding and Run-Length Encoding.
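The fixed word size conversion described above can be sketched as follows. This is only an illustrative sketch: the function name `to_fixed_width_binary` and the default word size of 8 bits are assumptions made for the example, not part of any described system.

```python
def to_fixed_width_binary(values, word_size=8):
    """Encode non-negative integers as fixed-width binary words.

    Every value costs exactly `word_size` bits, regardless of how
    often it occurs, so no compression is achieved.
    """
    words = []
    for v in values:
        if not 0 <= v < 2 ** word_size:
            raise ValueError(f"{v} does not fit in {word_size} bits")
        # Zero-padded binary representation, e.g. 5 -> "00000101".
        words.append(format(v, f"0{word_size}b"))
    return " ".join(words)


if __name__ == "__main__":
    print(to_fixed_width_binary([23, 128, 100]))
    # A run of zeroes still costs a full word per value.
    print(to_fixed_width_binary([0, 0, 0, 0, 5]))
```

Note that the second call spends 40 bits on five values, four of which are zero, which is the inefficiency that motivates the methods discussed next.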
Classical Huffman Coding is a variable length coding technique which consists of coding each possible value y_i (i = 1, ..., N) inside a given data set S of N possible data values by a codeword of L_i bits. The goal behind Huffman Coding is to minimize the average codeword length Σ L_i P(y_i), summed over i = 1, ..., N, where P(y_i) is the probability of the value y_i occurring in the data set S that is to be encoded. The codewords are chosen so that they are distinguishable from each other. The codewords have a variable length; for instance, for a data set S = {0, 1, 2, 3, 4} the Huffman Coding may use the mapping {0=0, 1=10, 2=110, 3=1110, 4=1111}. If P(0) >> P(1) >> P(2) >> P(3) >> P(4), this technique may be more efficient than a straight fixed length binary representation. Huffman Coding is efficient primarily when coding data sets S that have a small N or a small variance, since L_i grows in size almost linearly with an increase in N, the number of values in the set. For this reason, a technique different from classical Huffman Coding, known as Modified Huffman Coding, has been developed and is used for images that have a larger N in their data sets or more variance.
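As a rough illustration of the quantity being minimized, the sketch below evaluates the average codeword length for the example mapping given above. The codewords are taken from the text; the probabilities in `PROBS` are assumed values chosen only so that P(0) is dominant, and the helper names are hypothetical.

```python
# Codeword mapping from the text; the probabilities are ASSUMED for
# illustration only and do not come from the source.
CODE = {0: "0", 1: "10", 2: "110", 3: "1110", 4: "1111"}
PROBS = {0: 0.60, 1: 0.20, 2: 0.10, 3: 0.06, 4: 0.04}


def average_length(code, probs):
    """Return the sum over all values of L_i * P(y_i)."""
    return sum(len(code[y]) * p for y, p in probs.items())


def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


def encode(values, code):
    return "".join(code[v] for v in values)


def decode(bits, code):
    """Decode by growing a candidate word until it matches a codeword."""
    inverse = {w: y for y, w in code.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:
            out.append(inverse[word])
            word = ""
    if word:
        raise ValueError("bitstream ends in the middle of a codeword")
    return out


if __name__ == "__main__":
    print(average_length(CODE, PROBS))
    print(is_prefix_free(CODE))
    print(decode(encode([0, 0, 3, 1, 4], CODE), CODE))
```

Under these assumed probabilities the average comes to about 1.7 bits per value, compared with the 3 bits that a fixed length code would need for five distinct values. The prefix-free check is one way of making "distinguishable from each other" concrete.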
In Modified Huffman Coding, the data value y_i is encoded by a data structure having two fields: a Range (that identifies a set containing 2^Range values) and a Pointer indicating a specific value inside that set. The field Range is coded using Huffman Coding, and Pointer is expressed as a binary number having a size of Range bits. As an example, the values included in each Range from 0 to n could be:
Range=0: values {0} (the field Pointer is not needed).
Range=1: values {-1, 1} (the field Pointer needs 1 bit to indicate a specific value).
Range=2: values {-3, -2, 2, 3} (the field Pointer needs 2 bits).
Range=n: values {-2^n + 1, -2^n + 2, ..., -2^(n-1), 2^(n-1), ..., 2^n - 2, 2^n - 1} (the field Pointer needs n bits).
For instance, if the data to encode are the integer values {-3, -2, -1, 0, 1, 2, 3}, a possible Modified Huffman Code (MHC) for each of them could be that shown in Table 1:
TABLE 1
______________________________________________________________
y_i    Range    Huffman code    Pointer having    Complete
                for Range       Range bits        code (MHC)
______________________________________________________________
-3     2        11              11                1111
-2     2        11              10                1110
-1     1        10              1                 101
 0     0        0               --                0
 1     1        10              0                 100
 2     2        11              00                1100
 3     2        11              01                1101
______________________________________________________________
The value 0 is coded only by the word 0 (i.e., the Huffman code of the Range and no other bits). The value 3 is coded by 1101, where the leading 11 is the Huffman code of the Range 2, and the following 2 (=Range) bits, 01, "point" to the value 3 among the possible values {-3, -2, 2, 3} that also have a Range=2 in the table. If P(0) is high, the above described approach is more efficient than a normal fixed length binary coding where each value would be coded by 3 bits regardless of its probability of occurrence, since only one bit in the MHC of Table 1 is used for zero values. The MHC is naturally designed for a table look-up architecture and thus can be more efficient for both encoding and decoding.
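The Range/Pointer structure of Table 1 can be sketched in code as follows. The Huffman codes for the Range field are copied from Table 1. The Pointer layout used here (one sign bit followed by the offset of the magnitude within the Range) is an assumption inferred from the seven rows of the table, and the function names are hypothetical; the sketch covers only Range values 0 to 2.

```python
# Huffman codes for the Range field, copied from Table 1.
RANGE_HUFFMAN = {0: "0", 1: "10", 2: "11"}


def mhc_encode(y):
    """Encode one integer as Huffman(Range) followed by a Pointer.

    ASSUMED Pointer layout (inferred from Table 1): the first bit is
    the sign (1 = negative) and the remaining Range-1 bits give
    |y| - 2**(Range-1).
    """
    rng = abs(y).bit_length()          # 0 -> 0, +/-1 -> 1, +/-2..3 -> 2
    prefix = RANGE_HUFFMAN[rng]        # KeyError if y is outside Table 1
    if rng == 0:
        return prefix                  # zero needs no Pointer
    sign = "1" if y < 0 else "0"
    offset = abs(y) - (1 << (rng - 1))
    low = format(offset, f"0{rng - 1}b") if rng > 1 else ""
    return prefix + sign + low


def mhc_decode(bits):
    """Decode a concatenation of MHC codewords back into integers."""
    inverse = {code: rng for rng, code in RANGE_HUFFMAN.items()}
    values, i = [], 0
    while i < len(bits):
        j = i + 1
        while bits[i:j] not in inverse:    # read the Huffman-coded Range
            j += 1
            if j > len(bits):
                raise ValueError("truncated Range field")
        rng = inverse[bits[i:j]]
        i = j
        if rng == 0:
            values.append(0)
            continue
        pointer = bits[i:i + rng]          # Pointer is exactly Range bits
        if len(pointer) < rng:
            raise ValueError("truncated Pointer field")
        i += rng
        low = int(pointer[1:], 2) if rng > 1 else 0
        magnitude = (1 << (rng - 1)) + low
        values.append(-magnitude if pointer[0] == "1" else magnitude)
    return values


if __name__ == "__main__":
    for y in range(-3, 4):
        print(y, mhc_encode(y))
```

Because the decoder learns the Pointer width from the Range field, the codewords can be concatenated without separators, which is what makes the scheme suit a table look-up implementation.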
Another technique, known as Zero Run Length Coding (ZRLC), is a standard technique for encoding a data set containing a large number of consecutive zero values, or "runs" of zeroes. ZRLC consists of encoding only the values different from zero (using Huffman Coding or some other coding) and then interleaving these codewords with a code that specifies, in a manner known both to the coder and to the decoder, the number of zeroes that separate two consecutive non-zero values.
In traditional ZRLC, the encoded zero run data is structured using two segments: a run length and a non-zero value. For instance, instead of coding the data stream:
{0 0 0 0 0 0 5 0 0 0 -6 78 0 0 0 0 -12 0 0 0 0 0 0 0 0 0 0 0 0 1 45 0 0 0 0 0 0 0 0 0 23}
only the following data are coded:
{[6, 5], [3, -6], [0, 78], [4, -12], [12, 1], [0, 45], [9, 23]}
This code (where the first number of each pair indicates a run length of zeroes) indicates 6 zeroes followed by the value 5, then 3 zeroes followed by the value -6, then 0 zeroes followed by the value 78, and so on.
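The pairing of run lengths with non-zero values can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, and the handling of zeroes that trail the last non-zero value (returned here as a separate count) is an assumption, since the text does not address that case.

```python
def zrlc_encode(values):
    """Return ([run_length, value] pairs, trailing zero count).

    Each pair records how many zeroes precede a non-zero value.
    Returning the trailing zero count separately is an ASSUMPTION
    made for this sketch.
    """
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append([run, v])
            run = 0
    return pairs, run


def zrlc_decode(pairs, trailing_zeroes=0):
    """Expand the pairs back into the original sequence of values."""
    out = []
    for run, v in pairs:
        out.extend([0] * run)
        out.append(v)
    out.extend([0] * trailing_zeroes)
    return out


if __name__ == "__main__":
    # The data stream used as the example in the text.
    stream = ([0] * 6 + [5] + [0] * 3 + [-6, 78] + [0] * 4 + [-12]
              + [0] * 12 + [1, 45] + [0] * 9 + [23])
    pairs, tail = zrlc_encode(stream)
    print(pairs, tail)
    assert zrlc_decode(pairs, tail) == stream
```

In a complete coder, the run lengths and the non-zero values would each still need a binary representation (for example Huffman Coding for the values), which this sketch leaves out.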
When either of these techniques is utilized in the compression of images, the nature of the image compression scheme applied prior to encoding should point to the best choice of method. In order to decide whether Huffman Coding or run-length coding should be used, the nature of the image or application (such as videoconferencing) may need to be analyzed. For images with high entropy, run-length encoding may not provide as high a compression ratio as Huffman Coding, and vice versa. The inability to tune the binary encoding process to the characteristics of the data set (or a subset of the data set) may result in overall compression ratios that are not optimal.
In the art of imaging, after an image is captured (by a device such as a digital camera) and perhaps "color interpolated" (missing color components are determined to give the image full color resolution), the image is often "compressed" (prior to binary encoding) to reduce the total number of bits that would be needed to represent the image. "Primary" image compression and subsequent binary encoding of that compressed data play a key role in multimedia applications such as videoconferencing, digital imaging and video streaming over a network. Primary image compression schemes for such applications should be designed to reduce the bit-rate of storage or transmission of the image while maintaining acceptable image quality for the specific application.
One commonly used primary image compression technique is known as JPEG (Joint Photographic Experts Group), which transforms pixels of an input image using the well-known Discrete Cosine Transform (DCT). The resulting transformed pixel values are quantized, or mapped to a smaller set of values, in order to achieve compression. The quality of a compressed image that is decompressed will depend greatly on how the quantization of the transformed pixels is performed. The compression ratio (the size of the original raw image compared to the compressed image) will also be affected by the quantization, but can be further affected by the subsequent binary encoding of the data after quantization.
The subsequent binary encoding of JPEG compressed image data is limited, if run-length encoding is used, by "blocking". For JPEG, an image is divided into blocks of pixels, such as 8×8 or 16×16 blocks. These blocks are processed independently of each other and thus the maximum run-length possible is the size of the block (64 or 256). Thus, if run-length encoding is used, the run-length value is 6 bits or 8 bits wide. Hence, for JPEG, run-length encoding may be held fixed in the number of bits comprising the run-length. Where block-based coding is not utilized, such a fixed number of bits for coding each "run" (i.e., the number of consecutive zeroes) can become a serious limit, since the longest possible run depends on the dimensions of the whole image.
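The relationship between region size and the width of a fixed run-length field can be sketched as below. The helper name `run_length_bits` and the 1024 by 768 image size are assumptions chosen for illustration. The sketch also assumes that every run is terminated by a non-zero value inside the same region, so the longest run is one less than the region size, which is consistent with the 6-bit and 8-bit widths given above.

```python
def run_length_bits(region_size):
    """Bits needed for a fixed-width run-length field.

    ASSUMPTION: each run of zeroes ends with a non-zero value in the
    same region, so run lengths span 0 .. region_size - 1.
    """
    if region_size < 1:
        raise ValueError("region_size must be positive")
    return max(1, (region_size - 1).bit_length())


if __name__ == "__main__":
    for label, size in [("8x8 block", 8 * 8),
                        ("16x16 block", 16 * 16),
                        ("hypothetical 1024x768 image", 1024 * 768)]:
        print(f"{label}: {run_length_bits(size)} bits per run length")
```

The point of the comparison is that a field sized for the worst case over a whole image is much wider than one sized for a block, and most runs would not need that width.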
Other primary image compression schemes, which achieve high compression ratios and also acceptable decompressed image quality, may generate image "sub-bands" or image frequency regions which, unlike JPEG blocks, are not fixed but vary in size, since such schemes do not divide the image into blocks. One such primary image compression scheme based upon the Discrete Wavelet Transform (DWT) is presented in related U.S. patent application, Ser. No. 09/083,383, filed May 21, 1998, entitled "The Compression of Color Images Based on a 2-Dimensional DWT" (hereinafter "DWT Patent"). In such a DWT-based scheme, each sub-band and channel (color plane or difference of color planes) may have properties that justify the use of Huffman Coding rather than run-length encoding, especially in sub-bands with high entropy.
If images are to be compressed and then encoded on a digital camera or other imaging device, ordinary run-length encoding for JPEG is inadequate. Thus, there is a need for a run-length encoding scheme which allows a variable and exceptionally large run-length value to be encoded while keeping a fixed length structure such that decoding can be more real-time. Further, where primary image compression generates data in stages (e.g., one stage for each sub-band), where the data at each stage has properties which favor the use of one type of encoding over another, there is a need to provide an adaptive encoding process so that each stage may be treated with the most efficient encoding possible. Such mechanisms would maximize the compression gained during encoding and thus, reduce the storage/transfer size required for an image or for other data.