1. Field of the Invention
The present invention relates primarily to the field of data compression, and in particular to a method for representation of sign in entropy codes.
Portions of the disclosure of this patent document contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office file or records, but otherwise reserves all rights whatsoever.
2. Background Art
Computer systems are increasingly being used to access, store, and/or process large amounts of data (e.g. audio and video files). It is common to compress data in a computer system so that it can be more easily stored or transmitted. Often both the original data and the compressed data have an associated “sign” (positive or negative) that must be preserved along with the data.
An important aspect of encoding schemes is how they represent the sign of data (often via a “sign bit”). Two techniques are common for fixed-width representations of data, namely two's complement for integers and a leading sign bit for the mantissa of floating point numbers. These two techniques are the standards for the internal representations of integers and floating point numbers in computers. Lossless and lossy JPEG use the one's complement method to represent the sign of variable-width data.
FIG. 1 shows a representation of a lossless JPEG prediction kernel. Here pixel values at pixel positions a, b, and c are available to both the encoder and decoder prior to processing X. The prediction residual for pixel X is defined as: r=y−X, where y can be one of the functions mentioned below, and the choice for the y function is defined in the scan header of the compressed data so that both encoder and decoder use the same value. For example, y can be one of:
    (A) y=0
    (B) y=a
    (C) y=b
    (D) y=c
    (E) y=a+b−c
    (F) y=a+(b−c)/2
    (G) y=b+(a−c)/2
    (H) y=(a+b)/2
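The prediction functions (A) through (H) and the residual r=y−X can be sketched as follows. This is only an illustration: the names `predict` and `residual` are hypothetical, and the division by 2 is assumed to be an integer operation implemented as an arithmetic right shift, which the text above does not specify.

```python
def predict(selection: str, a: int, b: int, c: int) -> int:
    """Return the prediction y for one of the functions (A) to (H)."""
    predictors = {
        "A": lambda: 0,
        "B": lambda: a,
        "C": lambda: b,
        "D": lambda: c,
        "E": lambda: a + b - c,
        # Assumption: "/2" is an integer halving done with a right shift.
        "F": lambda: a + ((b - c) >> 1),
        "G": lambda: b + ((a - c) >> 1),
        "H": lambda: (a + b) >> 1,
    }
    return predictors[selection]()


def residual(selection: str, a: int, b: int, c: int, x: int) -> int:
    """Return the prediction residual r = y - X as defined in the text."""
    return predict(selection, a, b, c) - x


# With a=100 and X=32, function (B) gives y=100 and r=68.
print(residual("B", 100, 0, 0, 32))
```

Because the selection is fixed per scan, a real coder would resolve the function once rather than per pixel; the dictionary lookup here is kept only for readability.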
The prediction residual is computed modulo 2^16 and is expressed as a pair of symbols, namely the category and the magnitude. As is known to those of ordinary skill, the first symbol, namely the category, represents the number of bits needed to encode the magnitude. This symbol is Huffman coded.
For example, if the prediction residual for X is 68, an additional 7 bits are needed to uniquely identify the value 68. This prediction residual is then mapped into a two-tuple (category 7, 7-bit code for 68). The compressed representation for the prediction residual consists of the Huffman codeword for category 7, followed by the 7-bit representation of the magnitude. In general, if the value of the residual is non-negative, the code for the magnitude is its direct binary representation. If, on the other hand, the residual is negative, the code for the magnitude is the one's complement of its absolute value. This means that the magnitude code for a negative residual always starts with a zero bit.
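The mapping from a residual to its (category, magnitude) pair can be sketched as below. The function name `category_and_magnitude` is hypothetical, and the Huffman coding of the category is left out because no code table is given here; only the magnitude bits are produced.

```python
def category_and_magnitude(residual: int) -> tuple[int, str]:
    """Map a residual to (category, magnitude bits) using one's complement."""
    magnitude = abs(residual)
    # The category is the number of bits needed for the magnitude.
    category = magnitude.bit_length()
    if category == 0:
        return 0, ""  # a zero residual needs no magnitude bits
    mask = (1 << category) - 1
    if residual >= 0:
        value = residual
    else:
        # One's complement of the absolute value, kept to `category` bits.
        value = ~magnitude & mask
    return category, format(value, f"0{category}b")


print(category_and_magnitude(68))   # (7, '1000100')
print(category_and_magnitude(-68))  # (7, '0111011')
```

The negative case begins with a zero bit because the most significant bit of the absolute value is always one, and the complement flips it.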
Lossy JPEG uses differential coding for the DC coefficients due to the high correlation of DC values among adjacent blocks. For 8-bit-per-pixel data, the DC differentials can take values in the range [−2047, 2047]. This range is divided into 12 size categories, where the i-th category includes all differentials that can be represented by i bits. After a table lookup, each DC differential can be expressed by the pair (size, amplitude), where size is defined as the number of bits needed to represent the amplitude, and the amplitude is simply the amplitude of the differential. Only the first value of this pair, namely the size, is Huffman coded.
Given a DC residual value, its amplitude is calculated as follows. If the residual is non-negative, the amplitude is its binary representation with size bits of precision. If the residual is negative, the amplitude is the one's complement of its absolute value.
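The inverse step, recovering a signed value from a (size, amplitude) pair, follows directly from the rule above: a leading one bit means the bits are the value itself, and a leading zero bit means they are the one's complement of the absolute value. The sketch below assumes the amplitude arrives as a string of '0' and '1' characters; the name `decode_amplitude` is hypothetical.

```python
def decode_amplitude(size: int, bits: str) -> int:
    """Recover the signed value from its size and one's complement amplitude."""
    if size == 0:
        return 0
    if len(bits) != size:
        raise ValueError("amplitude must contain exactly `size` bits")
    value = int(bits, 2)
    if bits[0] == "1":
        return value  # non-negative: direct binary representation
    # Negative: undo the one's complement within `size` bits.
    return value - ((1 << size) - 1)


print(decode_amplitude(7, "0111011"))  # -68
```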
Similarly, for 8-bit-per-pixel data, AC coefficients may take any value in the range [−1023, 1023]. This range is divided into 10 size categories, and, as before, each AC coefficient can be described by the pair (size, amplitude). Since most AC coefficients are zero after quantization, only the nonzero AC coefficients need to be coded. These coefficients are processed in a zigzag order, which allows for a more efficient operation of the run-length coder. The coder yields the value of the next nonzero AC coefficient and a run, which is the number of zero AC coefficients preceding the present one. Hence, each nonzero AC coefficient can be represented by the pair [run/size, amplitude]. The value of the run/size is Huffman coded, and the value of the amplitude (calculated just as in the DC coefficient case) is appended to the code.
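The run-length step can be sketched as follows, assuming the AC coefficients are already in zigzag order. The function name `run_size_pairs` is hypothetical. The sketch stops at the (run, size, amplitude) triples: the Huffman coding of run/size is omitted, and JPEG's special symbols for long zero runs and for trailing zeros are not handled.

```python
def run_size_pairs(ac_coefficients: list[int]) -> list[tuple[int, int, str]]:
    """Return (run, size, amplitude bits) for each nonzero AC coefficient."""
    pairs = []
    run = 0  # zero coefficients seen since the last nonzero one
    for coeff in ac_coefficients:
        if coeff == 0:
            run += 1
            continue
        magnitude = abs(coeff)
        size = magnitude.bit_length()
        mask = (1 << size) - 1
        # Amplitude is computed as in the DC case (one's complement if negative).
        value = coeff if coeff > 0 else ~magnitude & mask
        pairs.append((run, size, format(value, f"0{size}b")))
        run = 0
    return pairs


print(run_size_pairs([5, 0, 0, -3, 0, 1, 0, 0]))
# [(0, 3, '101'), (2, 2, '00'), (1, 1, '1')]
```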
Entropy Coding Using Adaptive Prefix Codes
This scheme operates on data represented in numerical form and encodes positive and negative integers as well as zero. It has particular application to data sets that are clustered about the zero integer, such as image data sets that have been transformed via a wavelet transform or a discrete cosine transform followed by quantization. Assume the integer to be encoded is denoted by “N”, its absolute value as “A” and the number of significant bits in the direct binary representation of A as “L” (so that the most significant non-zero bit, bit L, encodes 2^(L−1)). The entropy code is constructed as L zero bits, followed by a place-holder 1 bit to mark the end of the zeroes, followed by a value portion of length L. For N>0, the value portion is the direct binary representation of N. For N<0, the value portion is the direct binary representation of A with bit L (the most significant bit) cleared. For N=0, L=0 and the value portion has zero length.
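The construction above can be sketched as a pair of functions. The names `encode` and `decode` are hypothetical, bit strings are used instead of packed bits for clarity, and the cleared bit for negative values is assumed to be the most significant bit of A (the L-th bit counting from one).

```python
def encode(n: int) -> str:
    """Encode an integer as L zeros, a 1 bit, then an L-bit value portion."""
    a = abs(n)
    length = a.bit_length()  # L, the number of significant bits of A
    prefix = "0" * length + "1"
    if n == 0:
        return prefix  # L = 0, so the value portion is empty
    if n < 0:
        a &= ~(1 << (length - 1))  # clear the most significant bit of A
    return prefix + format(a, f"0{length}b")


def decode(bits: str) -> int:
    """Invert encode() for a string holding exactly one codeword."""
    length = bits.index("1")  # the zeros before the place-holder give L
    if length == 0:
        return 0
    value_bits = bits[length + 1 : 2 * length + 1]
    value = int(value_bits, 2)
    if value_bits[0] == "1":
        return value  # leading bit set: positive
    return -(value | (1 << (length - 1)))  # restore the cleared bit


for n in (0, 1, -1, 2, -2):
    print(n, encode(n))
```

Since the leading bit of the value portion is always one for a positive integer, clearing it for a negative integer lets that single bit act as the sign without lengthening the codeword.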
The table in FIG. 2 shows the encoding of a few integers, and it can be seen that the codeword for zero (the most frequently occurring integer in image data sets) is the shortest (just one bit), followed by positive and negative one (three bits), and so on. This scheme, when applied to image data, assumes the frequency of data is centered around zero, and so does not require a first pass through the file to determine character frequency, as Huffman's coding scheme does.
As illustrated, the prior art contains several methods for representing the signs of fixed- and variable-width data. The optimal choice of sign representation depends on the encoding scheme being used and on the nature of the data being encoded. The choice will typically be based on considerations such as CPU performance and ease of programming. Thus it is useful to introduce a new method of sign representation which, for both encoding and decoding data, is simple to program and requires minimal CPU usage.