This invention relates to the field of compression and decompression of data.
Information is represented in a computer system by binary data in the form of 1s and 0s. This binary data is often maintained in a data storage device. In a computer system, data storage is a limited resource. To more efficiently use data storage resources, data is often compressed prior to storage so that less storage area is required. When the data is retrieved, it is decompressed for use. The need for compression can be demonstrated by describing the way that images are represented in a computer system, the transformation of such images into a form suitable for printing, and the storage problems associated with such images. This discussion is followed by descriptions of compression techniques and prior art approaches to compression.
If a person were to look closely at a television screen, computer display, magazine page, etc., he would see that an image is made up of hundreds or thousands of tiny dots, where each dot is a different color. These dots are known as picture elements, or "pixels," when they are on a computer display and as dots when printed on a page. The color of each pixel is represented by a number value. To store an image in a computer memory, the number value of each pixel of the picture is stored. The number value typically represents the color and intensity of the pixel.
The accuracy with which a document can be reproduced is dependent on the "resolution" of the pixels that make up the document. The resolution of a pixel is determined by the range of the number value used to describe that pixel. The range of the number value is limited by the number of "bits" in the memory available to describe each pixel (a bit is a binary number having a value of 1 or 0). The greater the number of bits available per pixel, the greater the resolution of the document. For example, when only one bit per pixel is available for storage, only two values are available for the pixel. If two bits are available, four levels of color or intensity are available. While greater resolution is desirable, it can lead to greater use of data storage. For example, if each pixel is represented by a 32-bit binary number, 320,000 bits of information would be required to represent a 100×100 pixel image. Such information is stored in what is referred to as a "Frame Buffer" or gray array ("G array").
A black and white printer has resolution of only one bit per pixel or dot. That is, the printer is only capable of printing a black dot at a location or of leaving the location blank. When an image is to be printed on a black and white printer, the image must be transformed so that its bit resolution matches the bit resolution of the printer. This transformation is known as "thresholding" and consists of determining, for each pixel in the source image, whether the dot to be printed at the corresponding location on the printed page is to be black or white.
Although the printer can only do black and white printing, a printed image can appear to have many different shades of gray depending on the pattern of black and white dots. When every other dot is black, for example, the resulting printed image will appear gray, because the human eye blends the tiny dots together. Many printers are capable of printing 600 dots per inch in the horizontal and vertical directions. Because of the large number of tiny dots, other shades of gray can be simulated by the relative percentage of black and white dots in any region. The more black dots in a region, the darker that region appears.
As noted above, when thresholding, a decision is made for each pixel, based on its original color in the source image, of whether to print a black or white dot on the page for that pixel. Consider a thresholding scheme where each pixel in the stored gray-scale image may be represented by 8 bits, for example, giving 256 (2^8) possible values. One thresholding method that does not produce very realistic images is to assign a black value to all image pixels with a value of 128 (out of 256) or above, and a white value to all image pixels with a value of 127 or below. Using thresholding, an entire multi-bit depth frame buffer can be compressed into a one bit per pixel buffer. However, the resulting image is "aliased" (appears like steps or contains jagged edges) and does not approximate the original image. To produce better images, a threshold matrix is generated and used to determine the thresholded value of an image pixel.
A threshold matrix uses different threshold values for an image pixel, depending on the address of the image pixel in the array. Thus, each cell of the frame buffer corresponds to a threshold matrix cell which has an independent threshold level. The threshold matrix need not be the same size and is often smaller than the G array. For example, at one location, an image pixel may be thresholded to black if its value is 128 or above, while an image pixel at another location may be black only if its value is 225 or higher. The result of applying the threshold matrix is an array of 1s and 0s that could be printed to represent the original continuous tone image.
FIG. 1A depicts a frame buffer (G array), with indices i and j (G[i][j]). FIG. 1B depicts a threshold matrix (T array), with indices i′ and j′ (T[i′][j′]). FIG. 1C depicts the resulting output or pixel array (P array) with i rows and j columns (P[i][j]). Thus, for example, if the pixel maintains a value of 123 (G11 of FIG. 1A) and the threshold level is 128 (T11 of FIG. 1B), the resulting value is 0 (P11 of FIG. 1C) due to the fact that the pixel value is less than the threshold level. Hence the resulting pixel array is created by thresholding the G array as follows:
$$P[i][j] = \begin{cases} 1 & \text{if } G[i][j] \ge T[i'][j'] \\ 0 & \text{otherwise} \end{cases}$$
where G[i][j] is an array of the same dimensions as P but takes on many values, typically 0, 1, . . . , 255. T[i′][j′] is a threshold array, a matrix of dimension n threshold rows by m threshold columns and taking on values in a range like G, typically 1, 2, . . . , 255. For the thresholding, (i′,j′) is a function of (i,j), typically:
i′ = i modulo n
j′ = j modulo m
where modulo means the remainder after division.
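The thresholding rule with modulo indexing can be illustrated with a minimal sketch. The function name `threshold_image` and the sample G and T values are illustrative assumptions and are not taken from the figures.

```python
def threshold_image(G, T):
    """Return P where P[i][j] = 1 if G[i][j] >= T[i % n][j % m], else 0."""
    n, m = len(T), len(T[0])  # threshold rows and columns
    return [
        [1 if g >= T[i % n][j % m] else 0 for j, g in enumerate(row)]
        for i, row in enumerate(G)
    ]

# Hypothetical 2x4 gray array and 2x2 threshold matrix (tiled over G).
G = [[123, 200, 50, 130],
     [240, 10, 128, 90]]
T = [[128, 64],
     [192, 225]]
P = threshold_image(G, T)
```

Because the threshold matrix is tiled by the modulo operation, it can be much smaller than the G array, as noted above.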
After this thresholding step, the entire page can be represented in a memory of 1s and 0s by the same number of bits as there are dots on the page. Even at 1 bit per dot, the amount of memory required can be substantial. For a page that is 8.5 by 11 inches and has a resolution of 600 dots per inch (dpi), the amount of memory needed is approximately 4.2 megabytes (if monochrome). This memory in a printer is referred to as a "buffer". Memory is an expensive component, and it is an advantage to reduce the amount of memory required in a printer buffer. In the past, this has been accomplished by applying a "compression algorithm" to the data in the buffer. Despite the significant compression which may arise from thresholding, further compression is desired.
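The 4.2 megabyte figure follows from simple arithmetic, sketched here. The function name `page_buffer_bytes` is a hypothetical helper introduced only for this illustration.

```python
def page_buffer_bytes(width_in, height_in, dpi, bits_per_dot=1):
    """Bytes needed to hold one page at the given size and resolution."""
    dots = round(width_in * dpi) * round(height_in * dpi)
    return dots * bits_per_dot // 8

# 8.5 x 11 inch page at 600 dpi, 1 bit per dot:
# 5100 x 6600 dots = 33,660,000 bits = 4,207,500 bytes.
letter_600dpi = page_buffer_bytes(8.5, 11, 600)
```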
There are currently compression schemes for single bit data. Some of these schemes work better on data that was originally text and some work better on data that was originally an image. Some of these schemes include facsimile standards, such as the ITU standards, using Huffman encoded run lengths and the JBIG (Joint Bi-level Image Experts Group) standard.
There are two distinct families of prior art compression schemes: lossy and lossless. Lossless compression guarantees that no data will be lost upon a compression and decompression sequence. For example, one lossless compression scheme accomplishes this guarantee by searching the data for any repeating sequences such as "001001001001001001". Using this lossless compression scheme, the sequence "001" would be stored along with the number of times it recurs (six). However, lossless compression schemes may not provide a satisfactory level of compression, e.g., due to the absence of repeating sequences in the source data for the above example. Nonetheless, due to its accuracy, lossless compression is used when storing database records, spreadsheets, or word processing files.
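The repeating-sequence example can be sketched as follows. This is a simplified illustration that only handles input consisting entirely of one repeated unit; it is not a description of any particular prior art scheme, and the function names are assumptions.

```python
def encode_repeats(bits):
    """Return (unit, count) for the shortest unit whose repetition equals bits."""
    n = len(bits)
    for size in range(1, n + 1):
        if n % size == 0 and bits[:size] * (n // size) == bits:
            return bits[:size], n // size
    return bits, 1  # only reached for an empty string

def decode_repeats(unit, count):
    """Invert encode_repeats; nothing is lost, so the scheme is lossless."""
    return unit * count

unit, count = encode_repeats("001001001001001001")
```

When the data has no repetition, the "unit" is the whole input and the count is one, which illustrates why such a scheme may yield no compression at all.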
A lossy scheme achieves a greater level of compression while risking a loss of a certain amount of accuracy. However, certain types of stored information do not require perfect accuracy, such as graphics images and digitized voice. As a result, lossy compression is often utilized on such types of information.
The most interesting prior art method is set forth in the JBIG standard. Pixels are processed in the usual scan order, i.e., row i is entirely processed before row i+1 and, in each row, column j precedes column j+1. Encoding the current pixel uses a context, which comprises a set of nearby pixels that have already been encoded. If, e.g., ten pixels were used as a neighborhood, there would be 2^10 possible contexts. Based on the frequency with which the current context has been previously encountered, the encoder makes a prediction and estimates the probability that the prediction is accurate. If the estimated probability is reliable and near to certainty, then an arithmetic entropy encoder can losslessly encode the prediction error (e.g., 0 for no error, 1 for error) with much less than 1 bit per pixel. The decoder has the same context and frequency information, and is therefore able to interpret the 0 or 1 to determine the true value of the pixel.
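How a ten-pixel neighborhood yields one of 2^10 contexts can be shown with a short sketch. The template below is a hypothetical set of offsets chosen only so that every neighbor precedes the current pixel in scan order; it is not the template defined by the JBIG standard, and treating off-image neighbors as 0 is likewise an assumption.

```python
# Hypothetical 10-pixel template: (row offset, column offset) pairs,
# all of which precede the current pixel in scan order.
TEMPLATE = [(-2, -1), (-2, 0), (-2, 1),
            (-1, -2), (-1, -1), (-1, 0), (-1, 1), (-1, 2),
            (0, -2), (0, -1)]

def context_index(P, i, j, template=TEMPLATE):
    """Concatenate already-encoded neighbor bits into a single integer."""
    rows, cols = len(P), len(P[0])
    ctx = 0
    for di, dj in template:
        r, c = i + di, j + dj
        # Assumption: neighbors outside the image are treated as 0.
        bit = P[r][c] if 0 <= r < rows and 0 <= c < cols else 0
        ctx = (ctx << 1) | bit
    return ctx

all_ones = [[1] * 5 for _ in range(3)]
ctx = context_index(all_ones, 2, 2)  # every neighbor is 1
```

Since the decoder has already reconstructed the same neighbors, it can compute the same index without any side information.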
See also U.S. Pat. No. 5,442,458, issued to Rabbani et al., which is directed to an encoding method for image bitplanes using conditioning contexts based on pixels from the current and previous bitplanes.
The present invention provides a method and apparatus for compression of data. Whereas the performance of compression schemes of the prior art is often dependent on the type of data being compressed, the invention provides for the application of a plurality of compression schemes to the data such that improved compression ratios are achieved. A first embodiment provides for compression of each pixel by one of a plurality of different entropy-based compression schemes based upon a probability cost analysis. A second embodiment provides for compression of each pixel based on a hybrid context formed using a plurality of compression schemes for improved probability determination, and thus improved entropy encoding.
The first embodiment of the invention provides a method for choosing the most effective compression scheme per pixel. A multiplicity of compression schemes are utilized and a cost value is associated with each scheme on a pixel by pixel basis. After associating a cost with each scheme, the method selects the most effective scheme, i.e., the one with the lowest summed cost, on a pixel by pixel basis. The selected scheme provides a predicted value of the current pixel and an estimate of the probability that the prediction is correct. The correctness of the prediction, either true or false, is then encoded by an entropy encoder, which uses the estimated probability to encode the true or false outcome in less than one bit, provided that the estimated probability is accurate and not in the vicinity of 0.5. Typically, the estimated probability is greater than 0.95. In a preferred embodiment, the entropy encoding is performed using an arithmetic encoding process.
In one specific embodiment, one of the compression schemes utilized is referred to as the "inverse" scheme. The inverse scheme predicts the gray value of a frame buffer pixel. The scheme examines a set of recently scanned pixels to define a range within which the pixel under consideration is likely to fall. The range is determined as follows.
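The "less than one bit" claim can be made concrete with the ideal code length of an entropy coder, which is the negative base-2 logarithm of the probability assigned to the outcome that actually occurs. The sketch below uses that quantity as the per-pixel cost; this choice, the function names, and the sample cost values are assumptions for illustration, since the exact cost measure is not stated in this passage.

```python
import math

def ideal_bits(p_correct, prediction_was_correct):
    """Ideal entropy-coded length, in bits, of one true/false outcome."""
    p = p_correct if prediction_was_correct else 1.0 - p_correct
    return -math.log2(p)

def pick_scheme(summed_costs):
    """Return the name of the scheme with the lowest summed cost."""
    return min(summed_costs, key=summed_costs.get)

cheap = ideal_bits(0.95, True)    # confident and right: far below 1 bit
costly = ideal_bits(0.95, False)  # confident and wrong: several bits
coin = ideal_bits(0.5, True)      # probability 0.5: exactly 1 bit

# Hypothetical summed costs for two schemes over recent pixels.
best = pick_scheme({"inverse": 3.2, "context": 2.7})
```

The example shows why a probability near 0.5 gives no gain: each outcome then costs a full bit, the same as storing the pixel directly.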
For each previously scanned pixel, the thresholded value and the associated threshold define a range. For example, if the threshold is 150 and the thresholded value is 1, the pre-thresholded value is in the range of 150 to 255. If the thresholded value is 0, the pre-thresholded value is in the range of 0 to 149. The resulting range is intersected with the similarly determined range from the next closest neighbor pixel.
The intersecting of the threshold matrix ranges continues for each previous neighbor pixel until all pixels have been analyzed or until there is a contradiction, namely that a range is encountered which has no common overlap with the most recently calculated range. A value within the most recently calculated range is then selected as an estimate of the value of the pixel under consideration. The difference between the corresponding threshold value and the estimated value is then calculated. Subsequently, a probability table is updated which records the number of times a binary one (1) or zero (0) resulted with that difference calculation. This probability table is used in the encoding stage.
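The range intersection described above can be sketched as follows. The text leaves open which value within the final range is selected, so taking the midpoint is an assumption here, as are the function names and the sign convention (threshold minus estimate) used for the difference.

```python
def range_from_threshold(threshold, bit, max_value=255):
    """Range of pre-thresholded values consistent with one thresholded pixel."""
    return (threshold, max_value) if bit == 1 else (0, threshold - 1)

def estimate_gray(neighbors, max_value=255):
    """neighbors: (threshold, thresholded_bit) pairs, nearest neighbor first."""
    lo, hi = 0, max_value
    for threshold, bit in neighbors:
        n_lo, n_hi = range_from_threshold(threshold, bit, max_value)
        new_lo, new_hi = max(lo, n_lo), min(hi, n_hi)
        if new_lo > new_hi:
            break  # contradiction: keep the most recently calculated range
        lo, hi = new_lo, new_hi
    return (lo + hi) // 2  # assumption: midpoint of the final range

# Ranges [150,255] and [0,199] intersect to [150,199];
# the third neighbor's range [0,99] contradicts it, so intersection stops.
estimate = estimate_gray([(150, 1), (200, 0), (100, 0)])
difference = 180 - estimate  # hypothetical current threshold of 180
```

The resulting difference would then serve as the index into the probability table mentioned above.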
A second scheme utilized in the specific embodiment above is one similar to that of JBIG, referred to as the "context" scheme. A set of pixels in a frame buffer is selected. Subsequent to the pixel set's thresholding, the thresholded values are concatenated to provide a binary number. A frequency table is maintained which records the number of times a binary one (1) or a binary zero (0) occurs for each sequence of binary numbers for the pixel set. The concatenated binary number is used as an index into the frequency table, from which the probabilities for use in the encoding stage are derived.
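A frequency table of this kind can be sketched as a list of counter pairs indexed by the concatenated binary number. The class name `ContextModel` and the choice to initialise every counter to 1 (so that no probability is ever zero) are assumptions, not details given in the text.

```python
class ContextModel:
    """Per-context counts of how often a 0 or a 1 followed that context."""

    def __init__(self, context_bits):
        # counts[ctx] = [zeros seen, ones seen]; starting at 1 is an assumption.
        self.counts = [[1, 1] for _ in range(1 << context_bits)]

    def probability_of_one(self, ctx):
        zeros, ones = self.counts[ctx]
        return ones / (zeros + ones)

    def update(self, ctx, bit):
        self.counts[ctx][bit] += 1

model = ContextModel(context_bits=3)
for bit in [1, 1, 1, 0]:          # outcomes observed after context 0b101
    model.update(0b101, bit)
p_one = model.probability_of_one(0b101)  # (1 + 3) / (2 + 4)
```

Encoder and decoder apply the same updates in the same order, so their tables stay identical throughout.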
In the second embodiment of the invention, a hybrid context is formed using a first number of bits of the hybrid context to store previous pixel information gathered using a first compression scheme, and using further portions of the hybrid context to store previous pixel information gathered using other compression schemes. Statistics are stored in a table indexed by the hybrid context, and entropy encoding is performed based on the probability information in the table indexed by the hybrid context for the current pixel. The combination of different probability determining schemes results in greater probability values for each pixel, and correspondingly higher compression ratios.
In a further specific embodiment, a thirteen bit hybrid context is formed using seven bits to store a quantized gray value difference determined using the inverse scheme, and using six bits to store recently scanned pixel values in the neighborhood in accordance with the context (JBIG) scheme. A table, indexed by the hybrid context, is formed containing the probability for a given pixel, having the respective six-bit JBIG context and the respective seven-bit gray value difference, to be "0" or "1". An entropy encoder encodes the pixel value based on the prediction and the probability determined from the hybrid context.
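Forming the thirteen bit index can be sketched as a simple bit-packing step. Placing the seven difference bits above the six neighborhood bits is an assumption, since the text does not specify the bit order, and the quantized difference is taken here as an already-computed value in the range 0 to 127.

```python
DIFF_BITS = 7  # quantized gray value difference (inverse scheme)
CTX_BITS = 6   # recently scanned neighbor pixels (context scheme)
TABLE_SIZE = 1 << (DIFF_BITS + CTX_BITS)  # 2^13 = 8192 table entries

def hybrid_context(quantized_diff, neighbor_ctx):
    """Pack both pieces of information into one 13-bit table index."""
    if not 0 <= quantized_diff < (1 << DIFF_BITS):
        raise ValueError("quantized_diff must fit in 7 bits")
    if not 0 <= neighbor_ctx < (1 << CTX_BITS):
        raise ValueError("neighbor_ctx must fit in 6 bits")
    return (quantized_diff << CTX_BITS) | neighbor_ctx

index = hybrid_context(quantized_diff=1, neighbor_ctx=0b000101)
```

The probability table then needs one entry per possible index, that is, 8192 entries in this configuration.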