This invention relates to the field of compression and decompression of data. Information is represented in a computer system by binary data in the form of 1s and 0s. Binary data are often maintained in a data storage device. In a computer system, data storage is a limited resource. To more efficiently use data storage resources, data are often compressed prior to storage so that less storage area is required. Upon retrieval, the data are decompressed for use. The need for compression can be demonstrated by describing the way that images are represented in a computer system, the transformation of such images into a form suitable for printing, and the storage problems associated with such images. This discussion is followed by descriptions of compression techniques and prior art approaches to compression.
If a person were to look closely at a television screen, computer display, magazine page, etc., he would see that an image is made up of hundreds or thousands of tiny dots, where each dot is a different color. These dots are known as picture elements, or “pixels,” when they are on a computer display and as dots when printed on a page. The color of each pixel is represented by a number value. To store an image in a computer memory, the number value of each pixel of the picture is stored. The number value typically represents the color and intensity of the pixel.
The accuracy with which a document can be reproduced depends on the “resolution” of the pixels that make up the document. The resolution of a pixel is determined by the range of the number value used to describe that pixel. The range of the number value is limited by the number of “bits” in the memory available to describe each pixel (a bit is a binary number having a value of one (1) or zero (0)). The greater the number of bits available per pixel, the greater the resolution of the document. For example, when only one bit per pixel is available for storage, only two values are available for the pixel. If two bits are available, four levels of color or intensity are available. While greater resolution is desirable, it can lead to greater use of data storage. For example, if each pixel is represented by a 32-bit binary number, 320,000 bits of information would be required to represent a 100×100 pixel image. Such information is stored in what is referred to as a “Frame Buffer” or gray array (“G array”).
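The relationship between bit depth, the number of available levels, and storage can be checked with a short calculation. The following is a minimal sketch; the function names `levels_per_pixel` and `frame_buffer_bits` are illustrative and do not come from the source.

```python
def levels_per_pixel(bits_per_pixel: int) -> int:
    """Number of distinct values a pixel can take with the given bit depth."""
    return 2 ** bits_per_pixel


def frame_buffer_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Total number of bits needed to store every pixel of the image."""
    return width * height * bits_per_pixel


print(levels_per_pixel(1))              # 2 values with one bit per pixel
print(levels_per_pixel(2))              # 4 levels with two bits per pixel
print(frame_buffer_bits(100, 100, 32))  # 320000 bits for a 100x100 image
```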
A black and white printer has resolution of only one bit per pixel or dot. That is, the printer is only capable of printing a black dot at a location or of leaving the location blank. When an image is to be printed on a black and white printer, the image must be transformed so that its bit resolution matches the bit resolution of the printer. This transformation is known as “thresholding” and consists of determining, for each pixel in the source image, whether the dot to be printed at the corresponding location on the printed page is to be black or white.
Although the printer can only print a black and white image, a printed image can appear to have many different shades of gray depending on the pattern of black and white dots. When every other dot is black, for example, the resulting printed image will appear gray, because the human eye blends the tiny dots together. Many printers are capable of printing 600 dots per inch in the horizontal and vertical directions. Because of the large number of tiny dots, other shades of gray can be simulated by the relative percentage of black and white dots in any region. The more black dots in a region, the darker that region appears.
As noted above, when thresholding, a decision is made for each pixel, based on its original color in the source image, of whether to print a black or white dot on the page for that pixel. Consider a thresholding scheme where each pixel in the stored grayscale image may be represented by 8 bits, for example, giving 2^8 = 256 possible values. One thresholding method that does not produce very realistic images is to assign a black value to all image pixels with a value of 128 (out of 256) or above, and a white value to all image pixels with a value of 127 or below. Using thresholding, an entire multi-bit depth frame buffer can be compressed into a one bit per pixel buffer. However, the resulting image is “aliased” (appears like steps or contains jagged edges) and does not approximate the original image. To produce better images, a threshold matrix is generated and used to determine the thresholded value of an image pixel.
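The fixed-level method described above can be sketched in a few lines. This is an illustration only: the function name `threshold_fixed` and the sample values are assumptions, and the sketch follows the convention of the text that a pixel at or above the level becomes a black dot, written here as 1.

```python
def threshold_fixed(gray, level=128):
    """Map each 8-bit pixel to 1 (black) if it is >= level, else 0 (white)."""
    return [[1 if value >= level else 0 for value in row] for row in gray]


# Hypothetical 2x4 grayscale image with values in 0..255.
gray = [
    [0, 127, 128, 255],
    [64, 200, 130, 10],
]
print(threshold_fixed(gray))  # [[0, 0, 1, 1], [0, 1, 1, 0]]
```

Because every pixel is compared against the same level, smooth gradients collapse into hard edges, which is the aliasing the text describes.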
A threshold matrix uses different threshold values for an image pixel, depending on the address of the image pixel in the array. Thus, each cell of the frame buffer corresponds to a threshold matrix cell which has an independent threshold level. The threshold matrix need not be the same size and is often smaller than the G array. For example, at one location, an image pixel may be thresholded to black if its value is 128 or above, while an image pixel at another location may be black only if its value is 225 or higher. The result of applying the threshold matrix is an array of ones (1s) and zeros (0s) that could be printed to represent the original continuous tone image.
FIG. 1A depicts a frame buffer (G array), with indices i and j (G[i][j]). FIG. 1B depicts a threshold matrix (T array), with indices i′ and j′ (T[i′][j′]). FIG. 1C depicts the resulting output or pixel array (P array) with i rows and j columns (P[i][j]). Thus, for example, if the pixel maintains a value of 123 (G11 of FIG. 1A) and the threshold level is 128 (T11 of FIG. 1B), the resulting output value is 0 (P11 of FIG. 1C) because the pixel value is less than the threshold level. Hence the resulting pixel array is created by thresholding the G array as follows:
$$
P[i][j] =
\begin{cases}
1 & \text{if } G[i][j] \ge T[i'][j'] \\
0 & \text{otherwise}
\end{cases}
$$

where G[i][j] is an array of the same dimensions as P but takes on many values, typically 0, 1, . . . , 255. T[i′][j′] is a threshold array, a matrix of dimension n threshold rows by m threshold columns and taking on values in a range like G, typically 1, 2, . . . , 255. For the thresholding, (i′, j′) is a function of (i, j), typically:

$$
i' = i \bmod n, \qquad j' = j \bmod m
$$

where modulo means the remainder after division.
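The thresholding rule and the modulo indexing can be illustrated with a small sketch. The function name `threshold_with_matrix` and the sample G and T values are assumptions chosen for the example; only the comparison G[i][j] ≥ T[i mod n][j mod m] comes from the description above.

```python
def threshold_with_matrix(G, T):
    """Return P where P[i][j] = 1 if G[i][j] >= T[i % n][j % m], else 0."""
    n = len(T)      # number of threshold rows
    m = len(T[0])   # number of threshold columns
    return [
        [1 if G[i][j] >= T[i % n][j % m] else 0 for j in range(len(G[i]))]
        for i in range(len(G))
    ]


# Hypothetical 3x4 frame buffer and 2x2 threshold matrix.
G = [
    [123, 200, 130, 60],
    [50, 240, 200, 230],
    [128, 63, 255, 64],
]
T = [
    [128, 64],
    [192, 225],
]
for row in threshold_with_matrix(G, T):
    print(row)
# [0, 1, 1, 0]
# [0, 1, 1, 1]
# [1, 0, 1, 1]
```

The threshold matrix is smaller than the frame buffer here, so it tiles across the image: row 2 of G reuses threshold row 0, and columns 2 and 3 reuse threshold columns 0 and 1.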
After this thresholding step, the entire page can be represented in a memory of ones (1s) and zeros (0s) by the same number of bits as there are dots on the page. Even at one bit per dot, the amount of memory required can be substantial. For a monochrome page that is 8.5 by 11 inches and has a resolution of 600 dots per inch (dpi), approximately 4.2 megabytes of memory are needed. Such printer memory is referred to as a “buffer”. Memory is an expensive component, and it is advantageous to reduce the amount of memory required in a printer buffer. In the past, this has been accomplished by applying a “compression algorithm” to the data in the buffer. Despite the significant compression which may arise from thresholding, further compression is desired.
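The page-buffer figure follows directly from the page dimensions and the resolution. A minimal sketch of the arithmetic, with the hypothetical helper name `page_buffer_bytes` and one bit per dot assumed by default:

```python
def page_buffer_bytes(width_in, height_in, dpi, bits_per_dot=1):
    """Bytes needed to hold one page at the given resolution and bit depth."""
    dots_across = round(width_in * dpi)
    dots_down = round(height_in * dpi)
    total_bits = dots_across * dots_down * bits_per_dot
    return (total_bits + 7) // 8  # round up to whole bytes


size = page_buffer_bytes(8.5, 11, 600)
print(size)        # 4207500 bytes
print(size / 1e6)  # 4.2075, i.e. approximately 4.2 megabytes
```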
There are currently compression schemes for single bit data. Some of these schemes work better on text data and some work better on image data. They include facsimile standards, such as the ITU standards, which use Huffman-encoded run lengths, and the JBIG (Joint Bi-level Image Experts Group) standard.
There are two distinct families of prior art compression schemes: lossy and lossless. Lossless compression guarantees that no data will be lost upon a compression and decompression sequence. For example, one lossless compression scheme accomplishes this guarantee by searching the data for any repeating sequences such as “001001001001001001”. Using this lossless compression scheme, the sequence “001” would be stored along with the number of times it recurs - six. However, lossless compression schemes may not provide a satisfactory level of compression, e.g., due to the absence of repeating sequences in the source data for the above example. Nonetheless, due to its accuracy, lossless compression is used when storing database records, spreadsheets, or word processing files.
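The repeating-sequence example can be sketched as follows. This is a toy illustration of the idea only, not a scheme taken from any standard; the function names `encode_repeats` and `decode_repeats` are assumptions. It finds the shortest unit that, repeated a whole number of times, reproduces the input exactly.

```python
def encode_repeats(bits: str):
    """Return (unit, count) such that unit * count == bits, with shortest unit."""
    n = len(bits)
    if n == 0:
        return "", 0
    for size in range(1, n + 1):
        unit = bits[:size]
        if n % size == 0 and unit * (n // size) == bits:
            return unit, n // size


def decode_repeats(unit: str, count: int) -> str:
    """Rebuild the original data exactly; nothing is lost."""
    return unit * count


unit, count = encode_repeats("001001001001001001")
print(unit, count)                  # 001 6
print(decode_repeats(unit, count))  # 001001001001001001
print(encode_repeats("0110"))       # ('0110', 1): no repetition, no saving
```

The last call shows the limitation noted above: when the source data contain no repeating sequence, the encoded form is no smaller than the input.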
A lossy scheme achieves a greater level of compression while risking a loss of a certain amount of accuracy. However, certain types of stored information do not require perfect accuracy, such as graphics images and digitized voice. As a result, lossy compression is often utilized on such types of information.
The most interesting prior art method is set forth in the JBIG standard. Pixels are processed in the usual scan order, i.e., row i is entirely processed before row i+1 and, in each row, column j precedes column j+1. Encoding the current pixel uses a context, which comprises a set of nearby pixels that have already been encoded. For example, if ten pixels were used as a neighborhood, there would be 2^10 = 1,024 possible contexts. Based on the frequency with which the current context has been previously encountered, the encoder makes a prediction and estimates the probability that the prediction is accurate. If the estimated probability is reliable and near to certainty, then an arithmetic entropy encoder can losslessly encode the prediction error (e.g., zero (0) for no error, one (1) for error) with much less than one bit per pixel. The decoder has the same context and frequency information, and is therefore able to interpret the zero (0) or one (1) to determine the true value of the pixel.
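The context-modeling idea can be illustrated with a simplified sketch. It is not the JBIG template or its arithmetic coder: the four-pixel neighborhood, the initial counts of one, and the name `estimate_coded_bits` are all assumptions made for the example. Rather than producing a bitstream, the sketch adds up the ideal code length, −log2(p), that an entropy coder could approach when the model assigns probability p to the value that actually occurs.

```python
import math


def estimate_coded_bits(image):
    """Ideal code length in bits under a simple adaptive context model."""
    rows, cols = len(image), len(image[0])
    counts = {}  # context -> [times a 0 followed, times a 1 followed]
    total_bits = 0.0

    def pixel(i, j):
        # Pixels outside the image are treated as 0 (an assumption).
        if 0 <= i < rows and 0 <= j < cols:
            return image[i][j]
        return 0

    for i in range(rows):          # row i is finished before row i + 1
        for j in range(cols):      # column j precedes column j + 1
            # Context: four neighbors that have already been processed.
            context = (pixel(i, j - 1), pixel(i - 1, j - 1),
                       pixel(i - 1, j), pixel(i - 1, j + 1))
            seen = counts.setdefault(context, [1, 1])
            actual = image[i][j]
            probability = seen[actual] / (seen[0] + seen[1])
            total_bits += -math.log2(probability)
            seen[actual] += 1      # the decoder can make the same update
    return total_bits


blank = [[0] * 16 for _ in range(16)]  # 256 pixels, all white
print(estimate_coded_bits(blank))      # about 8 bits in total
```

For the blank 16×16 image the model quickly becomes nearly certain, so the 256 pixels cost roughly 8 bits in total, far below one bit per pixel. A decoder that maintains the same counts in the same scan order would see the same probabilities at every step.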
See also U.S. Pat. No. 5,442,458, issued to Rabbani et al., which is directed to an encoding method for image bitplanes using conditioning contexts based on pixels from the current and previous bitplanes.