To control the costs of page printers, substantial efforts have been directed at reducing the amount of memory required to store page data. Recently, 600 dot per inch resolution laser printers have been introduced. Such printers handle text, line art and half-tone images. To minimize the amount of memory required in such printers, data compression techniques have been applied to image data. For instance, run length data compression is used by most processors when transferring data to the printer. While run length compression is successful when used with text and line art, its compression capability is much less satisfactory when used with image data.
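The behavior described above can be sketched with a minimal run-length encoder (an illustration only; the byte-pair format and 255-count cap are assumptions, not the format any particular printer uses). Long runs, typical of text and line art, compress well; data with few repeats, such as error-diffused image data, can actually expand:

```python
def rle_encode(data: bytes) -> bytes:
    """Encode runs as (count, value) byte pairs; counts capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out.append(run)          # run length
        out.append(data[i])      # repeated byte value
        i += run
    return bytes(out)

flat = bytes([0xFF] * 100)   # solid black run, as in line art
noisy = bytes(range(100))    # no repeats, as in dispersed image data

print(len(rle_encode(flat)))   # 2 bytes: a 50:1 compression
print(len(rle_encode(noisy)))  # 200 bytes: the "compressed" data doubled
```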
Certain types of images are classified as "ordered dither" or "error diffused". An ordered dither image (also called "clustered") is a half-tone image that includes half-tone gray-level representations. Such an image generally reflects substantial data redundancy and lends itself to a data encoding technique that is lossless in nature. A "lossless" compression technique is one which enables image compression and decompression with no loss of information present in the original image. Error diffused images (a form of "dispersed" dither), by contrast, exhibit little redundancy in their data and require different methods of compression. A "Bayer" dither is another example of a dispersed dither.
To accommodate a variety of image types, while still maintaining a reduced amount of on-board image memory, page printers employ multiple data compression techniques to obtain most efficient compression of image data. In addition to run-length and Huffman encoding, page printers employ varieties of the Lempel-Ziv data compression technique, a cache-based predictor technique, a lossy technique and others. The Lempel-Ziv procedure has several variations but, in general, compresses an input data stream by storing, in a string table, strings of characters encountered in the stream. A "compressor" searches the input data stream to determine a longest match to a stored string in the string table. When a match is found, a code corresponding to the matching string is issued. As the compressor encounters more and more strings, the string table becomes "smarter" and successively longer runs of characters can be compressed by the issued codes. At the receiving end, an identical string table is constructed which enables a decoding of the received code. It is known that Lempel-Ziv techniques are especially effective at compressing text and line art data.
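The string-table mechanism described above can be sketched with an LZW-style compressor, one well-known Lempel-Ziv variant (a minimal illustration of the general technique, not the implementation of any particular printer):

```python
def lzw_compress(data: bytes) -> list[int]:
    """LZW: grow a string table and emit the code of the longest match."""
    table = {bytes([i]): i for i in range(256)}  # seed with all single bytes
    next_code = 256
    out = []
    s = b""
    for b in data:
        candidate = s + bytes([b])
        if candidate in table:
            s = candidate                    # keep extending the match
        else:
            out.append(table[s])             # emit code for longest match
            table[candidate] = next_code     # table "learns" the new string
            next_code += 1
            s = bytes([b])
    if s:
        out.append(table[s])
    return out

codes = lzw_compress(b"ABABABABABAB")
print(len(codes))  # 6 codes for 12 input bytes: matches lengthen as it learns
```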
The cache-based predictor (CBP) technique is described in U.S. patent application Ser. No. 07/963,201, entitled "Cache-Based Data Compression/Decompression", of Rosenberg et al. and assigned to the same assignee as this application. The disclosure of the Rosenberg et al. patent application is incorporated herein by reference.
The basic premise of CBP is to use past received data to predict future data. In CBP, most recently encountered bytes are cached and, depending upon how recently a byte has been used, a variable length bit code is output for each byte encountered. There are a variety of implementations for CBP. Some employ a table of each possible byte value and the predicted byte that will follow it. Others, in lieu of providing a table of each possible byte, employ bytes received in a prior byte stream and compare those bytes with a byte for which a prediction is to be made.
As an example of CBP, a printer is supplied with a bit map memory that comprises a raster image of pixels, each pixel represented by a binary bit value. Each raster scan row is divided into a series of 8 bit bytes. Surrounding pixels, which are spatially positioned so as to have been previously decoded, are used as a context for deciding which of plural caches is to be used to store a most recently used byte. Bit values from surrounding pixels are combined to form a number, enabling a context value (or index) to be derived which specifies which cache should be used.
When a new data byte is encountered that is to be transmitted, a cache is addressed having an address of the context byte which is vertically aligned (for example) on an immediately previous scan row of the pixel image. It is likely that the most recently used entry in the addressed cache will be identical to the uncompressed byte value. The byte value is then matched with the values in the addressed cache and, if a match is found, a code word indicating the position of the matching value in the addressed cache is outputted. The cache is then adapted by moving, if necessary, the matching value up to the most recently used entry in the addressed cache, and the procedure repeats.
If no match is found in the addressed cache, a "not found" code word is outputted and the actual byte value is transmitted. The non-matching byte value is inserted in the addressed cache and the cache is adapted so as to move the displaced byte into a lower level of the cache, to displace a byte already in that lower level, etc. The procedure then repeats.
In summary, a current row byte will always be directed to a cache whose address is the value of the context byte immediately above the current byte. The current byte becomes a context byte when the next scan row is accessed. If the context byte value is already an address of a cache, a new cache does not need to be created. If the context byte value is new, a new cache is established with the context byte as its address.
In the above CBP implementation, a context value consists of the value of a data segment from a raster scan line immediately above the current line. That context value is used as a cache address. The context value can also be formed from a combination of previously decoded data segments. Thus, not only the data segment directly above the current data segment may be used, but a piece of the data segment immediately above and to the left of the current data segment may also be used.
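The context-addressed cache mechanism of the preceding paragraphs can be sketched as follows (a minimal illustration only: the cache depth, the symbolic "hit"/"miss" code words, and the use of a single vertically aligned context byte are assumptions made for the example; the referenced application describes a family of implementations):

```python
from collections import defaultdict

CACHE_DEPTH = 4  # assumed depth for illustration

def cbp_encode(rows: list[bytes]) -> list:
    """Cache-based prediction: the byte directly above selects an MRU cache."""
    caches = defaultdict(list)           # context byte -> MRU-ordered bytes
    out = []
    prev = bytes(len(rows[0]))           # row above the first row: all zeros
    for row in rows:
        for context, byte in zip(prev, row):
            cache = caches[context]
            if byte in cache:
                pos = cache.index(byte)
                out.append(("hit", pos))         # short code: cache position
                cache.insert(0, cache.pop(pos))  # adapt: move to MRU entry
            else:
                out.append(("miss", byte))       # "not found" + literal byte
                cache.insert(0, byte)            # insert at MRU entry
                del cache[CACHE_DEPTH:]          # displace the oldest entry
        prev = row                               # current row becomes context
    return out

# Vertically repetitive raster data yields cache hits once each
# context byte has been seen.
rows = [bytes([0xF0, 0x0F, 0xAA])] * 3
symbols = cbp_encode(rows)
print(sum(1 for kind, _ in symbols if kind == "hit"))  # 3
```

The hits in the third row illustrate why vertically redundant (clustered) data suits CBP: the byte above is an accurate predictor of the byte below.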
CBP performs well when used to compress typical clustered image data, Bayer and fine clustered image data, and data comprising multiple vertical lines.
Lossy compression is generally used as a last resort because it results in lost data from the image which cannot be recovered. Images that generally fall into a category where lossy compression is applied are images that have been half-toned using an error diffusion procedure. Lossy compression involves a reduction in size of image data (in a data cell of an image) to an approximation cell which replaces the original cell. For instance, a 4×4 bit cell of raster video data may be compressed to four bits. Thus, a 4:1 compression is achieved, but substantial data is lost and is not recoverable upon decompression.
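One possible form of such a 4:1 lossy reduction can be sketched as follows (an assumed scheme for illustration, quantizing each cell to a 4-bit density code; the text does not specify which approximation is actually used):

```python
def lossy_compress_cell(cell: list[list[int]]) -> int:
    """Reduce a 4x4 bit cell (16 bits) to a 4-bit code: 4:1 compression."""
    count = sum(sum(row) for row in cell)  # black pixels in cell, 0..16
    return min(count, 15)                  # clamp into 4 bits (0..15)

def lossy_decompress_cell(code: int) -> list[list[int]]:
    """Rebuild an approximation cell with the coded density."""
    bits = [1] * code + [0] * (16 - code)  # pixel placement is lost
    return [bits[r * 4:(r + 1) * 4] for r in range(4)]

cell = [[1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1]]                      # checkerboard: 8 black pixels
code = lossy_compress_cell(cell)           # 8, fits in 4 bits
approx = lossy_decompress_cell(code)       # density preserved, pattern lost
```

The round trip preserves the cell's overall darkness but not the positions of individual pixels, which is the unrecoverable loss the text refers to.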
In U.S. patent application Ser. No. 07/940,111 of Campbell et al., assigned to the same assignee as this application, an adaptive data compression procedure is described for a page printer. Campbell et al. describe a method for choosing a data compression procedure which enables optimum data compression to be achieved. However, the Campbell et al. procedure requires that each compression technique actually be tried upon a portion of a received image to determine which compression technique works best. While the Campbell et al. procedure is highly effective in achieving substantial data compression, at times, substantial processing time is required to arrive at a decision as to which method of data compression is best for the received image.
Accordingly, it is an object of this invention to provide a page printer which includes a variety of data compression procedures and is adaptive in choosing a data compression procedure, based upon characteristics of received data.
It is another object of this invention to provide a page printer which intelligently selects a data compression technique so as to enable efficient use of limited printer memory.
It is yet another object of this invention to provide a page printer with a system for data compression which does not require a test and retry procedure to decide upon a compression procedure to be employed.