1. Field of the Invention
The present invention relates generally to data processing and, more particularly, to data filtering and data compression for compound document pages including tristimulus spatial coordinate color image data.
2. Description of Related Art
Raster-based printers use a coding technique which codes each picture element, commonly called a "pixel," of alphanumeric character text or a computer graphic into a digital data format. A "compound document" includes both text and graphics, for example, an advertising page having both text and photographs. Data compression is used to reduce a data set for storage and transfer. Compressed raster data is output by a computer for decompression and printing by a hard copy apparatus such as a laser printer or ink-jet printer, facsimile machine, or the like. Reductions in the total amount of data needed to transfer a complete page data set compensate for limitations in input/output ("I/O") data rates and I/O buffer sizes, particularly in a limited-memory hard copy apparatus that receives such raster-based data. With raster data, the goal is to reduce the quantity of data transferred without affecting the visual quality characteristics of the document page. The following descriptions assume the knowledge of an average person skilled in the art of both raster-based printing and data compression techniques. As used herein, the term "image data" refers to photographs or other digitally scanned, or otherwise produced, sophisticated graphics.
Computerized systems that utilize loss-less compression techniques generally do not perform well on image data. While achieving 100:1 compression on text and business graphics (line art, bar charts, and the like) data, these complex algorithms usually achieve less than 2:1 compression of image data. Conversely, while image data can be compressed effectively with a "lossy" algorithm without significantly affecting perceptible image quality (e.g., the JPEG industry standard for photographs, which has the disadvantage of being relatively slow in and of itself), data compression solutions that rely solely on lossy algorithms visibly degrade text data (such as by leaving visual artifacts), even at relatively low levels of compression. Moreover, lossy compression techniques do not achieve the desirable high compression ratios. Still further, the advantages of JPEG-like compression over other techniques are reduced when compressing image data that have been scaled using a pixel-replication scaling algorithm common to rasterized compound documents (e.g., 150 dot-per-inch ("dpi") image data scaled up to a resolution of 300-dpi or 600-dpi).
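By way of illustration only, pixel-replication scaling of the kind referred to above may be sketched as follows. The function names replicate and run_lengths, the 2-by-2 sample raster, and the use of simple run-length counting as a stand-in for a loss-less coder are assumptions made for this example and are not taken from any particular rasterization engine or compression standard.

```python
def replicate(pixels, width, factor):
    """Scale a row-major raster by an integer factor using pixel replication.

    Each source pixel becomes a factor-by-factor block of identical pixels,
    e.g. factor=2 models 150-dpi image data scaled up to 300-dpi.
    """
    rows = [pixels[i:i + width] for i in range(0, len(pixels), width)]
    out = []
    for row in rows:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [p for p in row for _ in range(factor)]
        for _ in range(factor):
            out.extend(wide)
    return out


def run_lengths(pixels):
    """Collapse a pixel sequence into [value, count] runs (illustrative only)."""
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1
        else:
            runs.append([p, 1])
    return runs


# Hypothetical 2x2 image of tristimulus (three-component) pixels, all distinct.
source = [(10, 20, 30), (40, 50, 60), (70, 80, 90), (15, 25, 35)]
scaled = replicate(source, width=2, factor=2)  # 4x4 raster, 16 pixels

print(len(source), len(run_lengths(source)))
print(len(scaled), len(run_lengths(scaled)))
```

In this sketch the scaled raster holds four times as many pixels as the source, yet every scaled row consists of repeated pixels and each scaled row is duplicated on the following row. This is the redundancy that pixel replication introduces into rasterized image data.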
Solutions that use a mix of lossy and loss-less data compression are often slow and complex. For example, text and image data are sometimes separated into different channels, one carrying the images under a lossy compression technique, such as JPEG, and the other carrying text and simple business graphics under a loss-less compression technique. This separation of data into individual channels can be slow, and the results are dependent on the architecture of the rasterization engine that initially rasterized the compound document. Moreover, the use of a lossy algorithm sometimes requires custom decompression hardware to achieve acceptable data processing speeds, which adds to the cost of a hard copy product. Again, the advantages of a JPEG-type algorithm are still reduced for images that have been scaled. Moreover, the relatively slow nature of JPEG is not improved even when compressing high-resolution, pixel-replicated image data.
Thus, there is a need for a fast, raster-based data compression technique for the transmission of compound documents, particularly useful for hard copy printing.