In digital systems, documents in image format are often compressed to save storage costs or to reduce transmission time through a transmission channel. Lossless compression can be applied to these documents and can achieve very good compression on regions of the document that are computer rendered, such as characters and graphics. However, areas of the document that contain scanned image data will not compress well with lossless methods. Compression technologies such as JPEG work well on scanned, continuous tone areas of the document. Image quality problems arise with this compression technology, and with transform-coding technologies in general, at the high contrast edges that are produced by computer rendered objects. The solution to this problem is to apply different compression technologies to different parts of the document to optimize image quality and compressibility.
A method for digital image compression of a raster image is disclosed which uses different compression methods for selected parts of the image and which adjusts the compression and segmentation parameters to control the tradeoff between image quality and compression. The image, including rendering tags that can accompany each pixel, is encoded into a single data stream for efficient handling by disk, memory and I/O systems. The uniqueness of this system is in the content-dependent separation of the image into lossy and lossless regimes, the transmission of only those blocks containing information, and the adjustable segmentation and compression parameters used to control the image data rate (compression rate) averaged over extremely small intervals (typically eight scan lines).
The graphic arts world, and Scitex in particular, as exemplified in the TIFF/IT standard (ISO12639:1997E, "Graphic Technology Prepress Digital Data Exchange Tag Image File Format for Image Technology (TIFF/IT)"), has separated documents into continuous tone (CT) pictures and line work (LW), maintaining different resolutions for each and applying different compression techniques to each (JPEG and run length encoding, respectively). The links between the two image planes are found in the LW channel.
U.S. Pat. No. 5,225,911 to Buckley et al. uses similar encodings but replaced the LW channel with several data streams including mask, color, and rendering tags.
Compressed image printing has been used for over a decade for binary images using one of several standard or proprietary formats: CCITT, JBIG, and Xerox Adaptive (Raster Encoding Standard, as discussed in Buckley, Interpress). These single plane compression schemes are lossless and, although often quite effective (20:1), can, for some images, give little or no compression.
U.S. patent application Ser. No. 09/206,487 also separates the image into two planes, but each plane is completely sent. Three data streams are used (two image planes and a separation mask) and no mechanism exists to control local data rate.
JPEG is a standard for compressing continuous tone images. The acronym stands for Joint Photographic Experts Group. JPEG is divided into a baseline system that offers a limited set of capabilities, and a set of optional extended system features. JPEG provides a lossy high-compression image coding/decoding capability. In addition to this lossy coding capability, JPEG incorporates progressive transmission and a lossless scheme as well.
JPEG utilizes a discrete cosine transform (DCT) as part of the encoding process to provide a representation of the image that is more suitable to lossy compression. The DCT transforms the image from a spatial representation to a frequency representation. Once in the frequency domain, the coefficients are quantized to achieve compression. A lossless encoding is used after quantization to further improve compression performance. The decoder executes the inverse operations to reconstruct the image.
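The transform and quantization steps described above can be illustrated with a minimal sketch. It is not the JPEG reference implementation: it assumes a single 8x8 block, an orthonormal DCT-II, and one uniform quantizer step in place of JPEG's per-coefficient quantization tables, and it omits the final lossless entropy coding. The function names are hypothetical.

```python
import math

N = 8  # JPEG transforms the image in 8x8 blocks


def _scale(k):
    # Orthonormal scaling factor for DCT-II basis function k.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _cos(i, k):
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Spatial samples -> frequency coefficients (forward 2-D DCT-II)."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _cos(x, u) * _cos(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Frequency coefficients -> spatial samples (inverse of dct_2d)."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _cos(x, u) * _cos(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    # The lossy step: small coefficients round to zero.
    # Assumption: one uniform step instead of a JPEG quantization table.
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]


# A flat block concentrates all of its energy in the single DC coefficient.
flat = [[100.0] * N for _ in range(N)]
levels = quantize(dct_2d(flat), 16)
restored = idct_2d(dequantize(levels, 16))
```

For a flat block only one quantized level is nonzero, which is why smooth continuous tone regions compress well under this scheme, while sharp edges spread energy across many coefficients.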
Dictionary based compression methods use the principle of replacing substrings in a data stream with a codeword that identifies that substring in a dictionary. This dictionary can be static if knowledge of the input stream and statistics are known or can be adaptive. Adaptive dictionary schemes are better at handling data streams where the statistics are not known or vary.
Many adaptive dictionary coders are based on two related techniques developed by Ziv and Lempel. The two methods are often referred to as LZ77 (or LZ1) and LZ78 (or LZ2). Both methods use a simple approach to achieve adaptive compression. A substring of text is replaced with a pointer to a location where the string has occurred previously. Thus the dictionary is all or a portion of the input stream that has been processed previously. Using the previous strings from the input stream often makes a good choice for the dictionary, as substrings that have occurred will likely reoccur. The other advantage to this scheme is that the dictionary is transmitted essentially at no cost as the decoder can generate the dictionary from the previously coded input stream. The many variations of LZ coding differ primarily in how the pointers are represented and what the pointers are allowed to refer to.
LZ1 is a relatively easy to implement version of a dictionary coder. The dictionary in this case is a sliding window containing the previous data from the input stream. The encoder searches this window for the longest match to the current substring in the input stream. Searching can be accelerated by indexing prior substrings with a trie, hash table, or binary search tree. Decoding for LZ1 is very fast in that each code word is an array lookup and a length to copy to the output (uncoded) data stream.
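The sliding-window search and the fast copy-based decoding can be sketched as follows. This is an illustrative sketch only: it assumes the classic (offset, length, next byte) triple as the code word, uses a brute-force window search rather than an indexed one, and the window and match-length limits are arbitrary choices. The function names are hypothetical.

```python
def lz77_encode(data: bytes, window: int = 4096, max_len: int = 18):
    """Encode data as a list of (offset, length, next_byte) triples."""
    out = []
    i = 0
    while i < len(data):
        best_off, best_len = 0, 0
        # Brute-force search of the sliding window for the longest match.
        for j in range(max(0, i - window), i):
            k = 0
            # Stop one byte short of the end so a literal next byte always
            # exists. The match may overlap the current position.
            while (k < max_len and i + k < len(data) - 1
                   and data[j + k] == data[i + k]):
                k += 1
            if k > best_len:
                best_off, best_len = i - j, k
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out


def lz77_decode(triples):
    """Rebuild the data: each triple is an array lookup plus a copy length."""
    out = bytearray()
    for off, length, nxt in triples:
        start = len(out) - off
        for k in range(length):
            out.append(out[start + k])  # byte by byte so overlapping copies work
        out.append(nxt)
    return bytes(out)


sample = b"abracadabra abracadabra"
assert lz77_decode(lz77_encode(sample)) == sample
```

Note that the decoder needs no search at all, which reflects the asymmetry described above: encoding cost is dominated by match finding, while decoding is a sequence of copies.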
In contrast to LZ1, where pointers can refer to any substring in the window of prior data, the LZ2 method places restrictions on which substrings can be referenced. However, LZ2 does not have a window to limit how far back substrings can be referenced. This avoids the inefficiency of having more than one coded representation for the same string that can occur frequently in LZ1.
LZ2 builds the dictionary by matching the current substring from the input stream to a dictionary that is stored. This stored dictionary is adaptively generated based on the contents of the input stream. As each input substring is searched in the dictionary, the longest match will be located, but starting at the current symbol in the input stream. So if the character "a" was the first part of a substring, then only substrings that started with "a" would be searched. Generally this leads to a good match of input substring to substrings in the dictionary. However, if a substring "bacdef" were in the dictionary, then "acdef" from the input stream would not match this entry since the substring in the dictionary starts with "b". This is different from LZ1, which is allowed to generate a best match anywhere in the window and could generate a pointer to "acdef".
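A minimal sketch of this dictionary-building behavior follows. It assumes the textbook LZ78 code word, a pair of (index of the longest matching dictionary phrase, next character), and an unbounded dictionary; practical variants bound the dictionary and pack the pairs into bits. The function names are hypothetical.

```python
def lz78_encode(text: str):
    """Encode text as (phrase_index, next_char) pairs; index 0 is the empty phrase."""
    dictionary = {"": 0}
    out = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the match, always anchored at the phrase start.
            phrase += ch
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)  # learn the new phrase
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase: emit it with no next char.
        out.append((dictionary[phrase], ""))
    return out


def lz78_decode(pairs):
    """Rebuild the text; the decoder regenerates the same dictionary as it goes."""
    phrases = [""]
    out = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


pairs = lz78_encode("abababab")
# pairs == [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a'), (2, '')]
assert lz78_decode(pairs) == "abababab"
```

Because dictionary lookups are keyed on whole phrases from their first character, an entry such as "bacdef" can never be used to code input beginning with "a", which is the restriction contrasted with LZ1 above.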
U.S. Pat. No. 5,479,587 discloses a print buffer minimization method in which the raster data is compressed by trying different compression procedures with increasing compression ratios until the raster data is compressed sufficiently to fit in a given print buffer. Each time, a compression procedure with a higher compression ratio is selected from a predefined repertoire of such procedures, ranging from lossless ones such as run-length encoding to lossy ones. Generally, lossless encoding is efficient on text and line art data while lossy encoding is effective on image data. However, this method may produce poor print quality when the nature of the raster page calls for lossy compression in order to achieve a predetermined compression ratio. This is because only one selected compression procedure is summarily applied across each strip of the page. When the strip contains both image data and text or line art data, the lossy compression procedure will generally blur the sharp lines that usually delineate text or line art data, or may introduce undesirable artifacts.
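The escalation strategy described above can be sketched in a few lines. This is not the procedure of the cited patent: the repertoire here is an assumption chosen for illustration, with zlib standing in for the lossless stage and crude bit-dropping followed by zlib standing in for the lossy stages. All names are hypothetical.

```python
import zlib


def lossless(data: bytes) -> bytes:
    return zlib.compress(data, 9)


def make_lossy(drop_bits: int):
    # Stand-in for a lossy coder: discard low-order bits, then compress.
    mask = (0xFF << drop_bits) & 0xFF

    def procedure(data: bytes) -> bytes:
        return zlib.compress(bytes(b & mask for b in data), 9)

    return procedure


# Hypothetical repertoire, ordered from least to most aggressive.
PROCEDURES = [
    ("lossless", lossless),
    ("lossy-2bit", make_lossy(2)),
    ("lossy-4bit", make_lossy(4)),
]


def fit_strip(strip: bytes, buffer_size: int):
    """Try each procedure in turn until the strip fits the buffer.

    If nothing fits, the result of the most aggressive procedure is returned.
    """
    for name, procedure in PROCEDURES:
        packed = procedure(strip)
        if len(packed) <= buffer_size:
            break
    return name, packed


strip = bytes((x * 7 + (x * x) % 13) % 256 for x in range(4096))
name, packed = fit_strip(strip, buffer_size=2048)
```

The sketch makes the criticized weakness visible: whichever procedure is selected is applied to the whole strip, so text pixels in the strip are degraded along with image pixels once a lossy stage is chosen.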
European Patent Publication No. 0597571 discloses a method in which the types of objects in a page are first extracted and the boundary of each object determined before rasterization. Appropriate compression procedures are selectively applied to each type of object. In this way, lossless compression procedures may optimally be applied to text or line art objects while lossy compression procedures may be applied to image objects. The method operates at the display list level, which is an intermediate form between the page description file and the rasterized page. Objects and their types are determined by parsing the high-level, implicitly object-defining commands of the PDL in the display list. This requires knowledge of the particular brand and version of PDL commands as well as how to reconstruct a certain object from these implicit manifestations. In any case, it appears that only the simplest boundaries, such as those of objects enclosed in rectangular blocks, are practically determinable from such deciphering at the display list level.
U.S. Pat. No. 5,982,937 discloses a hybrid lossless/lossy compression process whereby a page of raster data is analyzed to distinguish text or line art objects from image or photo objects. This is accomplished by a procedure that analyzes and recognizes structures in the raster data in the form of color patches. A patch is regarded as a spread of connected pixels of the same color. Once the patches are recognized, each is classified as a Type 1 or a Type 2 patch, depending on whether or not the patch can be efficiently compressed by the first type of compression procedure (typically Run Length Lossless Encoding). Each patch has a size measured by the number of pixels therein ("PatchPixelCount"). A Type 1 patch has a PatchPixelCount greater than or equal to a predetermined number, D1, and a Type 2 patch has a PatchPixelCount less than D1. In a preferred implementation, D1 is from 6 to 8. The first (lossless) compression procedure is then applied to Type 1 patches and the second compression procedure (typically JPEG lossy) is applied to Type 2 patches. Thus, appropriate compression procedures are applied to each type of data to optimally attain efficient compression while maintaining quality.
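The patch classification rule can be illustrated with a short sketch. It is not the procedure of the cited patent: it assumes that "connected" means 4-connectivity and finds patches with a plain breadth-first flood fill, neither of which is stated above. Only the threshold test against D1 is taken from the description. The function name is hypothetical.

```python
from collections import deque


def classify_patches(image, d1=8):
    """Label every pixel 1 or 2 according to the size of its same-color patch.

    Assumption: pixels are connected if they share an edge (4-connectivity).
    """
    height, width = len(image), len(image[0])
    types = [[0] * width for _ in range(height)]  # 0 means not yet visited
    for y in range(height):
        for x in range(width):
            if types[y][x] != 0:
                continue
            color = image[y][x]
            patch = [(y, x)]
            types[y][x] = -1  # mark as being collected
            queue = deque(patch)
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and types[ny][nx] == 0 and image[ny][nx] == color):
                        types[ny][nx] = -1
                        patch.append((ny, nx))
                        queue.append((ny, nx))
            # PatchPixelCount >= D1 -> Type 1 (lossless); otherwise Type 2 (lossy).
            kind = 1 if len(patch) >= d1 else 2
            for py, px in patch:
                types[py][px] = kind
    return types


page = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
labels = classify_patches(page, d1=8)
# The eight connected zeros form one Type 1 patch; every other pixel is Type 2.
```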
The references described herein and above are incorporated by reference for their teachings.
In accordance with the invention, there is provided a method and apparatus for compressing and decompressing electronic documents, with maximum intradocument independence, and maximum flexibility in optimization of compression modes.
In accordance with one aspect of the invention, there is provided a method of compressing a received document, comprising: receiving documents containing unknown combinations of plural data types, including combinations of scanned data, computer rendered data, compressed data and/or rendering tags; dividing the received image into strips of blocks; determining, from the image itself, which data types are present in each block; and compressing data of each data type present in each block with a compression method optimized for its data type. The described method further provides that scanned data may be further segmented into plural scanned data types, and each data type is compressed in said compressing data step with a compression method optimized for said scanned image data type. The described method may also provide that where a received data type is compressed data, the process may include the additional functions of determining a compression ratio thereof, and accepting the compressed data for use, or decompressing and recompressing the data, based on acceptability of said compression ratio determination. The described method may also provide that where some or all of a received data type is pre-determined, the process may use this information to select a compression method for this data type.
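The overall flow of dividing the image into strips of blocks, classifying each block, and compressing it accordingly can be sketched as follows. This is only an illustration under loudly stated assumptions, not the disclosed method: the block size of 8, the distinct-value heuristic used to classify a block, and the two stand-in coders (zlib for the lossless path, bit-dropping followed by zlib for the lossy path) are all hypothetical choices, and rendering tags and pre-compressed input are not handled.

```python
import zlib

BLOCK = 8  # assumed block size; a strip is one row of blocks


def classify_block(pixels, max_colors=4):
    # Hypothetical heuristic: few distinct values suggests computer rendered
    # content, many distinct values suggests scanned continuous tone content.
    return "rendered" if len(set(pixels)) <= max_colors else "scanned"


def compress_block(pixels, kind):
    if kind == "rendered":
        return zlib.compress(bytes(pixels))      # lossless path
    coarse = bytes(p & 0xF0 for p in pixels)     # stand-in for a lossy coder
    return zlib.compress(coarse)


def compress_image(image):
    """Return a list of strips, each a list of (kind, payload) per block."""
    height, width = len(image), len(image[0])
    strips = []
    for top in range(0, height, BLOCK):
        strip = []
        for left in range(0, width, BLOCK):
            pixels = [image[y][x]
                      for y in range(top, min(top + BLOCK, height))
                      for x in range(left, min(left + BLOCK, width))]
            kind = classify_block(pixels)
            # Each block is coded on its own, with no reference to neighbors.
            strip.append((kind, compress_block(pixels, kind)))
        strips.append(strip)
    return strips


# 8 rows by 16 columns: a flat white block next to a varying block.
image = [[255] * 8 + [((x + 8 * y) * 3) % 256 for x in range(8)] for y in range(8)]
strips = compress_image(image)
kinds = [kind for kind, _ in strips[0]]  # ['rendered', 'scanned']
```

Coding each block independently in this sketch is a deliberate choice that mirrors the stated goal of intradocument independence: any block can be decoded without touching the others.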
In accordance with another aspect of the invention, there is provided a method of compressing received documents including: receiving documents containing unknown combinations of plural data types, including combinations of scanned data, computer rendered data, compressed data and/or rendering tags; classifying each data type present in the received document; determining optimum compression of each data type present, which may include a non-compressing pass through of compressed data; and from said optimum compression determination, generating a decompression instruction stream, useful in decompression of the document, which includes decompression instructions and document data.
In accordance with still another aspect of the present invention, there is provided a data structure, for describing a compressed document including unknown combinations of plural data types, including combinations of scanned data, computer rendered data, compressed data and/or rendering tags, comprising: segregation of data in accordance with compression methods thereof; and segregation of data into independent block and strip document portions, whereby each block document portion and each strip document portion may be decompressed without reference to any other block and strip, respectively.