Binary image data obtained from scanning operations in multifunction devices is typically compressed using one of two schemes. The first is straightforward compression with a single binary compression algorithm, such as G3, G4, Deflate, or JBIG2. The second converts the binary data to a two-layer, black-mask mixed raster content (MRC) representation (as disclosed, for example, in U.S. Patent Publication 2004/0114195). In the latter scheme, the binary image is segmented into a foreground mask plane and a background image plane. The background plane holds the image portion of the input document; it is first converted into continuous-tone data and then compressed using JPEG. The foreground mask plane, which represents the text and line-art content of the page, is compressed using a binary compression algorithm such as G4, Deflate, or JBIG2.
The problem with the latter scheme is that it requires good segmentation between the text and image portions of the input document, which is especially difficult to achieve in the binary domain. Segmentation defects arise from the tradeoff between achieving a small file size and maintaining reasonable image quality in a cost-effective implementation.
U.S. Patent Publication 2004/0114195, for a “SYSTEM FOR SELECTING A COMPRESSION METHOD FOR IMAGE DATA,” by F. F. Ebner et al., published Jun. 17, 2004, hereby incorporated by reference in its entirety, teaches a method of analyzing an image data set, performing a morphological operation on the image data, and then deriving a metric used to estimate compression performance.
U.S. Patent Publication 2005/0244060, for “REFORMATTING BINARY IMAGE DATA TO GENERATE SMALLER COMPRESSED IMAGE DATA SIZE,” by Ramesh Nagarajan et al., published Nov. 3, 2005, also hereby incorporated by reference in its entirety, discloses systems and methods for reformatting binary image data into two or more planes to improve compression thereof.
The system and method disclosed herein include a lossless mixed raster content (MRC) generation scheme. Such a technique provides a smaller file size and good image quality using a simple, cost-effective implementation. It also does not require a complex image segmentation approach to achieve the desired file size improvement. Among binary compression schemes, it is known that G4 (CCITT) and JBIG2 compression are well suited to text, whereas G3 (CCITT) and Deflate compression perform better on image content. In general, each binary compression algorithm may be good for compressing either text or image content, but not both.
One aspect of the present disclosure is to segment the mixed-content binary image, or to predict its compression ratio, so as to divide the image into regions (e.g., region A, region B), and then to apply an appropriate compression scheme to each region to achieve better overall compression. In one embodiment, the image is divided into a text portion and an image portion. The advantage of such a technique is that typical segmentation defects will not exist when the compression schemes used for the regions of a page are lossless.
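The idea of choosing a lossless compressor per region can be illustrated with a minimal sketch. The division into fixed-height horizontal bands, the function names, and the `band_height` parameter are assumptions made for illustration and are not taken from the disclosure. Only Deflate (via `zlib`) corresponds to an algorithm named above; `bz2` and `lzma` merely stand in for G3, G4, and JBIG2, which the Python standard library does not provide.

```python
import bz2
import lzma
import zlib

# Candidate lossless codecs. Only Deflate (zlib) is named in the text; bz2 and
# lzma are stand-ins (assumption) for G3/G4/JBIG2.
CODECS = {
    "deflate": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def split_into_bands(packed_rows, band_height):
    """Group packed 1-bit rows (bytes objects) into horizontal bands."""
    return [packed_rows[i:i + band_height]
            for i in range(0, len(packed_rows), band_height)]


def compress_band(band):
    """Try every codec on one band and keep the smallest result."""
    raw = b"".join(band)
    best_name, best_payload = None, None
    for name, (encode, _decode) in CODECS.items():
        payload = encode(raw)
        if best_payload is None or len(payload) < len(best_payload):
            best_name, best_payload = name, payload
    return best_name, best_payload


def compress_page(packed_rows, band_height=64):
    """Return one (codec name, compressed payload) pair per band."""
    return [compress_band(band)
            for band in split_into_bands(packed_rows, band_height)]


def decompress_page(bands, row_bytes):
    """Invert compress_page; row_bytes is the packed width of one row."""
    rows = []
    for name, payload in bands:
        raw = CODECS[name][1](payload)
        rows.extend(raw[i:i + row_bytes] for i in range(0, len(raw), row_bytes))
    return rows
```

Because every candidate codec is lossless, a band assigned to a suboptimal codec costs only file size, never image quality, which is the property the paragraph above relies on. Trying every codec is the simplest way to obtain the per-region ratio; a cheaper predictor could replace it.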
Disclosed in embodiments herein is a method for compressing binary image data, comprising: segmenting binary image data into a first plane having text and a second plane having an image; and separately compressing the text in the first plane and the image in the second plane.
Also disclosed in embodiments herein is a system for compressing binary image data, comprising: a segmenter for receiving binary image data and dividing said binary image data into a first plane having text and a second plane having an image; and a compressor for separately compressing the text in the first plane and the image in the second plane to produce reduced-size representations of said first and second planes.
Further disclosed in embodiments herein is a multifunction apparatus for compressing binary image data, comprising: an image source of binary image data; memory for storing image data; a segmenter for retrieving binary image data from said memory and dividing said binary image data into a first plane having text and a second plane having an image; a compressor for separately compressing the text in the first plane and the image in the second plane to produce reduced-size representations of said first and second planes; and a controller for controlling the sequence of operation of at least said segmenter and compressor to produce a compressed image including reduced-size representations of said first and second planes.
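The segment-then-compress sequence recited above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: the row-level transition-count heuristic, the `max_text_transitions` threshold, and all function names are hypothetical, and `zlib` (Deflate) stands in for whichever lossless binary codec is selected for each plane. Pixels are stored one per byte for readability rather than packed.

```python
import zlib


def transitions(row):
    """Count 0/1 changes along a row; halftoned image rows tend to have many."""
    return sum(a != b for a, b in zip(row, row[1:]))


def segment(rows, max_text_transitions):
    """Hypothetical segmenter: route each row to the text or the image plane.

    A row appears in exactly one plane; the other plane holds a blank row.
    """
    blank = [0] * len(rows[0])
    text_plane, image_plane = [], []
    for row in rows:
        is_text = transitions(row) <= max_text_transitions
        text_plane.append(row if is_text else blank)
        image_plane.append(blank if is_text else row)
    return text_plane, image_plane


def pack(plane):
    """Flatten a plane to bytes, one byte per pixel (0 or 1)."""
    return bytes(bit for row in plane for bit in row)


def compress_planes(rows, max_text_transitions=8):
    """Controller step: segment first, then compress each plane separately."""
    text_plane, image_plane = segment(rows, max_text_transitions)
    return zlib.compress(pack(text_plane), 9), zlib.compress(pack(image_plane), 9)


def reconstruct(text_blob, image_blob, width):
    """Decompress both planes and OR them back into the original page."""
    text = zlib.decompress(text_blob)
    image = zlib.decompress(image_blob)
    merged = bytes(t | i for t, i in zip(text, image))
    return [list(merged[i:i + width]) for i in range(0, len(merged), width)]
```

Since each pixel lands in exactly one plane and both planes are compressed losslessly, the OR of the decompressed planes reproduces the input exactly, even if the heuristic misclassifies a row.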