Scanning and exporting color images to a network is becoming a standard feature of digital multifunction devices. File size is an important factor when exporting color images. In addition to different resolutions, different compression schemes are offered to reduce the file size of the color image to be exported. One of the popular compression/file formats currently offered is the Mixed or Multiple Raster Content (MRC) representation.
The MRC representation of documents is versatile. It provides the ability to represent color images and either color or monochrome text. The MRC representation enables the use of multiple layers or “planes” for the purpose of representing the content of documents. The MRC representation is becoming increasingly important in the marketplace. It has already been established as the main color-fax standard. It is also offered as a selection in the Scan-to-Export feature of, for example, digital multifunction devices.
FIG. 1 shows one exemplary embodiment of three-layer mixed raster content image data. As shown in FIG. 1, a document image 100 to be rendered using the mixed raster content format is generated using a background layer 110, a selector layer 120, and a foreground layer 130. A fourth, non-image data layer (not shown) may also be included in the mixed raster content image data file. The fourth layer often contains rendering hints, which can be used by a rendering engine, such as Adobe® Acrobat®, as additional instructions on how particular pixels are to be rendered.
As shown in FIG. 1, the selector layer 120 is used to mask undifferentiated regions of color image data stored on the foreground layer 130 onto the background layer 110 to form the rendered image 100. In particular, the selector layer 120 contains high spatial frequency information for regions otherwise having slowly changing color information. In effect, regions whose color changes relatively slowly, if at all, are placed onto the foreground layer 130. In some MRC models, the foreground layer may also include color information regarding the pictorial regions. The shapes of those regions are then embedded into the selector layer 120. In contrast, regions having high color frequency, e.g., colors whose values change more significantly over very small spatial extents, are stored as continuous tone image data on the background layer 110. When the image represented by the data structure 100 is to be rendered or otherwise generated, the color information stored in the foreground layer 130 has spatial or shape attributes applied to it based on the binary information stored in the selector layer 120, and the resulting shaped color information is combined onto the background layer 110 to form the reconstructed image 100.
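The rendering step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that all three layers share one resolution and are held as nested Python lists; the function name `render_mrc` and the pixel values are hypothetical and are not taken from any MRC standard or product.

```python
def render_mrc(background, selector, foreground):
    """Composite a three-layer MRC image (illustrative sketch only).

    background, foreground: 2-D lists of (r, g, b) tuples.
    selector: 2-D list of 0/1 values; 1 selects the foreground color,
    0 keeps the background color at that pixel.
    """
    return [
        [fg if sel else bg for bg, sel, fg in zip(bg_row, sel_row, fg_row)]
        for bg_row, sel_row, fg_row in zip(background, selector, foreground)
    ]


WHITE, GRAY, DARK = (255, 255, 255), (128, 128, 128), (20, 20, 24)

background = [[WHITE, GRAY], [GRAY, WHITE]]   # continuous tone data
foreground = [[DARK, DARK], [DARK, DARK]]     # slowly varying color
selector = [[1, 0], [0, 1]]                   # binary shape information

rendered = render_mrc(background, selector, foreground)
```

Here the foreground supplies only a flat color, and the selector alone determines which pixels receive it, mirroring the division of labor between layers 130 and 120.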
An N-layer MRC model is also known, in which the foreground layer is separated into various independent sublayers. Each of the foreground sublayers is typically binarized and has a specific color.
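A minimal sketch of how such sublayers might be combined follows. It assumes each sublayer is a binary mask paired with a single color and that later sublayers paint over earlier ones; the function name `render_n_layer`, the data layout, and the painting order are illustrative assumptions, not part of any specification.

```python
def render_n_layer(background, sublayers):
    """Paint binary foreground sublayers onto a background (sketch only).

    background: 2-D list of (r, g, b) tuples.
    sublayers: list of (mask, color) pairs, where mask is a 2-D list
    of 0/1 values and color is the single (r, g, b) value of that sublayer.
    """
    out = [list(row) for row in background]  # copy so the input is untouched
    for mask, color in sublayers:
        for y, mask_row in enumerate(mask):
            for x, bit in enumerate(mask_row):
                if bit:
                    out[y][x] = color
    return out


WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (200, 0, 0)

page = [[WHITE] * 3 for _ in range(2)]
black_mask = [[1, 0, 0], [0, 0, 0]]
red_mask = [[0, 0, 1], [0, 1, 0]]

result = render_n_layer(page, [(black_mask, BLACK), (red_mask, RED)])
```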
Black text typically accounts for 90% of office documents. One problem with the conventional three-layer MRC and N-layer MRC models is that black text is not always pure black, since it is part of the foreground layer, which takes color information directly from the original images. Attempts have been made to push black text towards pure black in the foreground layer. However, this may cause poor image quality in dark pictorial regions of the image, since the foreground layer does not differentiate between black text and black non-text pixels.
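The side effect can be seen in a small sketch. The text above does not say how black text has been pushed towards pure black; the luminance threshold used here is only one naive possibility, and the function name `push_to_black`, the threshold value, and the sample pixel values are all hypothetical.

```python
def luminance(rgb):
    """Approximate perceived brightness using the common Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def push_to_black(foreground, threshold=40):
    """Snap every sufficiently dark foreground pixel to pure black.

    The threshold is an assumed value. Nothing in this function knows
    whether a pixel belongs to text or to a picture.
    """
    return [
        [(0, 0, 0) if luminance(p) < threshold else p for p in row]
        for row in foreground
    ]


text_pixel = (18, 16, 20)      # near-black text
shadow_pixel = (30, 22, 35)    # dark pictorial detail
brown_pixel = (90, 60, 40)     # mid-tone pictorial pixel

pushed = push_to_black([[text_pixel, shadow_pixel, brown_pixel]])
```

Both the text pixel and the shadow pixel become pure black, so the subtle tint in the dark pictorial region is lost, which is the quality problem described above.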