1. Field
The present disclosure generally relates to mixed raster content (MRC) images, and in particular to combining a three-layer MRC model with an N-layer MRC model.
2. Description of Related Art
Scanning and exporting color images to a network is becoming one of the standard features offered by digital multifunction devices. File size is an important factor when exporting color images. In addition to different resolutions, different compression schemes are offered to reduce the file size of the color image to be exported. One popular compression/file format currently offered is the Mixed or Multiple Raster Content (MRC) representation. The MRC representation provides a way to achieve high image quality with a small file size.
The MRC representation of documents is versatile. It provides the ability to represent color images and either color or monochrome text. The MRC representation enables the use of multiple “planes” for the purpose of representing the content of documents. The MRC representation is becoming increasingly important in the marketplace. It has already been established as the main color-fax standard. It is also offered as a selection in the Scan-to-Export feature, for example, in digital multifunction devices.
An image may generally include different types of text features, for example, text features on white background areas, text features on light colored (e.g., yellow colored) background areas (e.g., newspaper), light colored text features on dark colored background areas (e.g., white colored text on black background area), and text features on colored background areas (e.g., text-on-tint).
FIG. 1 shows one exemplary embodiment of three-layer mixed raster content image data. As shown in FIG. 1, a document image 100 to be rendered using the mixed raster content format is generated using a background layer 110, a selector layer 120, and a foreground layer 130. The foreground layer 130 and the background layer 110 are both multi-level, and the mask or selector layer 120 is bi-level. A fourth, non-image data layer (not shown) may also be included in the mixed raster content image data file. The fourth layer often contains rendering hints which can be used by a rendering engine, such as Adobe® Acrobat®, to provide additional information on how particular pixels are to be rendered.
As shown in FIG. 1, the selector layer 120 is used to mask undifferentiated regions of color image data stored on the foreground layer 130 onto the background layer 110 to form the rendered image 100. In particular, the selector layer 120 contains high spatial frequency information for regions otherwise having slowly changing color information. In effect, regions whose color changes relatively slowly, if at all, are placed onto the foreground layer 130. In some MRC models, the foreground layer may also include color information regarding the pictorial regions. The shapes of those regions are then embedded into the selector layer 120. In contrast, regions having high color frequency, e.g., colors whose values change more significantly over very small spatial extents, are stored as continuous tone image data on the background layer 110. When the image represented by the data structure 100 is to be rendered or otherwise generated, the color information stored in the foreground layer 130 has spatial or shape attributes applied to it based on the binary information stored in the selector layer 120, and the resulting shaped color information is combined onto the background layer 110 to form the reconstructed image 100.
In general, in the three-layer MRC model, the final image is obtained by using the mask or selector layer 120 to select pixels from the other two layers (i.e., the foreground layer 130 and the background layer 110). When the mask layer pixel value is 1, the corresponding pixel from the foreground layer 130 is selected, and when the mask layer pixel value is 0, the corresponding pixel from the background layer 110 is selected.
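By way of illustration only, the selection rule described above can be sketched in a few lines of Python. The function name compose_three_layer, the representation of layers as nested lists, and the use of RGB tuples for pixels are assumptions made for this sketch and are not taken from any particular MRC implementation.

```python
def compose_three_layer(background, selector, foreground):
    """Pick the foreground pixel where the selector is 1, else the background pixel.

    Hypothetical helper: each layer is a list of rows, and all three layers
    are assumed to have identical dimensions.
    """
    if not (len(background) == len(selector) == len(foreground)):
        raise ValueError("layers must have the same height")
    image = []
    for bg_row, sel_row, fg_row in zip(background, selector, foreground):
        if not (len(bg_row) == len(sel_row) == len(fg_row)):
            raise ValueError("layers must have the same width")
        image.append([fg if sel == 1 else bg
                      for bg, sel, fg in zip(bg_row, sel_row, fg_row)])
    return image


# Illustrative 2x2 page: white background, black foreground color.
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
background = [[WHITE, WHITE], [WHITE, WHITE]]
foreground = [[BLACK, BLACK], [BLACK, BLACK]]
selector = [[1, 0], [0, 1]]  # bi-level mask: 1 selects foreground

rendered = compose_three_layer(background, selector, foreground)
```

In this sketch the rendered page takes the black foreground color on the diagonal, where the selector is 1, and the white background elsewhere. The sketch ignores compression and any difference in resolution between the layers.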
Generally, the shape of text is defined in the selector layer 120 and the color information of the text is contained in the foreground layer 130. In some of the three-layer MRC models, pictorial information is shared between the foreground layer 130 and the background layer 110, with color information of the pictorial areas being in the foreground layer 130. Most of these three-layer MRC models have an overlap between the foreground layer 130 and the background layer 110 for pictorial areas. These algorithms are less complicated than the ones in which the background layer 110 holds the entire pictorial and background information, while the foreground layer 130 holds the text color information. Text in the three-layer MRC model is often JPEG-compressed and hence suffers from poor image quality due to color fringing caused by the JPEG compression. Black text is also not always pure black in the three-layer MRC model, since the foreground layer 130 takes colors from the original image.
An N-layer MRC model is also known, in which the foreground layer is separated into various independent sublayers based on color and spatial proximity of the pixels. Each of the foreground sublayers is generally binarized having a specific color. In the N-layer MRC model, more than one foreground sublayer may have the same color (e.g., mask 202 and mask 203 in FIG. 2 include red colored data). With reference to FIG. 2, an exploded view 210 of an N-layer MRC document 212 is illustrated, which comprises a number of layers, each of which in turn has a respective portion of the information to be graphically displayed in the document 212. The exploded view 210 shows seven mask layers 214, individually labeled as masks 201-207, each of which is overlaid on a background layer 216 and any preceding mask layers.
For example, the background layer 216 may comprise background image information (e.g., images, shading, etc.), and may be a contone JPEG image. Mask 201 may comprise data (e.g., binary G4 data, or some other suitable data type) printed or otherwise presented in black. Mask 202 may comprise red-colored data, Mask 203 may comprise additional red-colored data, Mask 204 may comprise pink-colored data, Mask 205 may comprise yellow-colored data, and Mask 206 may comprise brown-colored data. In this manner, colored masks are overlaid on each other to generate the MRC image of the document. Mask 207 may comprise status information related to the scanner and/or document. Status information may comprise, without being limited to, a scanner signature, scanner ID information, scan-to-file authentication information, metadata related to objects in the document image, etc. Object metadata may include, without being limited to, location, size, date, type, etc., of the objects in the document. For instance, object type may be contone, halftone, low-frequency, high-frequency, smooth, rough, graphics, color, neutral, or the like. The status information may be printed or otherwise imaged in the same color as the background color of the background layer 216, so that it is not visible to the human eye and does not cause undesirable artifacts when scanned. It will be appreciated that the MRC document is not limited to a background layer and seven mask layers, but rather that any suitable number of layers may be employed in conjunction with the various features presented herein.
It is contemplated that the N-layer MRC model discussed above is not limited to the color planes shown in FIG. 2, for example, the red color plane, pink color plane, brown color plane, black color plane, and yellow color plane, but may also include a plane or a mask layer that may record any color.
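By way of illustration only, the overlaying of single-color binary masks on a background layer and on any preceding mask layers can be sketched as follows. The function name compose_n_layer, the pairing of each mask bitmap with one color, and the list-based pixel representation are assumptions made for this sketch; an actual N-layer file would also carry per-layer compression and positioning data, which are omitted here.

```python
def compose_n_layer(background, masks):
    """Overlay (bitmap, color) masks on the background in list order.

    Hypothetical helper: each bitmap is a list of rows of 0/1 values with the
    same dimensions as the background. Later masks are painted over earlier
    ones, mirroring masks overlaid on the background and preceding masks.
    """
    image = [list(row) for row in background]  # copy so the input is untouched
    for bitmap, color in masks:
        for y, row in enumerate(bitmap):
            for x, bit in enumerate(row):
                if bit == 1:
                    image[y][x] = color
    return image


# Illustrative 2x3 page with one black mask and two separate red masks,
# reflecting that more than one mask may carry the same color.
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
RED = (255, 0, 0)
background = [[WHITE] * 3 for _ in range(2)]
masks = [
    ([[1, 0, 0], [0, 0, 0]], BLACK),
    ([[0, 1, 0], [0, 0, 0]], RED),
    ([[0, 0, 0], [0, 0, 1]], RED),
]

page = compose_n_layer(background, masks)
```

Pixels not covered by any mask keep the background value, which is why text that is never extracted into a mask remains in the contone background layer.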
As noted above, in the three-layer MRC model, the color of text is extracted in the foreground layer and is compressed using standard contone compression schemes (such as JPEG). The algorithms used in the three-layer MRC model to extract the color of text are relatively simple, but compression of the foreground plane using any contone compression scheme may lead to image quality defects (i.e., non-uniformity and other artifacts) in text areas. For example, when the image quality setting is low, the quality of text may deteriorate in exchange for a smaller file size.
In the N-layer MRC model, the text region is extracted from the original image. In most cases, the N-layer MRC models fail to extract text-on-tint information, and as a result the pictorial region and, in most cases, the text-on-tint information are placed in the background layer and are compressed using standard contone compression schemes (such as JPEG). The color of text is recorded separately in the binary layers and is compressed using any of the binary compression schemes (such as G4). In some cases, these binary compression schemes (such as G4) are lossless. The N-layer model, however, generally involves more complex algorithms requiring many computations to separate image and text, and does not handle text-on-tint information well.
Generally, the N-layer MRC models (i.e., including the binary foreground layers) do not extract all of the text features from an image. For example, while some of the currently existing N-layer MRC models are configured to extract only the text features on white background areas, other N-layer MRC models are configured to extract text features from both white background areas and the light colored (e.g., yellow colored) background areas (e.g., newspaper). These N-layer MRC models generally fail to extract text on the darker background areas as well as text on the shadow areas of the image. In other words, as noted above, the N-layer MRC models do not handle text-on-tint information well. Moreover, some of these N-layer MRC models also fail to extract large colored text even in white background areas.
The proposed method retains the inherent advantage of the three-layer MRC model in text-on-tint areas, and that of the N-layer MRC model in text-in-white areas or text-in-background areas. Thus, the proposed method produces better quality output than the original three-layer MRC model and the original N-layer MRC model, for example, in text areas.