The present invention relates to the art of image processing. It finds particular application in conjunction with digital image processing for file compression, and will be described with particular reference thereto. However, it is to be appreciated that the present invention is also amenable to other like applications.
In a modern office environment, it is common to have many documents digitally scanned, electronically created, stored, transmitted, printed and/or displayed. Typically, it is preferred that these operations be performed rapidly. Nevertheless, user expectations of quality are still often high. Digital implementation of a rapid high quality image path can be particularly formidable considering that a single page of a color document scanned at 600 spots per inch (spi) may be approximately 100 Megabytes in size. Consequently, practical systems for processing color or other sizable documents demand document compression methods that achieve high compression ratios with low distortion.
“Document” images generally differ from “natural” images because they tend to contain well defined regions with distinct characteristics, such as text, graphics, continuous-tone pictures, halftone pictures and background. For example, it is typically desired that text have a high spatial resolution for legibility, while high color resolution is often not required. Conversely, continuous-tone pictures benefit from high color resolution, but can tolerate relatively lower spatial resolution. Therefore, it is desirable that a document compression algorithm be adaptive in order to meet different goals and exploit different types of redundancy among different image classes or types. However, traditional compression algorithms, such as JPEG, are based on the assumption that an input image is spatially homogeneous, so they tend to perform poorly on document images.
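By way of illustration only, the approximate 100 Megabyte figure noted above can be checked with simple arithmetic. The following sketch assumes a page of 8.5 by 11 inches and 3 bytes (24 bits) of color per pixel; neither value is stated above, and the function name is hypothetical.

```python
def uncompressed_page_bytes(width_in=8.5, height_in=11.0, spi=600, bytes_per_pixel=3):
    """Raw size in bytes of one scanned page.

    Assumed, not taken from the text: an 8.5 x 11 inch page and
    3 bytes (24 bits) of color per pixel.
    """
    width_px = round(width_in * spi)    # 5100 pixels across at 600 spi
    height_px = round(height_in * spi)  # 6600 pixels down at 600 spi
    return width_px * height_px * bytes_per_pixel


size = uncompressed_page_bytes()
print(size)  # 100980000 bytes, i.e. roughly 100 Megabytes
```

Under these assumptions, the raw page occupies 100,980,000 bytes, which is consistent with the approximate figure given.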
A commonly used format in which images are represented for document or image compression is the known 3-layer foreground/mask/background model for mixed raster content (MRC). Generally, the foreground layer contains the text and line graphics, and the background layer contains pictures and background. The mask is a binary image which determines, for each pixel in the digitized image, whether the foreground pixel information or the background pixel information should be used. To apply the 3-layer MRC model, a document image is first segmented into foreground and background layers, and an appropriate mask is generated.
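The role of the mask in the 3-layer MRC model described above may be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes that the three layers have equal dimensions, that images are held as nested lists of pixel values, and that a mask value of 1 selects the foreground. The mask polarity and the function name are assumptions made for the example.

```python
def mrc_compose(foreground, background, mask):
    """Rebuild an image from the three MRC layers.

    Assumed convention: mask value 1 selects the foreground pixel and
    mask value 0 selects the background pixel. All three layers are
    equal-sized nested lists (rows of pixel values).
    """
    return [
        [fg if m else bg for fg, bg, m in zip(fg_row, bg_row, m_row)]
        for fg_row, bg_row, m_row in zip(foreground, background, mask)
    ]


# Tiny 2 x 3 grayscale example: dark "text" over a light background.
foreground = [[0, 0, 0], [0, 0, 0]]
background = [[255, 250, 245], [240, 235, 230]]
mask = [[1, 0, 0], [0, 1, 1]]
image = mrc_compose(foreground, background, mask)
```

Because each output pixel is drawn from exactly one of the two layers, the foreground and background layers can each be compressed with a method suited to their content, while the binary mask carries the fine spatial detail.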
The subsequent performance of a document or image compression system or algorithm is directly related to the segmentation. With respect to document or image compression, an advantageous segmentation not only lowers the bit rate of the compressed image (i.e., the number of bits used to represent the compressed image per pixel in the uncompressed image), but also lowers the distortion in the reconstructed image. On the other hand, damaging artifacts are often caused by misclassifications in the segmentation. Generally, however, as the bit rate is lowered the distortion tends to increase, and as the distortion is lowered the bit rate tends to increase. This is known as the rate-distortion compromise. The optimal rate-distortion compromise is often a matter of individual preference or a function of particular constraints imposed by specific applications.
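For illustration, the two quantities involved in the rate-distortion compromise may be computed as in the following sketch. The bit rate follows the definition given above. The use of mean squared error as the distortion measure, and the weighted sum with a parameter `lam` expressing a preferred compromise, are assumptions chosen for the example and are not prescribed by the foregoing description; all function names are hypothetical.

```python
def bit_rate(compressed_bytes, width_px, height_px):
    """Bits of compressed data per pixel of the uncompressed image."""
    return 8 * compressed_bytes / (width_px * height_px)


def mean_squared_error(original, reconstructed):
    """Distortion as mean squared error over flat pixel sequences.

    MSE is an assumed measure; other distortion measures could be used.
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


def rd_cost(rate, distortion, lam):
    """Single figure of merit: distortion plus lam times rate.

    A larger lam (an assumed, user-chosen weight) favors a lower bit
    rate; a smaller lam favors lower distortion.
    """
    return distortion + lam * rate
```

With such a figure of merit, two candidate segmentations of the same page could be compared by compressing each, measuring the resulting rate and distortion, and preferring the candidate with the lower cost for the chosen weight.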
In any event, previously developed segmentation algorithms or systems, employing so-called direct segmentation methods, typically compute or determine segmentation using only the input image or pixel data. They consider neither the properties of the subsequent compression technique applied nor the rate-distortion compromise desired by a user. That is to say, segmentation is not regulated by the ultimate outcome of the compression achieved. Rather, the input image or pixel data is classified for segmentation solely based upon a predetermined set of guidelines which determine classification from the data itself. For example, if based on the predetermined guidelines a region of a document is determined to contain text, then segmentation into foreground and background layers and generation of a mask layer for a 3-layer MRC model would be carried out accordingly, regardless of the ultimate effect that segmentation may have on the subsequent compression.
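A direct segmentation method of the kind described above may be sketched, purely by way of example, as a fixed rule applied to the pixel data alone. The threshold rule below is a hypothetical stand-in for the predetermined guidelines and does not represent any particular prior algorithm; the threshold value of 128 and the convention that a mask value of 1 marks foreground are likewise assumptions.

```python
def direct_segment(gray, threshold=128):
    """Generate a binary mask from pixel values only.

    Hypothetical guideline: a grayscale pixel darker than the threshold
    is classified as foreground (mask 1), otherwise as background
    (mask 0). Nothing about the later compression step is consulted.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


page = [[0, 200], [127, 128]]
mask = direct_segment(page)
```

The mask produced here depends only on the input pixels and the fixed threshold, so it is the same whatever compression technique follows and whatever rate-distortion compromise a user may prefer.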
Accordingly, the present invention contemplates a new and improved technique for document or image segmentation and compression which overcomes the above-referenced problems and others.