The present invention relates to layered decomposition of images and, more particularly, to the creation of a mask for layers of the digitally decomposed image.
The large size of digital data files required to represent images makes data compression an imperative when storing or transmitting images. On the other hand, compression can be problematic because many images comprise a combination of text, line-art graphics elements, and photographic elements, and compression processes are commonly designed to be more effective with one type of image element than another. For example, the JPEG (Joint Photographic Experts Group) standard (ISO 10918) is designed to effectively compress the complex multi-color matrix of photographic elements. When it is applied to other content, however, annoying artifacts can appear in decompressed images, especially in the vicinity of sharp transitions, which are common characteristics of graphical and textual elements. Conversely, the compression process of the JBIG (Joint Bi-level Image Experts Group) standard (ISO/IEC 11544:1993) utilizes arithmetic encoding and is particularly effective in compressing text and graphics but less effective in compressing natural photographic elements.
One method for improving the efficiency and results of image compression decomposes compound images into layers containing a type or types of elements that are effectively compressed using a single process. The data of each layer is then compressed with a process that is particularly effective with the type of data contained in the layer. The DRAFT ITU-T RECOMMENDATION T.44 "MIXED RASTER CONTENT (MRC)," International Telecommunication Union (ITU), Telecommunication Standardization Sector, October 1997, incorporated herein by reference, specifies the technical features of an imaging format based on segmentation of images or pages into multiple layers (planes) according to the type of image element and the application of encoding, spatial and color resolution processing specific to the type of image element comprising the layer. The ITU recommendation models a page or image as three layers: a background layer containing contone color (continuous tone and palettized color) elements; a foreground layer containing text and line-art graphics; and a bi-level mask layer interposed between the background and foreground layers. The mask is used to select the layer (background or foreground) from which a pixel will be rendered in the recomposed image. The pixels of the mask layer act as a bi-level switch to select a spatially corresponding pixel in the layer immediately above or below the mask layer. For example, if an exemplary mask layer pixel has a value of "1," a spatially corresponding pixel might be selected from the background layer for rendering in the final image. However, if the mask layer pixel has a value of "0," the corresponding pixel would be selected from the foreground layer. While the ITU recommendation provides for processing, interchange, and archiving images in multiple layers, it does not provide a method of generating a mask layer to facilitate layered decomposition of an image.
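The switching role of the mask layer described above can be illustrated with a minimal sketch. It assumes the exemplary polarity given in the text (a mask value of 1 selects the background pixel, 0 selects the foreground pixel) and represents each layer as a list of rows of equal size; the function name `recompose` and the sample pixel values are hypothetical and are not taken from the ITU recommendation.

```python
def recompose(background, foreground, mask):
    """Render each pixel from the background where the mask is 1,
    otherwise from the foreground (polarity assumed from the example)."""
    return [
        [bg if m == 1 else fg for bg, fg, m in zip(bg_row, fg_row, m_row)]
        for bg_row, fg_row, m_row in zip(background, foreground, mask)
    ]


# Hypothetical 2x2 layers: a uniform background and a uniform foreground.
background = [[10, 10], [10, 10]]
foreground = [[99, 99], [99, 99]]
mask = [[1, 0], [0, 1]]

result = recompose(background, foreground, mask)
```

Because the mask is bi-level, recomposition requires only one comparison per pixel; the cost of the scheme therefore lies in generating the mask rather than in applying it.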
L. Bottou et al. describe a mask generation method in HIGH QUALITY DOCUMENT COMPRESSION WITH "DjVu", JOURNAL OF ELECTRONIC IMAGING, Vol. 7, pp 410-425, 1998. An image is partitioned into square blocks of pixels of different sizes. Two dominant colors are identified for the pixels of each block. Cluster initialization is inherited from the previous, lower resolution (larger) block size. The pixels of each block are sorted into clusters according to the closeness of their individual colors to one of the dominant colors of the block. An iterative k-means algorithm is used to sort the pixels for clustering. The iterative nature of the process increases the computational resources and the processing time required for mask creation.
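The iterative character of such block clustering can be seen in a generic two-cluster k-means sketch. This is not the implementation of Bottou et al.: it assumes grayscale values rather than colors, initializes the two centers from the minimum and maximum values of the block instead of inheriting them from a larger block, and caps the number of passes with a hypothetical `max_iters` parameter.

```python
def two_means(values, max_iters=20):
    """Sort the pixel values of one block into two clusters by repeated
    assignment and center update (generic k-means with k = 2)."""
    lo, hi = float(min(values)), float(max(values))  # assumed initialization
    labels = []
    for _ in range(max_iters):
        # Assignment step: each pixel joins the nearer of the two centers.
        new_labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each center to the mean of its cluster.
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels, (lo, hi)


# Hypothetical block holding three dark and three light pixels.
labels, centers = two_means([12, 15, 10, 200, 210, 205])
```

Every pass revisits every pixel of the block, and the number of passes depends on the data, which is the source of the resource and time cost noted above.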
D. Huttenlocher et al. describe a decomposition process in DIGIPAPER: A VERSATILE COLOR DOCUMENT IMAGE REPRESENTATION, Proceedings of the IEEE, International Conference on Image Processing, Kobe, Japan, October 24-25, 1999. The process utilizes token compression where a binary image is represented using a dictionary of token shapes and position information indicating where the token is to be drawn in the image. Segmentation of the image relies on attributes of text including the token representation of text as objects. As a result, the method is more effective with text than graphics.
What is desired, therefore, is a method of layered image decomposition that is resource and time conservative and equally effective when decomposing a page or image into its text, graphical, and photographic elements.
The present invention overcomes the aforementioned drawbacks of the prior art by providing a method of generating a mask for a layered image decomposition comprising the steps of partitioning the image as a plurality of first sub-images and as a plurality of second sub-images dimensionally differing from the first sub-images, both the first and the second sub-images comprising pluralities of pixels; assigning a first sub-image mask value to an evaluation pixel according to a relationship of the luminance of the evaluation pixel and a sub-image luminance of the first sub-image; assigning a second sub-image mask value to the evaluation pixel according to a relationship of the luminance of the evaluation pixel and a sub-image luminance of the second sub-image; and setting a mask value for the evaluation pixel as a function of the first and second sub-image mask values for a plurality of pixels of the first and the second sub-images. The method is non-iterative and conserves computational resources and time, which is important for on-line operations. Further, the method is equally effective for text, line-art graphic, and photographic image elements.
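The steps recited above can be sketched in simplified form. The summary does not fix the particular relationships or the combining function, so the sketch makes several labeled assumptions: the sub-image luminance is taken to be the mean luminance of the block, a pixel receives a sub-image mask value of 1 when its luminance is at least that mean, the two block sizes are 2 and 4 pixels, and the final mask value is the logical AND of the two sub-image mask values at the same pixel. The last assumption in particular is a placeholder that ignores the values of neighboring pixels. The names `block_mask` and `combined_mask` are hypothetical.

```python
def block_mask(lum, block):
    """Assign each pixel 1 if its luminance is at least the mean luminance
    of the block-by-block sub-image containing it, else 0 (assumed rule)."""
    h, w = len(lum), len(lum[0])
    mask = [[0] * w for _ in range(h)]
    for top in range(0, h, block):
        for left in range(0, w, block):
            rows = range(top, min(top + block, h))
            cols = range(left, min(left + block, w))
            pixels = [lum[r][c] for r in rows for c in cols]
            mean = sum(pixels) / len(pixels)  # assumed sub-image luminance
            for r in rows:
                for c in cols:
                    mask[r][c] = 1 if lum[r][c] >= mean else 0
    return mask


def combined_mask(lum, small=2, large=4):
    """Combine the masks from two dimensionally different partitions.
    The per-pixel AND is a placeholder for the combining function."""
    first = block_mask(lum, small)
    second = block_mask(lum, large)
    return [[a & b for a, b in zip(ra, rb)] for ra, rb in zip(first, second)]


# Hypothetical 4x4 luminance image: light field with two dark pixels.
lum = [
    [200, 200, 50, 200],
    [200, 200, 200, 200],
    [200, 50, 200, 200],
    [200, 200, 200, 200],
]
final_mask = combined_mask(lum)
```

Each pixel is visited a fixed number of times per partition, so the sketch is non-iterative in the sense described above: its running time does not depend on a convergence test.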