The present invention relates to layered decomposition of images.
The large size of the digital data files required to represent images makes data compression imperative when storing or transmitting images. However, compression can be problematic because many images comprise a combination of text, line-art graphics elements, and photographic elements, and compression processes are commonly designed to be more effective with one type of image element than another. For example, the JPEG (Joint Photographic Experts Group) standard (ISO/IEC 10918) is designed to effectively compress the complex multi-color matrix of photographic elements. When it is applied to other content, annoying artifacts can appear in the decompressed images, especially in the vicinity of sharp transitions, which are common characteristics of graphical and textual elements. Conversely, the compression process of the JBIG (Joint Bi-level Image Experts Group) standard (ISO/IEC 11544:1993) utilizes arithmetic encoding and is particularly effective in compressing text and graphics but less effective in compressing natural photographic elements.
One method for improving the efficiency and results of image compression decomposes compound images into layers, each containing a type or types of elements that are effectively compressed using a single process. The data of each layer is then compressed with a process that is particularly effective with the type of data contained in the layer. The DRAFT ITU-T RECOMMENDATION T.44 “MIXED RASTER CONTENT (MRC),” International Telecommunication Union (ITU), Telecommunication Standardization Sector, October 1997, incorporated herein by reference, specifies the technical features of an imaging format based on segmentation of images or pages into multiple layers (planes) according to the type of image element and the application of encoding, spatial resolution, and color resolution processing specific to the type of image element comprising the layer. The ITU recommendation models a page or image as three layers: a background layer containing contone color (continuous tone and palettized color) elements; a foreground layer containing text and line-art graphics; and a bi-level mask layer defining a relationship between the background and foreground layers. The mask is used to select the layer (background or foreground) from which a pixel will be rendered in the recomposed image. The pixels of the mask layer act as a bi-level switch to select a spatially corresponding pixel in the foreground layer or background layer. For example, if an exemplary mask layer pixel has a value of “1,” a spatially corresponding pixel might be selected from the background layer for rendering in the final image. However, if the mask layer pixel has a value of “0,” the corresponding pixel would be selected from the foreground layer. While the ITU recommendation provides for processing, interchanging, and archiving images in multiple layers, it does not provide a method of generating a mask layer to facilitate layered decomposition of an image.
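The role of the mask as a bi-level switch can be illustrated with a minimal sketch. The function name `recompose` and the nested-list image representation are assumptions made for illustration only and are not part of the ITU recommendation. The sketch follows the exemplary convention described above, in which a mask value of “1” selects the background pixel and a value of “0” selects the foreground pixel.

```python
def recompose(mask, foreground, background):
    """Rebuild an image by choosing each pixel from one of two layers.

    Illustrative convention taken from the example in the text:
    mask value 1 selects the background pixel, 0 selects the foreground pixel.
    All three layers are assumed to have the same dimensions.
    """
    image = []
    for mask_row, fg_row, bg_row in zip(mask, foreground, background):
        image.append([
            bg if m == 1 else fg
            for m, fg, bg in zip(mask_row, fg_row, bg_row)
        ])
    return image


# Tiny 2x2 example with labeled pixels so the selection is easy to follow.
mask = [[1, 0],
        [0, 1]]
foreground = [["F00", "F01"],
              ["F10", "F11"]]
background = [["B00", "B01"],
              ["B10", "B11"]]

image = recompose(mask, foreground, background)
```

In this sketch the layers share a single resolution; a practical format may store each layer at a different spatial resolution, which would require resampling before selection.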
L. Bottou et al. describe a mask generation method in HIGH QUALITY DOCUMENT COMPRESSION WITH “DjVu”, JOURNAL OF ELECTRONIC IMAGING, Vol. 7, pp. 410–425, 1998. An image is partitioned into square blocks of pixels of different sizes. Two dominant colors are identified for the pixels of each block. Cluster initialization is inherited from the previous, lower resolution (larger) block size. The pixels of each block are sorted into clusters according to the closeness of their individual colors to one of the dominant colors of the block. An iterative k-means algorithm is used to sort the pixels into clusters. The iterative nature of the process increases the computational resources and the processing time required for mask creation.
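The cost of the iterative clustering can be appreciated from a generic two-cluster k-means sketch applied to the pixels of a single block. This is not the implementation of Bottou et al.; the function names `two_means` and `dist2`, the `init` parameter (standing in for the cluster initialization inherited from a larger block), and the `max_iter` limit are hypothetical names introduced here for illustration.

```python
def dist2(p, c):
    """Squared Euclidean distance between two color tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def two_means(pixels, init, max_iter=20):
    """Sort the pixels of one block into two color clusters.

    pixels: list of (r, g, b) tuples for the block.
    init:   two starting colors (hypothetically inherited from a larger block).
    Returns (labels, centers); labels[i] is 0 or 1 for pixels[i].
    """
    centers = [tuple(float(ch) for ch in c) for c in init]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every pixel to its nearest center.
        new_labels = [
            0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            for p in pixels
        ]
        if new_labels == labels:
            break  # assignments are stable, so the clustering has converged
        labels = new_labels
        # Update step: move each center to the mean color of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers


# A block with two light pixels and two dark pixels.
block = [(250, 250, 250), (245, 245, 245), (10, 10, 10), (20, 20, 20)]
labels, centers = two_means(block, init=[(255, 255, 255), (0, 0, 0)])
```

Every pass of the loop visits every pixel of the block, and the number of passes is not known in advance, which is the source of the processing cost noted above.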
D. Huttenlocher et al. describe a decomposition process in DIGIPAPER: A VERSATILE COLOR DOCUMENT IMAGE REPRESENTATION, Proceedings of the IEEE, International Conference on Image Processing, Kobe, Japan, Oct. 24–25, 1999. The process utilizes token compression where a binary image is represented using a dictionary of token shapes and position information indicating where the token is to be drawn in the image. Segmentation of the image relies on attributes of text including the token representation of text as objects. As a result, the method is more effective with text than graphics.
What is desired, therefore, is a method of layered image decomposition that conserves computational resources and processing time.