The present invention is directed to the art of digital image processing and, more particularly, to a method and apparatus for generating a context based adaptive weighting vector for use in single color segmentation of an RGB color image and will be described with particular reference thereto. However, it is to be understood that the present invention has broader application in many fields such as in generating a context based adaptive weighting vector for use in single channel segmentation of a wide variety of digital images and other digital information or data.
Segmentation plays an important role in the art of electronic image processing. Generally, segmentation is used to group pixels into regions to determine the composition of the image. Oftentimes, the various regions are used to separate the objects from the background in the image.
In both academic and industrial settings, a wide range of segmentation algorithms and techniques have been proposed for generating a single segmentation channel from multiple channels derived from a digital image source. As an example, some simple algorithms use the chrominance information of the image for segmentation. Other techniques employ much more complicated methods and are, accordingly, costly to implement.
In many instances, however, a multi-channel digital image input signal is converted to a single segmentation channel using a simple fixed weighting algorithm or a fixed weight projection vector. The use of projection is one of the most common methods for creating a single channel from multiple channels. In projection, an inner product is calculated between the input video and a single predetermined direction. The single predetermined direction is essentially defined by the projection vector.
FIG. 1 is a diagrammatical illustration showing a prior art example of a segmentation system 10 for converting multiple digital image input channels 14, 16, and 18 to a single segmentation channel 22 using a fixed weighting or projection vector 20. As shown there, the digital input image is by way of example an RGB color image 12 including a red channel video signal 14, a green channel video signal 16, and a blue channel video signal 18. Each of the video signals, of course, is comprised of a plurality of image pixels that store values representative of an intensity or "amount" of red, green, and blue color in the color image 12.
As noted above, in segmentation by projection, an inner product is determined between the input video channels and a single predetermined direction or projection vector. In the example shown in FIG. 1, the composite video value Vin of each pixel in the RGB color image 12 can be represented by Vin=[Rin Gin Bin]′, where Rin, Gin, and Bin each represent a two-dimensional array of pixel values forming the digital image at the red, green, and blue image input channels 14, 16, and 18, respectively. The video value of each pixel of the segmentation channel 22 is determined from Sv=W′*Vin, where W is a weighting vector W=[W1 W2 W3]′. In order to ensure that the output is limited to eight bits when the input vectors are eight bit representations, the weighting vector is usually normalized so that ΣiWi=1.
To give a concrete example of the above algorithm used in the exemplary prior art segmentation system 10 shown in FIG. 1, the weighting vector W can be the transformation from RGB to Y space and take on the value of W=[0.253 0.684 0.063]′. Alternatively, the weighting vector W can be selected to be the simple projection vector W=[0 1 0]′. In the latter example, only a single channel (the green channel for an RGB image) of the three channel input image signal is projected by the inner product to form the segmentation channel 22.
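The fixed weight projection described above can be sketched in a few lines of Python. This is only an illustration of the inner product Sv=W′*Vin for one pixel; the function name `project_pixel` and the sample pixel value are hypothetical and do not come from the prior art system itself.

```python
def project_pixel(rgb, weights):
    """Inner product Sv = W' * Vin for a single pixel."""
    return sum(w * v for w, v in zip(weights, rgb))

# The two fixed weighting vectors quoted in the text.
W_LUMA = (0.253, 0.684, 0.063)   # RGB to Y; the weights sum to 1
W_GREEN = (0.0, 1.0, 0.0)        # simple projection onto the green channel

pixel = (200, 100, 50)           # hypothetical 8-bit RGB pixel

sv_luma = project_pixel(pixel, W_LUMA)     # about 122.15
sv_green = project_pixel(pixel, W_GREEN)   # the green value, 100.0

# Because the weights sum to 1, a full-scale white pixel stays within 8 bits.
sv_white = project_pixel((255, 255, 255), W_LUMA)
```

Because each weighting vector sums to one, the projected value of any eight bit pixel cannot exceed 255, which is the purpose of the normalization noted above.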
The above fixed weight method of projection works well on average with many documents because the fixed projection vector weights are carefully selected from a large representative digital image experience base. However, the segmentation by projection technique is susceptible to a major failure mode because variations in the image that are orthogonal to a chosen direction cannot be detected. As an example, if the original image is comprised of green halftones formed by alternating white and green areas arranged on a page, and if the segmentation channel is chosen as a projection of the image onto green, the result will show an absence of variation in the segmentation channel. In the projection, both white and green have the same green value. In that sense, the projection vector W=[0 1 0]′ points in a direction that contains no change, i.e., the green channel. This is a major shortcoming because in segmentation, the goal is to find change in the digital image input signal. Generally, the most accurate segmentation is derived from directions in the input signal having the most activity.
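The failure mode can be reproduced numerically. The sketch below assumes idealized 8-bit values for paper white and saturated green, and the helper name `project_pixel` is hypothetical; it shows that a green halftone projected onto green yields a flat segmentation channel, while the same scanline projected onto red shows the full swing.

```python
def project_pixel(rgb, weights):
    """Inner product Sv = W' * Vin for a single pixel."""
    return sum(w * v for w, v in zip(weights, rgb))

WHITE = (255, 255, 255)   # assumed idealized paper white
GREEN = (0, 255, 0)       # assumed idealized saturated green

# One scanline of a green halftone: alternating white and green areas.
halftone_row = [WHITE, GREEN] * 4

W_GREEN = (0, 1, 0)
W_RED = (1, 0, 0)

green_proj = [project_pixel(p, W_GREEN) for p in halftone_row]
red_proj = [project_pixel(p, W_RED) for p in halftone_row]

# Variation seen by the segmentation stage along each direction.
green_variation = max(green_proj) - min(green_proj)   # no change at all
red_variation = max(red_proj) - min(red_proj)         # full 8-bit swing
```

All of the halftone activity in this scanline lies in the red and blue channels, which are orthogonal to the chosen green direction and are therefore invisible to the fixed projection.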
Alternatives to the above approach have been suggested including the use of modified sets of fixed values in the weighting vector. However, the above problem remains. As an example, if the weighting vector is selected as W=[⅓ ⅓ ⅓]′, then the variations in single color halftones are only ⅓ that of grey halftones. This wide range of variations makes halftone/color text detection difficult with only a single set of fixed segmentation parameters.
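The one-third figure can be checked with exact arithmetic. The sketch below rests on an assumption not stated in the text: the single color halftone is modeled as an ink that differs from paper white in exactly one channel (yellow on white is used here), and the grey halftone as black on white. The helper name `project_pixel` is hypothetical.

```python
from fractions import Fraction

def project_pixel(rgb, weights):
    """Inner product Sv = W' * Vin for a single pixel."""
    return sum(w * v for w, v in zip(weights, rgb))

W_EQUAL = (Fraction(1, 3),) * 3   # equal weighting W = [1/3 1/3 1/3]'

WHITE = (255, 255, 255)
BLACK = (0, 0, 0)
YELLOW = (255, 255, 0)   # assumed ink that differs from white in one channel only

# Swing of the segmentation channel between paper and ink.
grey_swing = project_pixel(WHITE, W_EQUAL) - project_pixel(BLACK, W_EQUAL)
color_swing = project_pixel(WHITE, W_EQUAL) - project_pixel(YELLOW, W_EQUAL)

ratio = color_swing / grey_swing   # Fraction(1, 3)
```

Under this model the grey halftone swings by 255 levels while the single color halftone swings by only 85, so thresholds tuned for one kind of halftone are poorly matched to the other.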
It would therefore be desirable to provide a system that is an improvement over fixed weighting type segmentation schemes used in the prior art.
It would further be desirable to provide a method and apparatus that project multiple input image channels into a single segmentation channel using a weighting vector having dynamic adjustable weighting parameters.
It would further be desirable to provide a method and apparatus for single channel color image segmentation using local context based adaptive weighting. More particularly, the varying weightings of the projection vector are preferably determined as a function of local context input image activity. In that way, the projection vector will always point in a direction in the input image having the greatest level of activity or change. This has the advantage of providing larger signal variation in all types of color images thus increasing correct detection of halftones and text and improving the overall performance of segmentation.
In accordance with the invention, there is provided a method and apparatus for single channel color image segmentation using local context based adaptive weighting. An adaptive weighting vector is generated and applied to each pixel of a multi-channel color input image to generate a single segmentation channel from the plurality of color separation channels forming the input image. The adaptive weighting vector includes dynamically adjustable weighting parameters that vary as a function of local context activity in the input image and, therefore, always points to a direction in the input image having the greatest level of activity or change. This has the advantage of providing larger signal variation in all types of color images thus enhancing correct detection of halftones and text while improving the overall performance of segmentation.
The subject segmentation system obtains activity estimate representations of a measure of local channel signal variation at each color channel of the input image. For each pixel (i,k) of the image, the activity estimate representations of each color channel are compared relative to each other to identify a one of the multiple channels as having the greatest activity. A set of binary maps is generated, one for each of the channels in the input image, for storing a first binary value for pixel locations where the greatest activity in the input image is found and for storing a second binary value for those pixel locations where the input channels did not have the greatest activity. The binary maps are filtered and stored in a corresponding set of filtered channel binary maps. An adaptive weighting vector is generated by combining the plurality of filtered binary maps according to a predetermined algorithm so that the weighting vector changes rapidly for projection of the input image into a single channel without loss of information.
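The sequence of steps summarized above can be sketched as follows. This is a minimal illustration under several assumptions that the summary does not fix: the activity estimate is taken as the local maximum minus the local minimum in a 3x3 window, the filter is a 3x3 mean, ties between channels go to the first channel, and the "predetermined algorithm" for combining the filtered maps is a simple normalization so that the weights sum to one. All function names are hypothetical.

```python
def window(img, i, k, r=1):
    """Values in the (2r+1)x(2r+1) neighborhood of (i, k), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [img[a][b]
            for a in range(max(0, i - r), min(rows, i + r + 1))
            for b in range(max(0, k - r), min(cols, k + r + 1))]

def activity(channel):
    """Assumed activity estimate: local maximum minus local minimum."""
    return [[max(window(channel, i, k)) - min(window(channel, i, k))
             for k in range(len(channel[0]))] for i in range(len(channel))]

def binary_maps(activities):
    """One map per channel: 1 where that channel has the greatest activity, else 0."""
    rows, cols = len(activities[0]), len(activities[0][0])
    maps = [[[0] * cols for _ in range(rows)] for _ in activities]
    for i in range(rows):
        for k in range(cols):
            values = [act[i][k] for act in activities]
            maps[values.index(max(values))][i][k] = 1   # ties go to the first channel
    return maps

def box_filter(bmap):
    """Assumed low pass filter: mean over the 3x3 neighborhood."""
    return [[sum(window(bmap, i, k)) / len(window(bmap, i, k))
             for k in range(len(bmap[0]))] for i in range(len(bmap))]

def adaptive_weights(filtered, i, k):
    """Assumed combination rule: normalize the filtered maps to sum to 1 at (i, k)."""
    values = [f[i][k] for f in filtered]
    total = sum(values)
    return [v / total for v in values]

def segment(red, green, blue):
    """Project an RGB image onto one channel using the per-pixel vector W(i,k)."""
    channels = (red, green, blue)
    filtered = [box_filter(m) for m in binary_maps([activity(c) for c in channels])]
    out = []
    for i in range(len(red)):
        row = []
        for k in range(len(red[0])):
            w = adaptive_weights(filtered, i, k)
            row.append(sum(wc * c[i][k] for wc, c in zip(w, channels)))
        out.append(row)
    return out

# Green halftone on white: red and blue alternate, green is constant.
red = [[255, 0, 255, 0] for _ in range(4)]
green = [[255] * 4 for _ in range(4)]
blue = [row[:] for row in red]
seg = segment(red, green, blue)
```

For this green halftone the sketch selects the red channel at every pixel, so the segmentation channel alternates between 255 and 0, whereas the fixed projection onto green would be flat. Exactly one binary map is set at each pixel, so the filtered maps already sum to one and the normalization only guards against rounding.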
It is a primary object of the invention to provide a system for generating an adaptive weighting vector W(i,k) by combining a plurality of low pass filtered binary maps representative of local context activity levels in each of the image channels so that the input image is projected onto a single segmentation channel using a rapidly changing projection vector for optimizing the segmentation and thus enhancing the accuracy of subsequent image object classification.
These and other objects, advantages, and benefits of the invention will become apparent to those skilled in the art upon a reading and understanding of the following detailed description.