The present invention relates to a region-based binarization system for a mixed type document which provides for optimal binary image quality.
A printed page in a magazine often contains photographs mixed with text, line art and graphics. When the page is electronically captured by a scanner, a binarization process is required to convert the captured grey scale image into a bitonal representation of the image at output. There are two common classes of image binarization techniques. One is adaptive thresholding, which works well for documents that mainly contain text and line art. The other is dither or error diffusion, which reproduces shades of gray in a binary format and is effective in binarizing photographic images. In the case of a mixed type of document, where text and photographs are both contained in the captured document image, neither of the two binarization methods alone can produce satisfactory image quality in both text and photographs. A well-known solution to the problem is to segment the captured digital image into regions of photographs and text so that different binarization processes can be applied to different regions in order to get optimal image quality.
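The two classes of binarization described above can be illustrated with a minimal sketch. The text does not name specific algorithms, so the choices here are assumptions made for illustration only: a local-mean adaptive threshold (with hypothetical parameters `radius` and `offset`) and Floyd-Steinberg error diffusion. Images are lists of rows of 0 to 255 grey values, and in the output 1 denotes white and 0 denotes black.

```python
def adaptive_threshold(gray, radius=1, offset=0):
    """Binarize by comparing each pixel with the mean of its local window.

    Assumed variant (local mean); the source does not specify one.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            local_mean = sum(window) / len(window)
            # Pixels at or above the local mean (minus offset) become white.
            out[y][x] = 1 if gray[y][x] >= local_mean - offset else 0
    return out


def error_diffusion(gray):
    """Binarize with Floyd-Steinberg error diffusion (assumed kernel)."""
    h, w = len(gray), len(gray[0])
    work = [[float(v) for v in row] for row in gray]
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 255.0 if old >= 128 else 0.0
            out[y][x] = 1 if new > 0 else 0
            err = old - new
            # Push the quantization error onto not-yet-visited neighbours.
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

In this sketch the adaptive threshold keeps a dark stroke crisp against a light background, while error diffusion turns a flat mid-grey area into a mixture of black and white pixels that approximates the grey level, which is why each suits a different kind of region.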
A known segmentation method divides a mixed type of document into 4-by-4 blocks, classifies each block as text or image, and improves classification by eliminating short runs of blocks (see, for example, U.S. Pat. No. 4,668,995 to Chen et al.). After the blocks of image lines are classified, the different binarization processes are applied accordingly. Another known method segments an image by extracting run lengths for each scanline, constructing rectangles from the run lengths, classifying the rectangles as either text or non-text, and finally merging associated text blocks into text regions (see, for example, U.S. Pat. No. 5,335,290 to Cullen et al.).
The two segmentation methods mentioned above are bottom-up segmentation methods which start with pixel-by-pixel or small block-by-block segments of information and expand into regions. They are less robust and prone to classification errors because text or non-text classification is based on local image information only.
An objective of the present invention is to provide a top-down segmentation method which locates photographic regions based on global pixel connectivity, and a region-based binarization system which uses the segmentation result to obtain optimal binary image quality.
The present invention is related to a region-based binarization system which applies adaptive thresholding and image rendering such as error diffusion (or dither) individually to generate two binary images from a grey scale image; detects the location of photographic images in the low resolution image; identifies the photographic images having a rectangular shape or boundary; generates a classification bitmap which marks a photographic pixel as “1” vs. a non-photographic pixel as “0”; and composes the final binary image based on the classification map from the two stored binary images.
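The last two steps, generating the classification bitmap and composing the final binary image, can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names `build_classification_map` and `compose` are hypothetical, rectangular photographic regions are assumed to be given as inclusive `(top, left, bottom, right)` boxes already expressed in full-resolution coordinates, and the two stored binary images are assumed to have the same dimensions.

```python
def build_classification_map(height, width, photo_boxes):
    """Mark pixels inside rectangular photographic regions as 1, others as 0.

    photo_boxes: iterable of inclusive (top, left, bottom, right) tuples
    (an assumed representation of the rectangular regions).
    """
    class_map = [[0] * width for _ in range(height)]
    for top, left, bottom, right in photo_boxes:
        for y in range(max(0, top), min(height, bottom + 1)):
            for x in range(max(0, left), min(width, right + 1)):
                class_map[y][x] = 1
    return class_map


def compose(thresholded, diffused, class_map):
    """Pick each output pixel from one of the two stored binary images.

    Photographic pixels (map value 1) come from the error-diffused image;
    all other pixels come from the adaptively thresholded image.
    """
    return [
        [d if m == 1 else t for t, d, m in zip(t_row, d_row, m_row)]
        for t_row, d_row, m_row in zip(thresholded, diffused, class_map)
    ]
```

Because both binary images are computed over the whole page before composition, the selection is a per-pixel lookup and no region needs to be re-binarized once the classification map is known.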
The photographic detection process comprises the steps of converting the low resolution grey scale image into a binary image using a global thresholding; performing a binary image erosion process to remove thin lines and the majority of characters; applying connected component analysis to locate the objects; and using a size filter to exclude small objects. The locations of the large objects are considered as the locations of photographs.
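The four detection steps above can be sketched in a few lines. Several details are assumptions made for illustration, since the text does not fix them: the function name `detect_photo_regions`, the threshold value, a single 3-by-3 erosion pass, 4-connectivity for the connected component analysis, and a pixel-count size filter (`min_area`). Dark pixels are treated as foreground, and each large object is reported as an inclusive `(top, left, bottom, right)` bounding box.

```python
from collections import deque


def detect_photo_regions(gray, threshold=128, min_area=4):
    """Return bounding boxes of large dark objects in a low resolution image."""
    h, w = len(gray), len(gray[0])

    # Step 1: global thresholding; 1 marks a dark (foreground) pixel.
    binary = [[1 if v < threshold else 0 for v in row] for row in gray]

    # Step 2: 3x3 erosion; a pixel survives only if its whole neighbourhood
    # is foreground. Pixels outside the image count as background.
    def is_fg(y, x):
        return 0 <= y < h and 0 <= x < w and binary[y][x] == 1

    eroded = [
        [1 if all(is_fg(y + dy, x + dx)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
         for x in range(w)]
        for y in range(h)
    ]

    # Step 3: connected component analysis (breadth-first, 4-connectivity).
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if eroded[y][x] != 1 or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            area = 0
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                area += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and eroded[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Step 4: size filter; small objects are discarded.
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes
```

Note that in this sketch the boxes describe the eroded objects, so each box is one pixel smaller on every side than the original dark region; a practical system would likely grow the boxes back before mapping them to the full-resolution image.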
The present invention relates to a region-based binarization process which comprises the steps of: converting a gray scale image into first and second binary images; detecting a location of photographic images in the gray scale image; identifying photographic images of the detected photographic images which have a rectangular boundary; generating a classification map which distinguishes pixels in the photographic images having a rectangular boundary from remaining pixels; and forming a final binary image from the first and second binary images based on the classification map.
The present invention further relates to a region-based binarization process which comprises the steps of: capturing an image; detecting a location of photographic images in the captured image; identifying photographic images of the detected photographic images which have a rectangular boundary; generating a classification map which distinguishes photographic pixels in the photographic images having a rectangular boundary from non-photographic pixels; and forming a final binary image based on the classification map.
The present invention further relates to an image capture assembly which comprises: an image capture section which captures an image; a conversion section which converts the captured image into digital image information indicative of the captured image; and a processing section which processes the digital image information to detect a location of photographic images in the captured image, identifies photographic images of the detected photographic images which have a rectangular boundary, and generates a classification map which distinguishes pixels in the photographic images having a rectangular boundary from remaining pixels.