1. Field of the Invention
The present invention relates generally to image scanner technology and more specifically to a system and method for detecting photo regions in digital images.
2. Related Art
Digitizing image scanners are often used to digitize documents so they can be manipulated using a computer. For example, a scanner can be used to digitize text and put it into a format recognized by a word processing program so that an editor can rewrite or modify the document.
With advances in processor speed and scanner technology, it has become advantageous to digitize pictorial images as well as printed text. In an effort to enhance scanner performance, developers have sought ways to optimize the number of bits used to represent a digital image. Simply reducing the number of bits used to represent an image reduces image quality. Increasing the number of bits increases the processor time required to digitize the image and to perform subsequent image processing.
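The trade-off described above can be illustrated with simple arithmetic. The following is a minimal sketch, not part of the described system: the function name image_size_bytes and the page dimensions (an 8.5 by 11 inch page scanned at 300 dots per inch) are assumptions chosen only for illustration.

```python
def image_size_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Return the uncompressed size of an image in bytes, rounded up to a whole byte."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


# Hypothetical page: 8.5 x 11 inches scanned at 300 dpi (assumed values).
WIDTH_PX = 2550   # 8.5 in * 300 dpi
HEIGHT_PX = 3300  # 11 in * 300 dpi

if __name__ == "__main__":
    for bpp in (1, 4, 8):
        size = image_size_bytes(WIDTH_PX, HEIGHT_PX, bpp)
        print(f"{bpp} bit(s) per pixel: {size} bytes")
```

Under these assumed dimensions, the amount of data to be stored and processed grows in direct proportion to the number of bits per pixel.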
To optimize the number of bits used to represent an image, developers have sought to divide a scanned image into regions based on the number of graylevels (and hence the number of bits) required to adequately represent each region. Some regions, such as those containing a pictorial image, require a large number of graylevels to be adequately represented. These regions are termed "photo regions." Other regions, such as those containing plain text or line art, do not require as many graylevels to be digitized effectively. These regions are referred to as "non-photo regions."
To preserve image resolution, photo regions are represented with multiple bits per pixel (e.g., four or more). On the other hand, non-photo regions can typically be represented with one bit per pixel while still capturing the important information. For non-photo regions, the one bit used to represent each pixel is set based on whether the scanned graylevel is above or below a specified graylevel threshold. In other words, a threshold graylevel is selected; any pixel darker than the threshold is set to black, and any pixel lighter than the threshold is set to white. If the image is to be printed on a bi-level output device such as a laser printer, then photo regions need to be halftoned or error diffused to preserve their quality, whereas non-photo regions can typically be thresholded as discussed above.
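The thresholding of a non-photo region described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular scanner: the function name threshold_region is hypothetical, graylevels are assumed to run from 0 (black) to 255 (white), and a pixel exactly equal to the threshold is arbitrarily treated as white because the text does not specify that case.

```python
from typing import List


def threshold_region(gray: List[List[int]], threshold: int) -> List[List[int]]:
    """Reduce a grayscale region to one bit per pixel.

    Assumptions (not from the source text): a lower graylevel means a darker
    pixel, the output bit 1 means black and 0 means white, and a pixel equal
    to the threshold is treated as white.
    """
    return [[1 if level < threshold else 0 for level in row] for row in gray]


if __name__ == "__main__":
    # A tiny 2 x 3 region with assumed 8-bit graylevels.
    region = [
        [0, 100, 128],
        [200, 255, 127],
    ]
    for row in threshold_region(region, threshold=128):
        print(row)
```

A single threshold of this kind suits text and line art, where each pixel is essentially ink or paper. Applied to a photo region, it would discard the intermediate graylevels that halftoning or error diffusion is intended to preserve.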
Most conventional solutions, such as those used in the "page decomposition" or "page analysis" components of OCR (optical character recognition) products, start with a thresholded image (one bit per pixel) and can discriminate text from graphics. However, these conventional systems cannot accurately determine the nature of the graphic (i.e., whether it is a photo or line art).