Optical scanners are popular peripheral devices for computers. Optical scanners are used to take objects containing printed information (such as text, illustrations or photographs) and convert the information into a digital form that a computer can use. In general, a user places objects to be scanned onto a platen of the scanner. A scanner head is passed over the platen area and the resultant image is divided into a plurality of pixels. Each pixel location is assigned a value that is dependent on the color of the pixel. The resulting matrix of bits (called a bit map) can then be stored in a file, displayed on a monitor, and manipulated by software applications. The resulting scanned image contains both data pixels, which are pixels that are located on the objects, and background pixels, which are pixels that are the color of the background. Typically, the background color is the color of the lid of the scanner.
There are several applications where it is critical to know the background color accurately. These applications include single object segmentation, multiple object segmentation, and reorienting (or de-skewing) scanned objects. One problem, however, is that the background color is rarely known and must be estimated. In the above applications and many others, an accurate estimate of the background color is required for the particular technique to work. If the background color is estimated incorrectly, the entire algorithm fails.
By way of example, one application where estimating the background color is essential to the success of the technique is the detection and extraction of objects in scanned images. Such a technique is described in U.S. Ser. No. 10/354,500 by Herley entitled “System and method for automatically detecting and extracting objects in digital image data” filed on Jan. 29, 2003. This particular object detection and extraction system searches for gaps in the histograms of rows and columns of a scanned image containing multiple objects. A gap means that there are no data pixels going across that row or column of the image. These gaps are found by classifying pixels as either data pixels or background pixels and repeatedly decomposing the image into a case with a single object and a background. Once the decomposition is complete, the single object case can easily be solved. Gaps are determined by taking profiles of a histogram. A data pixel is defined as a pixel that differs by at least a threshold from the background color. In order to correctly find the gaps, the background color needs to be accurately estimated.
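The role of the background color in gap finding can be illustrated with a minimal Python sketch. It is not the implementation of the referenced application; it assumes a grayscale image stored as a list of rows, treats a "profile" as the count of data pixels in each row or column, and uses hypothetical names (data_mask, profiles, find_gaps) chosen only for this illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale values, row-major (an assumption for brevity)


def data_mask(image: Image, background: int, threshold: int) -> List[List[bool]]:
    """Mark a pixel as data if it differs from the background by at least the threshold."""
    return [[abs(p - background) >= threshold for p in row] for row in image]


def profiles(mask: List[List[bool]]) -> Tuple[List[int], List[int]]:
    """Count data pixels in every row and every column."""
    row_profile = [sum(row) for row in mask]
    col_profile = [sum(col) for col in zip(*mask)]
    return row_profile, col_profile


def find_gaps(profile: List[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges where the profile contains no data pixels."""
    gaps = []
    start = None
    for i, count in enumerate(profile):
        if count == 0 and start is None:
            start = i
        elif count != 0 and start is not None:
            gaps.append((start, i - 1))
            start = None
    if start is not None:
        gaps.append((start, len(profile) - 1))
    return gaps


# Two bright objects on a dark (0) background, separated by empty columns 3 and 4.
example = [
    [0,   0,   0, 0, 0,   0, 0],
    [0, 200, 200, 0, 0, 180, 0],
    [0, 200, 200, 0, 0, 180, 0],
    [0,   0,   0, 0, 0,   0, 0],
]

rows_ok, cols_ok = profiles(data_mask(example, background=0, threshold=50))
rows_bad, cols_bad = profiles(data_mask(example, background=200, threshold=50))
```

With the correct background value of 0, find_gaps(cols_ok) reports the empty columns between the two objects. With the wrong value of 200, the true background pixels are classified as data, no row or column is empty, and no gap is found, which is why an incorrect estimate defeats the decomposition.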
One way to estimate background color is to take a global histogram, find the color having the most pixels, and call that color the background color. However, there are certain instances when this approach does not work. For example, suppose that the scanner background is black and the user places pictures containing a lot of white (such as photographs from a ski trip) such that they take up most of the scanning bed. In this case, there will be a great many more white pixels than black pixels, and this approach will select white as the background color. Thus, although this approach is simple and often works, there are cases where it fails to correctly estimate the background color.
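The global-histogram approach and its failure case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is a grayscale list of rows, 0 stands for the black scanner lid, 255 stands for white photograph content, and the function name most_common_color is hypothetical.

```python
from collections import Counter
from typing import List

Image = List[List[int]]  # grayscale values, row-major (an assumption for brevity)


def most_common_color(image: Image) -> int:
    """Return the color with the largest count in the global histogram."""
    histogram = Counter(pixel for row in image for pixel in row)
    color, _count = histogram.most_common(1)[0]
    return color


def make_scan(size: int, margin: int, background: int, content: int) -> Image:
    """Build a square scan with a square object inset by `margin` pixels on each side."""
    scan = [[background] * size for _ in range(size)]
    for r in range(margin, size - margin):
        for c in range(margin, size - margin):
            scan[r][c] = content
    return scan


# Small white object on a black bed: 16 white pixels versus 84 black pixels.
small_object = make_scan(size=10, margin=3, background=0, content=255)

# White photograph covering most of the bed: 64 white pixels versus 36 black pixels.
ski_photo = make_scan(size=10, margin=1, background=0, content=255)
```

For small_object the most frequent color is the true background, 0. For ski_photo the most frequent color is 255, so the estimate is wrong even though the scanner lid is still black.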
For the above object detection and extraction system, another way to estimate the background color is to take a global histogram and try several candidate colors in turn: the most frequently occurring color, the second most frequently occurring color, the third most frequently occurring color, and so forth. Each color is then used in the detection and extraction process to determine which works best. However, the problem with this approach is that it is wasteful in both time and computational expense. Therefore, what is needed is an accurate technique for estimating background color in a scanned image.
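The cost of the candidate-trying approach can be made concrete with a short Python sketch. It assumes a grayscale image stored as a list of rows. The names candidate_backgrounds and best_candidate are hypothetical, and the score callback is a stand-in for a full detection and extraction run, since how "works best" is judged is not specified here.

```python
from collections import Counter
from typing import Callable, List

Image = List[List[int]]  # grayscale values, row-major (an assumption for brevity)


def candidate_backgrounds(image: Image, k: int = 3) -> List[int]:
    """Return the k most frequent colors of the global histogram, most frequent first."""
    histogram = Counter(pixel for row in image for pixel in row)
    return [color for color, _count in histogram.most_common(k)]


def best_candidate(image: Image, k: int, score: Callable[[Image, int], float]) -> int:
    """Run the (hypothetical) scoring pass once per candidate and keep the highest-scoring color.

    Each call to `score` stands in for a complete detection and extraction run,
    so the total work grows linearly with the number of candidates tried.
    """
    candidates = candidate_backgrounds(image, k)
    return max(candidates, key=lambda color: score(image, color))
```

Because score is invoked once per candidate, trying three colors costs roughly three complete detection passes over the image, which is the time and computational expense noted above.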