The present invention relates to document scanners and, more particularly, to a method and apparatus for functionally transforming a color image of a scanned business document into a gray scale image from which a specified color is eliminated.
A typical business form contains data represented by a finite number of colors. For example, these forms usually have lines, pre-printed text and background colors, all of which are light reflecting colors (non-black), and black-colored informational text input by the user. When these forms are scanned by flatbed scanners or rotary-type scanners, undesirable extra colors are reproduced on the edges of the lines and text. These false colors are generated in the scanning process due to chromatic aberration and physical misregistration of the RGB signals. These extra colors on the edges, which do not exist in the original business form document, are referred to in the art as color fringes.
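The effect of misregistration can be illustrated with a minimal sketch. It assumes a one-dimensional scanline containing a black stroke on a white background and a red channel that is displaced by one pixel; the function name `misregister_red` and the one-pixel shift are hypothetical and are chosen only for illustration.

```python
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def misregister_red(scanline, shift=1):
    """Return a copy of the scanline whose red channel is shifted right.

    Assumption: misregistration is modeled as a whole-pixel shift of one
    channel, with the first pixel repeated at the left border.
    """
    out = []
    for i, (_, g, b) in enumerate(scanline):
        src = max(0, i - shift)   # index the red sample is taken from
        r = scanline[src][0]
        out.append((r, g, b))
    return out


# A black stroke three pixels wide on a white background.
scanline = [WHITE] * 3 + [BLACK] * 3 + [WHITE] * 3
scanned = misregister_red(scanline)

# Pixels that are neither white nor black are false colors (fringes).
fringes = [(i, px) for i, px in enumerate(scanned) if px not in (WHITE, BLACK)]
print(fringes)
```

In this sketch the leading edge of the stroke turns red and the trailing edge turns cyan, although neither color is present in the original scanline.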
For automatic document indexing, only the written text is of interest, and it is processed by optical character recognition (OCR). Pre-printed lines, background colors and the like on the form are deleted before OCR to minimize their interference and thereby yield a better OCR read rate; this deletion is referred to in the art as color dropout. In the prior art, the occurrence of color fringes results in errors during color dropout due to incorrect color classification. For example, due to color fringes or written text overlapping the pre-printed color lines, the color of a pixel on an edge to be retained may be substantially identical to the color of interest to be deleted by dropout. These extra colors generated in the scanning process make it difficult to achieve error-free color dropout without losing some edge pixels of the image objects to be retained.
Consequently, a need exists for a color dropout technique that retains image information while completely eliminating a specified color.
The present invention is directed to overcoming one or more of the problems set forth above. Briefly summarized, according to one aspect of the present invention, the invention resides in a method for performing color dropout on a digitized document, the method comprising the steps of (a) obtaining color values from the digitized document for the background and for a color of interest to a user; (b) transforming the color values of the background and the color of interest into an identical gray scale value according to a color dropout function; (c) obtaining a gray scale value different from the identical gray scale value for the remaining portion of the image according to the color dropout function; and (d) thresholding the gray scale values obtained from steps (b) and (c) to obtain a binary image. Because the dropout is performed by a functional transformation rather than by color classification, the method substantially eliminates color classification error and suppresses color fringe artifacts while retaining character integrity.
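Steps (a) through (d) can be sketched as follows. The summary does not specify the color dropout function, so the sketch substitutes one hypothetical choice: the gray value is 255 minus the distance from the pixel to the line segment joining the background color and the color of interest in RGB space. The names `distance_to_segment`, `dropout_gray` and `binarize`, the threshold of 128, and the sample colors are all assumptions made for illustration.

```python
import math


def distance_to_segment(p, a, b):
    """Euclidean distance from RGB point p to the segment a-b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0
    else:
        # Project p onto the segment and clamp to its end points.
        t = max(0.0, min(1.0, sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, closest)))


def dropout_gray(pixel, background, dropout_color):
    """Hypothetical dropout function for steps (b) and (c).

    The background and the color of interest both map to 255; colors far
    from the segment joining them map to darker gray values.
    """
    d = distance_to_segment(pixel, background, dropout_color)
    return max(0, 255 - round(d))


def binarize(gray, threshold=128):
    """Step (d): 1 marks a retained (dark) pixel, 0 a dropped pixel."""
    return 1 if gray < threshold else 0


# Step (a): color values assumed to have been sampled from the document.
background = (255, 255, 255)   # white paper
dropout_color = (255, 0, 0)    # red pre-printed lines

samples = {
    "background": (255, 255, 255),
    "form line": (255, 0, 0),
    "line edge blend": (255, 128, 128),
    "written text": (0, 0, 0),
}
for name, rgb in samples.items():
    gray = dropout_gray(rgb, background, dropout_color)
    print(name, gray, binarize(gray))
```

Under these assumptions, a pixel that blends the background with the color of interest lies on the segment and is dropped along with both end colors, while black text lies far from the segment and survives thresholding. Other dropout functions that satisfy steps (b) and (c) are equally possible.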
These and other aspects, objects, features and advantages of the present invention will be more clearly understood and appreciated from a review of the following detailed description of the preferred embodiments and appended claims, and by reference to the accompanying drawings.