The present invention relates generally to the electronic processing of images and is more particularly directed to the processing of document images that may include printed or handwritten text overlying background markings such as graphics, decorative or security patterns, or distracting blemishes.
In many document-processing applications the images of the documents to be processed are electronically captured and presented to operators at workstations for data entry or are subjected to automatic processing, such as optical character recognition, directly from the electronic images. The images may be archived on magnetic or optical media and subsequently retrieved and displayed or printed when needed. Such systems are used to process and archive a wide range of document types such as bank checks, credit card receipts and remittance documents as well as commercial documents such as purchase order forms and invoice forms.
Checks, for example, are processed in high volumes by capturing the images of the front and back sides of the checks on high-speed document transports. The images are then displayed at workstations where operators may enter the dollar amounts, verify signatures, reconcile inconsistencies and undertake other processing steps. Many financial institutions will then provide their account holders printouts showing small-scale black and white printed images of recently processed checks with the monthly account statements.
A problem arises in working with such digital images of checks or other documents. On a typical check the various data fields to be filled in with substantive information, such as payee, date, dollar amount, and authorizing signature of the payor, generally overlie a background picture or security pattern. Even form documents such as order forms may include fields with gray backgrounds to be filled in or may include hand stamps with a received date or sequence number which may be uneven in appearance or which may have inadvertently been placed over other markings, for example. In the digitally captured image of such checks or forms the substantive data fields are sometimes difficult to read because of interference from the digitally captured background image or pattern or even from paper blemishes, creases or obscuring smudges. Reduced-scale printouts of such images may be even harder to read.
Early systems for image processing of bank checks tried to eliminate the background picture or pattern altogether from the captured image of the check. Such early systems typically employed a thresholding technique to eliminate the background. Such techniques have not been entirely successful. They tend to leave behind residual black marks from the background image that interfere with the substantive information on the check, and in some instances they may even degrade the handwritten or printed textual matter on the check, making it more difficult to read. In addition, it is sometimes desirable to retain some or all of the background picture, for example, to provide an archival copy of the original document. The problem with thresholding is that an insensitive threshold may avoid most, although generally not all, of the background but may miss some of the low-contrast text, whereas a more sensitive threshold may pick up most of the low-contrast text but also more of the obscuring background.
Over the years various other approaches have been developed for handling background graphics in document images and either eliminating the background or reproducing it in a more readable fashion. Such other approaches may be seen for example in U.S. Pat. Nos. 4,853,970 and 5,600,732. See also the recent publication by S. Djeziri et al., entitled “Extraction of Signatures from Check Background Based on a Filiformity Criterion,” IEEE Transactions on Image Processing, Vol. 7, No. 10, October 1998, pp. 1425-1438, and references cited therein for general discussions of the field.
In particular, U.S. Pat. No. 4,853,970 discloses an approach in which the captured image of a document is first analyzed to find the edges of pictorial or text features present in the image. The edges separate other areas of light and dark over which the intensity varies more gradually, if at all. The image is then reconstructed by separately reconstructing the edges with an algorithm, referred to in U.S. Pat. No. 4,853,970 as a point algorithm or point operator, that is adapted to give good representation of the image where edges are located and reconstructing the expanses of gradual intensity variation with an algorithm, referred to in U.S. Pat. No. 4,853,970 as a level algorithm or level operator, that is appropriate for such gradual variations. For example, a thresholding algorithm with very insensitive threshold could be used for the second algorithm if it is desired to minimize the background or a digital half-toning algorithm could be used to give a good representation of pictorial graphics without compromising the textual matter, which is composed primarily of characters that have strong edges.
Notwithstanding the benefits of this method, it may nevertheless represent a compromise in the clarity and readability of the original document.
The present invention provides a method for processing a digital image of a document to provide one or more bitonal digital copies of selected image quality that are optimized for specific purposes. The method is especially suited for image processing of such documents as bank checks and commercial forms that tend to have printed or handwritten textual characters overlaid on a wide variety of potentially obscuring background markings, security patterns or decorative pictures as well as paper blemishes. The method may provide a so-called clean bitonal digital copy, for example, that is optimized for use in optical character recognition where it is desired to reduce as much as possible the obscuring influences of the background, or a so-called archival copy that is optimized for archiving the original document, where it is desired to retain meaningful background along with the text but eliminate background “noise.”
Briefly, a digital image of a document is processed in accord with the invention by deriving a set of bitonal copies of the digital image covering a range of contrast sensitivities. At least three bitonal copies are required although in general it will be desirable to use more than three. The copies are compared pairwise and for each pair a numerical measure is provided representing the difference between the two copies making up the pair. The copies are compared in the order of their respective contrast sensitivities. That is, the copies in each pairwise comparison correspond to adjacent contrast sensitivities when the contrast sensitivities are arranged monotonically to form, for example, a decreasing sequence. The collection of such numerical measures defines a numerical sequence, referred to herein as the image-variation sequence, representing variations in the digital image as the contrast sensitivity is decreased. The numerical image-variation sequence is then analyzed according to prescribed criteria to derive one or more optimizing contrast sensitivities, which are then used to generate one or more optimized bitonal digital copies of the digital image at the optimizing contrast sensitivities.
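By way of illustration only, the derivation of the image-variation sequence described above may be sketched as follows. The sketch assumes that contrast sensitivity is realized as a single global gray-level threshold (a higher threshold being more sensitive), which is a simplification made here and not something the summary prescribes. The names `binarize`, `count_differences` and `image_variation_sequence` are hypothetical.

```python
from typing import List, Sequence

Gray = List[List[int]]     # rows of gray levels, 0 = black, 255 = white
Bitonal = List[List[int]]  # rows of bits, 1 = black, 0 = white


def binarize(gray: Gray, threshold: int) -> Bitonal:
    """Assumed stand-in for contrast sensitivity: a pixel is black when
    it is darker than the global threshold."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def count_differences(a: Bitonal, b: Bitonal) -> int:
    """Number of pixel positions at which two bitonal copies differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def image_variation_sequence(gray: Gray, thresholds: Sequence[int]) -> List[int]:
    """Compare bitonal copies pairwise, in order of decreasing sensitivity."""
    if len(thresholds) < 3:
        raise ValueError("at least three bitonal copies are required")
    ordered = sorted(thresholds, reverse=True)  # most sensitive first
    copies = [binarize(gray, t) for t in ordered]
    # Each entry compares copies at adjacent sensitivities.
    return [count_differences(copies[i], copies[i + 1])
            for i in range(len(copies) - 1)]


# Tiny example: dark text (40), a faint mark (120), light background (200).
gray = [
    [250, 200, 200, 250],
    [250,  40,  40, 250],
    [250, 200, 120, 250],
]
print(image_variation_sequence(gray, [240, 210, 150, 100, 30]))
```

In this toy image the largest entry of the sequence occurs where the light background pattern drops out, which is the kind of feature the subsequent analysis looks for.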
The image-variation measure quantifying the change from one bitonal copy to the next may take a variety of forms. In one embodiment it is determined from a pixel-by-pixel comparison of two bitonal images. The measure of the change here is taken simply as the number of pixels at which the two images differ. This measure is desirable because it is easy to implement and gives surprisingly good results over a wide range of document types. Other measures may also be employed that may be adapted to particular kinds of documents or have other beneficial characteristics in certain applications. Another embodiment of such a measure is based on an analysis of the pattern of black and white pixels in a local neighborhood of each pixel in the image under examination. A numerical measure is derived from the differences in the frequency of occurrence of certain neighborhood patterns between the two bitonal images under comparison.
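The neighborhood-based measure may be sketched, under stated assumptions, as follows. The summary does not say which neighborhood size or which patterns are used, so the sketch assumes a 3x3 neighborhood encoded as a 9-bit code, treats pixels outside the image as white, and by default sums the absolute differences in frequency over all patterns that occur. The names `pattern_counts`, `pattern_difference` and `ISOLATED_BLACK` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Optional

Bitonal = List[List[int]]  # rows of bits, 1 = black, 0 = white


def pattern_counts(img: Bitonal) -> Counter:
    """Count how often each 3x3 neighborhood pattern occurs in the image.
    Each pattern is a 9-bit code read row by row, top-left bit first."""
    h = len(img)
    w = len(img[0]) if h else 0
    counts: Counter = Counter()
    for y in range(h):
        for x in range(w):
            code = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    bit = img[yy][xx] if inside else 0  # outside = white
                    code = (code << 1) | bit
            counts[code] += 1
    return counts


def pattern_difference(a: Bitonal, b: Bitonal,
                       patterns: Optional[Iterable[int]] = None) -> int:
    """Sum of absolute frequency differences over the chosen patterns
    (all occurring patterns when none are specified)."""
    ca, cb = pattern_counts(a), pattern_counts(b)
    keys = set(patterns) if patterns is not None else set(ca) | set(cb)
    return sum(abs(ca[k] - cb[k]) for k in keys)


# Code of a black pixel whose eight neighbors are all white.
ISOLATED_BLACK = 0b000010000

blank = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
speck = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(pattern_difference(speck, blank, patterns=[ISOLATED_BLACK]))
```

Restricting the comparison to a pattern such as an isolated black pixel is one way such a measure could be made sensitive to speckle-like background noise rather than to solid strokes; whether that choice suits a given document type is an open design question.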
The image-variation sequence is automatically analyzed to find the region or regions of rapid dropoff, generally signifying that certain unwanted or wanted features of the image are dropping out, and to find regions of more gradual change. If the sequence is plotted on a graph, these regions may be visualized as a downward slope from a peak followed by a gradually decreasing tail. The optimizing values of the contrast sensitivities are selected to fall at selected positions with respect to the peak or peaks and tails that are found in the sequence.
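One possible way of locating a peak and the start of its tail is sketched below. The prescribed criteria are not stated in this summary, so the rule used here (the tail begins at the first entry after the peak that falls to or below a fixed fraction of the peak value) is purely an assumption made for illustration, as are the name `find_peak_and_tail` and the parameter `tail_fraction`. The sketch handles a single peak only.

```python
from typing import Sequence, Tuple


def find_peak_and_tail(seq: Sequence[float],
                       tail_fraction: float = 0.2) -> Tuple[int, int]:
    """Return (peak_index, tail_index) for an image-variation sequence.

    Assumed rule: the peak is the largest entry; the tail starts at the
    first later entry that is at most tail_fraction times the peak value.
    If no such entry exists, the last index is returned as the tail."""
    if not seq:
        raise ValueError("sequence must not be empty")
    peak = max(range(len(seq)), key=seq.__getitem__)
    cutoff = seq[peak] * tail_fraction
    for i in range(peak + 1, len(seq)):
        if seq[i] <= cutoff:
            return peak, i
    return peak, len(seq) - 1


# Hypothetical sequence: a sharp peak followed by a slowly decaying tail.
variation = [900, 4200, 1500, 300, 120, 80]
peak, tail = find_peak_and_tail(variation)
print(peak, tail)
```

Entry i of the sequence compares the bitonal copies at positions i and i + 1 in the sensitivity ordering, so a tail index can be mapped back to a candidate optimizing contrast sensitivity by choosing one of those two copies.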
It is an advantage of the present method that it is simple to implement. It is a further advantage that it may be used as a supplement to other document image processing methods to fine-tune the results.
Other aspects, advantages, and novel features of the invention are described below or will be readily apparent to those skilled in the art from the following specifications and drawings of illustrative embodiments.