This section provides a background or context to the invention recited in the claims. Unless otherwise indicated, what is described in this section is not prior art to the claims and is not admitted to be prior art by inclusion in this section.
U.S. Pat. No. 7,013,045 to Sommer et al., entitled “Using multiple documents to improve OCR accuracy,” indicates that in many document imaging systems, large numbers of forms are scanned into a computer, which then processes the resultant document images to extract pertinent information. Typically, the forms comprise preprinted templates containing predefined fields that have been filled in by hand or with machine-printed characters. Before extracting the information that has been filled into any given form, the computer must first know which field is which. Only then can the computer process the information that the form contains. The computer then reads the contents of the fields in the form, typically using methods of optical character recognition (OCR), and arranges the OCR results in a table or database record. In many of these imaging systems, it is crucial that the information in the forms be read out correctly. For this purpose, automated OCR is commonly followed by manual verification of the OCR results. Often, the computer that performs the OCR also generates a confidence rating for its reading of each character or group of characters. Human operators perform the verification step, either by reviewing all the fields in the original document and correcting errors and rejects discovered in the OCR results, or by viewing and correcting only the characters or fields that have a low OCR confidence level. Since verification of the OCR results is typically the most costly part of the process, it is generally desirable to attain the highest possible level of confidence in the automated processing phase, and thus to minimize the portion of the results that must be reviewed by a human operator.
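By way of illustration only, the confidence-based verification described above may be sketched as follows. The sketch is not taken from the cited patent. The names `OcrField`, `fields_needing_review`, and `review_fraction`, the per-character confidence scale of 0 to 1, and the threshold of 0.9 are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OcrField:
    """One field read from a form (hypothetical structure for this sketch)."""
    name: str                 # label of the field in the form template
    text: str                 # characters returned by the OCR engine
    confidences: List[float]  # assumed: one confidence per character, in [0, 1]


def fields_needing_review(fields: List[OcrField],
                          threshold: float = 0.9) -> List[OcrField]:
    """Return the fields in which any character falls below the threshold.

    The threshold value is an assumption; a field with no characters is
    treated here as not requiring review.
    """
    return [f for f in fields
            if min(f.confidences, default=1.0) < threshold]


def review_fraction(fields: List[OcrField], threshold: float = 0.9) -> float:
    """Fraction of fields that would be routed to a human operator."""
    if not fields:
        return 0.0
    return len(fields_needing_review(fields, threshold)) / len(fields)


# Example form with two fields; only the second has a low-confidence character.
form = [
    OcrField("name", "SMITH", [0.99, 0.98, 0.97, 0.99, 0.96]),
    OcrField("zip", "90210", [0.99, 0.62, 0.95, 0.99, 0.98]),
]
flagged = fields_needing_review(form)
```

In this sketch only the `zip` field would be presented to the operator, so the operator reviews half of the fields rather than all of them. Raising the confidence of the automated phase reduces the fraction returned by `review_fraction` and, with it, the manual verification cost.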
Other documents describing the need for improved productivity of human operators in verifying OCR results include U.S. Pat. No. 6,351,574 to Yair et al., entitled “Interactive verification of OCRed characters,” and U.S. Pat. No. 5,455,875 to Chevion et al., entitled “System and method for correction of optical character recognition with display of image segments according to character data.”