1. Technical Field
The invention disclosed broadly relates to data processing and more particularly relates to character recognition of document forms.
2. Related Patents and Patent Applications
This patent application is related to the copending U.S. patent application, Ser. No. 07/870,129, filed Apr. 15, 1992, entitled "Data Processing System and Method for Sequentially Repairing Character Recognition Errors for Scanned Images of Document Forms," by T. S. Betts, et al., the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 07/870,507, filed Apr. 17, 1992, entitled "Data Processing System and Method for Selecting Customized Character Recognition Processes and Coded Data Repair Processes for Scanned Images of Document Forms," by T. S. Betts, et al., the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to U.S. Pat. No. 5,140,650, Ser. No. 07/305,828, entitled "A Computer Implemented Method for Automatic Extraction of Data From Printed Forms," by R. G. Casey, et al., the patent being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 08/051,972, filed Apr. 26, 1993, entitled "System and Method for Enhanced Character Recognition Accuracy by Adaptive Probability Weighting," by M. P. T. Bradley, the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the copending U.S. patent application, Ser. No. 08/100,846, filed Aug. 2, 1993, entitled "Method for Defining a Plurality of Form Definition Data Sets," by D. W. Billings, et al., the application being assigned to the IBM Corporation and incorporated herein by reference.
3. Background Art
Data contained in digitized images can be extracted for a number of purposes, and in many different ways. A prerequisite for extracting information from a form is a knowledge of the types and locations of the data (information about the "fields" of the form). Currently, most forms processing applications have their own method for "defining" forms, and each method is incompatible with the others. In large image systems which use several different forms processing applications, each form needs to be separately defined for each application, which costs time and introduces inconsistencies in the form definitions. The method disclosed in the copending D. W. Billings, et al. patent application creates form definition data sets which can be used for almost any forms processing application.
In many business applications, the volume of forms to be processed can vary widely with time. When the system receives a large number of document form images in a short interval, the time required to perform character recognition on the forms can increase drastically.