1. Technical Field
The invention disclosed broadly relates to data processing systems and methods and more particularly relates to techniques for the capture of character recognition information derived from scanned images of document forms.
2. Related Patents and Patent Applications
This patent application is related to the co-pending U.S. patent application, Ser. No. 07/870,129, filed Apr. 15, 1992, entitled "Data Processing System and Method for Sequentially Repairing Character Recognition Errors for Scanned Images of Document Forms," by T. S. Betts, V. M. Carras, L. B. Knecht, T. L. Paulson, and G. R. Anderson, the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the co-pending U.S. patent application, Ser. No. 07/870,507, filed Apr. 17, 1992, entitled "Data Processing System and Method for Selecting Customized Character Recognition Processes and Coded Data Repair Processes for Scanned Images of Document Forms," by T. S. Betts, V. M. Carras, and L. B. Knecht, the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the co-pending U.S. patent application, Ser. No. 07/573,942, filed Aug. 28, 1990, entitled "Method and Apparatus for Document Image Management in a Case Processing System," by M. R. Addink, T. Leyba, C. Y. Hu, A. W. Holmes, C. A. Till, and J. J. Mullen, the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the co-pending U.S. patent application, Ser. No. 07/693,739, filed Apr. 30, 1991, entitled "Apparatus and Method of Operation for a Facsimile Subsystem in an Image Archiving System," by H. F. DeBruin, D. C. Bailey, J. T. Argenta, and H. M. Morris, the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the co-pending U.S. patent application, Ser. No. 07/305,828, filed Feb. 2, 1989, entitled "A Computer Implemented Method for Automatic Extraction of Data From Printed Forms," by R. G. Casey and D. R. Ferguson, the application being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the U.S. Pat. No. 4,992,650, entitled "Bar Code Recognition Using PC Software," by P. J. Somerville, the patent being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the U.S. Pat. No. 5,058,185, entitled "Object Management and Delivery System Having Multiple Object Resolution Capability," by R. E. Probst, G. L. Youngs, D. Rajagopal, C. A. Parks, and H. M. Morris, the patent being assigned to the IBM Corporation and incorporated herein by reference.
This patent application is also related to the U.S. Pat. No. 5,093,911, entitled "Distributed Image Storage and Retrieval System," by R. E. Probst, G. L. Youngs, D. Rajagopal, and C. A. Parks, the patent being assigned to the IBM Corporation and incorporated herein by reference.
3. Background Art
Document forms used for the submission of business-related data can have a variety of layouts, even for a narrowly defined line of business. This makes the automatic reading of document forms a challenging task. The purpose of a document form is to isolate information relating to a particular subject matter category into a named field on the form. If the data which has been written on the form can be automatically found and automatically read, then it can be entered as an operand into a computer program designed to perform the business task for which the information was submitted.
Economies of scale can be attained by consolidating the data processing tasks for related lines of business. However, the number of subject matter categories for which data is required is most likely different for each respective business area. Where the related lines of business use document forms for the submission of data related to their respective businesses, the document forms are likely to have different numbers of fields, and the fields are likely to be ordered in different sequences, arranged in different patterns and named with different category names for each respective business area.
An example of this is the insurance industry. An insurer may offer fire insurance, casualty insurance and health insurance. These related lines of business are likely to have their data processing tasks consolidated, for economies of scale. However, the claim forms submitted to the insurer must be different for each respective type of insurance, since the number of subject matter categories required for submitted data is not likely to be the same.
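By way of illustration only, the difficulty can be sketched in Python as follows. The form names, the field names, and the helper to_named_fields are hypothetical and are not taken from this application. The sketch assumes that each form type carries its own ordered list of field names, so that a value read from a form can be supplied to a consolidated program by category name rather than by its position on the form.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class FormDefinition:
    """Hypothetical description of one document form layout."""
    form_name: str
    field_names: Tuple[str, ...]  # category names, in the order they appear on the form


# Hypothetical claim forms for two related lines of business.
# They differ in the number, order, and names of their fields.
FIRE_CLAIM = FormDefinition(
    "fire_claim",
    ("policy_number", "date_of_loss", "property_address"),
)
HEALTH_CLAIM = FormDefinition(
    "health_claim",
    ("patient_name", "policy_number", "provider", "diagnosis"),
)


def to_named_fields(form: FormDefinition, values: Sequence[str]) -> Dict[str, str]:
    """Pair each value, given in form order, with its category name."""
    if len(values) != len(form.field_names):
        raise ValueError(
            f"{form.form_name}: expected {len(form.field_names)} values, got {len(values)}"
        )
    return dict(zip(form.field_names, values))


fire = to_named_fields(FIRE_CLAIM, ["P-1001", "1992-04-15", "1 Main St"])
health = to_named_fields(HEALTH_CLAIM, ["J. Doe", "P-2002", "Dr. Smith", "fracture"])

# The policy number sits at a different position on each form,
# but a consolidated program can retrieve it by name in both cases.
print(fire["policy_number"], health["policy_number"])
```

In this sketch a program that reads values by position would have to be rewritten for each form, whereas a lookup by category name is unaffected by the order or number of fields.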
As time goes on, existing document forms for a particular line of business will be revised, altering the layout of the form, the order of the fields, the number of fields, or the names of the fields.
What is needed is a means to freely generate new document forms which can be automatically processed, even though the order, arrangement, name and number of the fields on the forms are changed.