1. Field of the Invention
The present invention relates to a data processing apparatus, an image processing apparatus, a data processing method, an image processing method, and programs for implementing the methods, all of which are suitable for converting into electronic forms images produced by scanning paper forms with a scanning device and/or documents received by a facsimile (hereinafter simply “fax”) machine.
2. Description of the Related Art
In the most common conventional method of converting paper forms into electronic forms, a scanning device scans the paper forms to obtain image data thereof and transmits the obtained image data to a computer, in which the image data is processed into electronic form.
Methods have also been known for identifying the form type of image data obtained from a form. In these methods, the form type is identified either by comparison with form types registered in advance, using pattern matching of images or the like, or by recognition of a barcode embedded beforehand in a portion of the form so as to be recognizable from image data of the form. Since the business process carried out after the form type is identified differs depending on the form type, it is important to sort image data correctly according to form type.
As for form recognition, a method has been proposed in which a characteristic amount is extracted from image data of a form and a degree of similarity between the form and a registered form is calculated (see, for example, Japanese Laid-Open Patent Publication (Kokai) Nos. 2000-285187 and 2000-293596).
However, the conventional form type identifying methods have a problem in that their recognition accuracy is not high. Depending on the quality of the image data obtained by scanning, the form type may be erroneously recognized as a similar but different form type, or on occasion the form may be recognized as an unclear form.
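The similarity-based identification described above, together with the "unclear form" outcome, can be illustrated by the following minimal sketch. It is not the method of the cited publications. The characteristic amount used here (the fraction of dark pixels in each cell of a coarse grid), the cosine similarity measure, the threshold value, and all function names are assumptions chosen only for illustration.

```python
import math

def grid_features(image, rows=2, cols=2):
    """Hypothetical characteristic amount: the fraction of dark pixels in
    each cell of a rows x cols grid laid over a binary image (lists of 0/1)."""
    h, w = len(image), len(image[0])
    feats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / len(cell))
    return feats

def cosine_similarity(a, b):
    """Degree of similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def identify_form(image, registered, threshold=0.9):
    """Return (form type, score); the type is "unclear" when no registered
    form reaches the assumed similarity threshold."""
    feats = grid_features(image)
    best_name, best_score = None, 0.0
    for name, ref in registered.items():
        score = cosine_similarity(feats, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return "unclear", best_score
    return best_name, best_score

# Two registered forms with dark regions in different corners (toy data).
invoice = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
order = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
registered = {"invoice": grid_features(invoice), "order": grid_features(order)}

# A slightly degraded scan of the invoice, and a scan matching neither form.
scanned = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
noisy = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 1]]

result_scanned = identify_form(scanned, registered)
result_noisy = identify_form(noisy, registered)
```

In this toy setting the degraded scan is still matched to the registered invoice, while the second scan falls below the threshold and is reported as unclear. With real scans, poor image quality can shift the characteristic amount enough that a similar but different registered form scores highest, which is the misrecognition problem noted above.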
Among the conventional form type identifying methods, the method in which barcodes are used presupposes that the forms bear barcodes. Since time and effort are required to newly provide forms with embedded barcodes, this method has a problem in that it cannot necessarily satisfy user demands for converting existing paper forms into electronic form.
In addition, according to the conventional form type identifying methods, a computer receives from a scanner the image data obtained by scanning a form and then carries out the identifying process for the form type, so that there is a problem that the processing load on the computer is large.