Various techniques have been proposed for comparing (i) input image data obtained by reading a document image with a scanner and (ii) an image stored in advance, so as to determine a similarity between the input image data and the stored image.
Examples of methods for determining the similarity include: a method in which a keyword is extracted from a text image by OCR (Optical Character Reader) and matching is carried out using the keyword; and a method in which features of ruled lines included in an image are extracted and matching is carried out using the features.
Further, Patent Document 1 (Japanese Unexamined Patent Publication No. Tokukaihei 8-255236 (published on Oct. 1, 1996)) discloses a technique in which text, frames for text strings, frames, etc. are recognized from an input image and matching is performed for each frame on the basis of frame information, thereby performing format recognition of a ruled-line image or the like.
Further, Patent Document 2 (International Publication No. WO 2006/092957A1, pamphlet (published on Sep. 8, 2006)) discloses a technique for document matching that proceeds as follows: a centroid of a word in an English document, a centroid of a connected component of black pixels, a closed space of a kanji character, a specific portion repeatedly appearing in an image, etc. are extracted as feature points; a set of local feature points is determined out of the extracted feature points; a partial set of feature points is selected out of the determined set of local feature points; invariants relative to geometric transformation, each being a value characterizing the selected partial set, are calculated in accordance with plural combinations of feature points in the partial set; the calculated invariants are regarded as features; and document matching is performed in accordance with the features.
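The kind of processing summarized above can be illustrated with a minimal sketch. It is not the implementation of Patent Document 2. It assumes a binarized image given as a list of rows in which 1 denotes a black pixel, uses 4-connectivity to find connected components, and uses the ratio of two triangle areas formed by four feature points as one example of a value that is invariant under affine transformation. The names `connected_component_centroids`, `triangle_area`, `area_ratio_invariant`, `apply_affine` and `SAMPLE_GRID`, as well as the sample data, are hypothetical and chosen only for illustration.

```python
from collections import deque

def connected_component_centroids(grid):
    """Return the centroid (x, y) of each 4-connected component of black (1) pixels."""
    height = len(grid)
    width = len(grid[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for y0 in range(height):
        for x0 in range(width):
            if grid[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first search over one connected component.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            sum_x = sum_y = count = 0
            while queue:
                x, y = queue.popleft()
                sum_x += x
                sum_y += y
                count += 1
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and grid[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            centroids.append((sum_x / count, sum_y / count))
    return centroids

def triangle_area(p, q, r):
    """Unsigned area of the triangle p, q, r."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    return abs(cross) / 2.0

def area_ratio_invariant(a, b, c, d):
    """Ratio of two triangle areas; assumes a, b, c are not collinear.

    An affine map scales every area by the same factor, so the ratio
    is unchanged. This particular invariant is an illustrative choice.
    """
    return triangle_area(a, c, d) / triangle_area(a, b, c)

def apply_affine(point, matrix, offset):
    """Apply the affine map x -> matrix * x + offset to a 2-D point."""
    x, y = point
    (m00, m01), (m10, m11) = matrix
    return (m00 * x + m01 * y + offset[0], m10 * x + m11 * y + offset[1])

# Hypothetical 6x5 binary image containing four connected components.
SAMPLE_GRID = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
]

points = connected_component_centroids(SAMPLE_GRID)
# Enlarge, shear and shift the feature points (an arbitrary affine map).
moved = [apply_affine(p, ((2.0, 0.5), (-0.3, 1.5)), (10.0, -4.0)) for p in points]
print(area_ratio_invariant(*points))
print(area_ratio_invariant(*moved))
```

The two printed values agree up to floating-point rounding, because the ratio depends only on the relative arrangement of the four points. The sketch applies the transformation to the feature points themselves; it does not model the case where enlarging or reducing the image changes which feature points are extracted in the first place.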
However, the techniques of Patent Documents 1 and 2 have a problem in that, where the input image data has been subjected to a process such as enlargement or reduction, features cannot be extracted with high accuracy.
For example, in the technique of Patent Document 1, the results of recognizing text, frames for text strings, frames, etc. vary under the influence of enlargement, reduction, etc., and consequently format recognition cannot be performed with high accuracy.
Further, in the technique of Patent Document 2, the results of extracting a centroid of a word in an English document, a centroid of a connected component of black pixels, a closed space of a kanji character, a specific portion repeatedly appearing in an image, etc. vary under the influence of enlargement, reduction, etc., and consequently the accuracy of document matching drops.
In a case where a feature point is extracted from an image including handwritten text (e.g. an image of a document which was printed in a predetermined font and on which a handwritten note is written), the techniques of Patent Documents 1 and 2 are particularly likely to make an erroneous determination, both because the techniques have lower determination accuracy due to enlargement, reduction, etc. and because the shape of handwritten text differs greatly from the shape of a font stored in an image processing apparatus.