Various techniques have been proposed for comparing (i) input image data obtained by reading a document image with a scanner with (ii) a previously stored image, so as to determine a similarity between the input image data and the previously stored image.
Examples of methods for determining a similarity include: a method in which a text image is extracted and a keyword is extracted from the text image by OCR (Optical Character Reader), so that matching is carried out using the keyword; and a method in which features of a ruled line included in an image are extracted, so that matching is carried out using the features.
Further, Patent Document 1 (Japanese Unexamined Patent Publication No. Tokukaihei 8-255236 (published on Oct. 1, 1996)) discloses a technique in which texts, frames for text strings, frames etc. are recognized from an input image and matching is performed with respect to each frame based on frame information, thereby performing a format recognition of a ruled line image etc.
Further, Patent Document 2 (International Publication No. WO 2006/092957A1, pamphlet (published on Sep. 8, 2006)) discloses a technique in which: a centroid of a word in an English document, a centroid of a connected component of black pixels, a closed space of a kanji character, a specific portion repeatedly appearing in an image, etc. are extracted as feature points; a set of local feature points is determined out of the extracted feature points; a partial set of feature points is selected out of the determined set of local feature points; invariants relative to geometric transformation, each serving as a value characterizing the selected partial set, are calculated in accordance with plural combinations of feature points in the partial set; the calculated invariants are regarded as features; and document matching is performed in accordance with the features.
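For illustration only, the kind of processing described for Patent Document 2 may be sketched as follows. The sketch is not taken from Patent Document 2. It assumes a binarized image given as a list of rows in which 1 denotes a black pixel, uses 4-connectivity, and uses a ratio of two triangle areas as one possible invariant relative to affine transformation. The function names, the neighborhood size, and the choice of invariant are all assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def connected_component_centroids(image):
    """Return the (x, y) centroid of each 4-connected component of black pixels (value 1)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            sum_x = sum_y = count = 0
            while queue:
                y, x = queue.popleft()
                sum_x += x
                sum_y += y
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            centroids.append((sum_x / count, sum_y / count))
    return centroids


def triangle_area(a, b, c):
    """Area of the triangle with vertices a, b, c."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def area_ratio_invariant(p0, p1, p2, p3):
    """Ratio of two triangle areas; an affine transform scales both areas equally.

    Returns None when the denominator triangle is degenerate (collinear points).
    """
    denominator = triangle_area(p0, p1, p3)
    if denominator == 0:
        return None
    return triangle_area(p0, p1, p2) / denominator


def local_invariants(points, index, neighbors=5, subset_size=4):
    """Invariants for one feature point, from subsets of its nearest neighbors.

    The neighborhood size and subset size are hypothetical example values.
    """
    target = points[index]
    others = [p for i, p in enumerate(points) if i != index]
    # Local set: the feature points nearest to the target point.
    others.sort(key=lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2)
    local_set = others[:neighbors]
    values = []
    # Partial sets: every combination of subset_size points from the local set.
    for subset in combinations(local_set, subset_size):
        value = area_ratio_invariant(*subset)
        if value is not None:
            values.append(value)
    return values
```

In a scheme of this kind, the invariants calculated for each feature point of a stored document would be registered in advance, and the invariants calculated from an input image would be compared against them. A practical implementation would also need a consistent ordering of the points in each partial set and a quantization of the invariants, both of which are omitted here.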
However, the technique disclosed in Patent Document 1 requires a complex process and a long processing time since it is necessary to extract plural kinds of elements such as texts, frames for text strings, lines indicative of frames etc. from an input image and to perform matching with respect to each of the extracted elements.
Further, in a case where a centroid of a text is regarded as a feature point, as in the technique of Patent Document 2, a document containing only a small amount of text yields only a small number of extracted feature points, resulting in low accuracy in document matching.