1. Technical Field
The present invention relates to: an image processing method and an image processing apparatus for judging whether or not an input image is similar to a reference image stored in advance, on the basis of features obtained from the input image; an image forming apparatus and an image reading apparatus employing the image processing apparatus; and a memory product which records a computer program for realizing the image processing apparatus.
2. Description of Related Art
Image processing methods have been proposed for matching image data obtained by reading a document with a scanner against predetermined image data stored in advance, so as to judge the similarity of the images. One example is a method of extracting keywords from an image with an OCR (Optical Character Reader) and judging the similarity of images on the basis of the extracted keywords. In another method, the documents subject to similarity judgment are limited to forms containing ruled lines, and features of the ruled lines are extracted to judge the similarity of the images.
Also proposed is a matching apparatus that extracts features of an input document to generate a descriptor and matches the generated descriptor against descriptors stored in advance in a descriptor database, thereby matching the input document against the descriptors in the descriptor database (see Japanese Patent Application Laid-Open No. H7-282088).
In the device disclosed in Japanese Patent Application Laid-Open No. H7-282088, the descriptor database stores descriptors together with a list of the documents containing the features from which the respective descriptors are generated. The descriptors are generated so as to be invariant to distortion caused by digitization of a document, to differences between an input document and the matching document in a document database, and the like. When the descriptor database is scanned, the device accumulates votes for the respective documents in the document database and determines that the document obtaining the largest number of votes, or a document whose number of votes exceeds a threshold, is the matching document.
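The vote accumulation described above can be illustrated with a minimal sketch. It is not the implementation of the cited device: the function names, the dictionary layout of the descriptor database (descriptor to list of document identifiers), and the strict "greater than" comparison against the threshold are all assumptions made for illustration.

```python
from collections import Counter

def tally_votes(input_descriptors, descriptor_db):
    """Accumulate one vote per (input descriptor, listed document) pair.

    descriptor_db is a hypothetical mapping: descriptor -> list of document ids.
    """
    votes = Counter()
    for descriptor in input_descriptors:
        for doc_id in descriptor_db.get(descriptor, ()):
            votes[doc_id] += 1
    return votes

def best_match(votes):
    """Return the document with the largest number of votes, or None."""
    return votes.most_common(1)[0][0] if votes else None

def matches_above(votes, threshold):
    """Return the documents whose vote count exceeds the threshold."""
    return sorted(doc for doc, n in votes.items() if n > threshold)

# Toy database (assumed data): three descriptors, three documents.
descriptor_db = {"d1": ["doc1", "doc2"], "d2": ["doc1"], "d3": ["doc3"]}
votes = tally_votes(["d1", "d2", "dX"], descriptor_db)
```

Here `best_match(votes)` selects the single document with the most votes, while `matches_above(votes, threshold)` corresponds to the alternative criterion of accepting any document whose votes exceed a threshold.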
Also proposed is a device for use with an image taken by a digital camera, an image read by a scanner, or the like (see International Publication Pamphlet No. 2006/92957). Assuming that each connected part of an image is a word component, the device obtains the centroid of the connected part as a feature point, calculates a geometric invariant using the feature points, and further obtains features from the geometric invariant. The features, an index representing the feature point, and an index representing the image are stored in a hash table. At retrieval, feature points, features, and indices representing the feature points are obtained from an input image (query) by a similar process, and votes are cast for images stored in the hash table in advance so as to carry out retrieval.
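The register-then-vote flow of such a hash table can be sketched as follows. The sketch assumes the feature points (centroids) have already been extracted, and it substitutes a deliberately simple invariant, the quantised ratio of the distances from a point to its two nearest neighbours, which is unchanged by translation, rotation, and uniform scaling. The invariant used in the cited publication is not reproduced here; `LEVELS`, `feature_of`, `register`, and `retrieve` are hypothetical names.

```python
import math
from collections import Counter, defaultdict

LEVELS = 10  # number of quantisation bins (assumed value)

def feature_of(points, i):
    """Quantised ratio of the distances from point i to its two nearest
    neighbours; requires at least three points."""
    px, py = points[i]
    dists = sorted(math.hypot(px - qx, py - qy)
                   for j, (qx, qy) in enumerate(points) if j != i)
    d1, d2 = dists[0], dists[1]
    ratio = d1 / d2 if d2 > 0 else 0.0  # lies in [0, 1]
    return min(int(ratio * LEVELS), LEVELS - 1)

def register(table, doc_id, points):
    """Store (image index, feature point index) under each feature."""
    for i in range(len(points)):
        table[feature_of(points, i)].append((doc_id, i))

def retrieve(table, query_points):
    """Vote for every stored image that shares a feature with the query."""
    votes = Counter()
    for i in range(len(query_points)):
        for doc_id, _ in table.get(feature_of(query_points, i), ()):
            votes[doc_id] += 1
    return votes.most_common(1)[0][0] if votes else None

table = defaultdict(list)
register(table, "A", [(0, 0), (5, 0), (0, 20), (30, 20)])
register(table, "B", [(0, 0), (10, 0), (0, 18)])

# Query: the points of image "A" scaled by 2 and shifted by (7, 3).
query = [(7, 3), (17, 3), (7, 43), (67, 43)]
best = retrieve(table, query)
```

A single quantised ratio is a weak feature and collides easily between unrelated images; it is used here only to keep the sketch short.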
International Publication Pamphlet No. 2006/92957 describes extracting feature points from a wide range for use in calculating a single feature, increasing the number of feature points, and the like, in order to improve the accuracy of the features. Moreover, according to that publication, incorrect voting can be prevented, and a reduction in the accuracy of image retrieval restrained, by recording the correspondence between feature points in an input image and points in a reference document and, when voting for an image stored in advance, not voting for a point that has already been associated. The publication also describes that the accuracy of image retrieval decreases as more pages of images are stored in the hash table, and presumes that the reason is an increased chance that a different document having the same features is stored.
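The idea of suppressing votes for already-associated points can be sketched as below. The exact bookkeeping of the cited publication is not given in this description, so the rules used here are assumptions: a reference point receives at most one vote, and a query point votes at most once per stored image. The function name and the table layout (feature to list of (image index, point index) pairs) are likewise hypothetical.

```python
from collections import Counter

def vote_without_duplicates(query_features, table):
    """Vote for stored images while skipping reference points that
    have already been associated with an earlier query point."""
    votes = Counter()
    associated = {}  # (doc_id, ref_point_index) -> query point index
    for q_idx, feature in enumerate(query_features):
        voted_docs = set()  # images this query point has already voted for
        for doc_id, ref_idx in table.get(feature, ()):
            key = (doc_id, ref_idx)
            if key in associated or doc_id in voted_docs:
                continue  # skip: would be a duplicate vote
            associated[key] = q_idx
            voted_docs.add(doc_id)
            votes[doc_id] += 1
    return votes, associated

# Toy hash table (assumed data): feature value -> stored (image, point) pairs.
table = {
    2: [("A", 0), ("A", 1)],
    9: [("A", 2), ("A", 3)],
    5: [("B", 0)],
}
votes, associated = vote_without_duplicates([2, 2, 9, 9, 5], table)
```

With this toy table, unrestricted voting would give image "A" eight votes from four query points; with the association check each of its four reference points is counted once.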