There have been proposed image matching techniques for comparing (i) image data obtained by reading a document by use of a scanner or the like with (ii) image data of a preliminarily stored reference document so as to determine a similarity between the image data and the preliminarily stored image data.
Examples of the method for determining a similarity include: a method in which a keyword is extracted from an image with OCR (Optical Character Reader) so as to carry out matching with the keyword; a method in which only an image containing ruled lines is treated as a target image, and matching is carried out with features of the ruled lines (see Patent Document 1); and a method in which a similarity is determined based on color distributions of an input image and a stored image (see Patent Document 2).
Patent Document 3 discloses a technique in which a descriptor is formed from features of an input document, and matching between the input document and a document stored in a document database is carried out by use of the descriptor and a descriptor database. The descriptor database stores descriptors and indicates, for each descriptor, a list of documents including the features from which the descriptor is formed. A descriptor is selected such that it is invariant to distortions caused by digitization of a document and to differences between an input document and the corresponding document used for matching in the document database.
In this technique, when the descriptor database is scanned, votes for each document in the document database are accumulated, and a document having the maximum number of votes obtained or a document whose number of votes exceeds a threshold value is used as a matched document.
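The voting step described above can be illustrated with a minimal sketch. It is not the implementation of Patent Document 3; the function name `match_by_voting`, the dictionary layout of `descriptor_db`, and the sample data are all assumptions made for illustration.

```python
from collections import Counter

def match_by_voting(query_descriptors, descriptor_db, threshold=None):
    """Return the matched document IDs for a set of query descriptors.

    descriptor_db (hypothetical layout) maps each descriptor to the list of
    document IDs that contain the features from which it was formed.
    """
    votes = Counter()
    for descriptor in query_descriptors:
        # Each document listed for this descriptor receives one vote.
        for doc_id in descriptor_db.get(descriptor, ()):
            votes[doc_id] += 1
    if not votes:
        return []
    if threshold is None:
        # No threshold given: take the document(s) with the maximum votes.
        best = max(votes.values())
        return sorted(doc for doc, n in votes.items() if n == best)
    # Threshold given: take every document whose votes exceed it.
    return sorted(doc for doc, n in votes.items() if n > threshold)

# Illustrative data only.
descriptor_db = {
    "d1": ["A", "B"],
    "d2": ["A"],
    "d3": ["B", "C"],
    "d4": ["A"],
}
query = ["d1", "d2", "d4", "d9"]  # "d9" is not in the database
print(match_by_voting(query, descriptor_db))               # ['A']
print(match_by_voting(query, descriptor_db, threshold=0))  # ['A', 'B']
```

Both selection rules mentioned in the text are covered: the maximum-vote rule when no threshold is given, and the threshold rule otherwise.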
Further, Patent Document 4 discloses a technique in which (i) a plurality of feature points are extracted from a digital image, (ii) a set of local feature points is determined out of the extracted feature points, (iii) a partial set of feature points is selected out of the determined set of local feature points, (iv) invariants relative to geometric transformation, each being a value characterizing the selected partial set, are calculated in accordance with plural combinations of feature points in the partial set, (v) features are calculated from combinations of the calculated invariants, and (vi) a document and an image corresponding to the digital image data are searched for by voting for documents and images in a database that have the calculated features.
Conventionally, an image data output processing apparatus (e.g., a copying machine, a facsimile device, a scanning device, or a multi-function printer) carries out, with respect to input image data (image data of a target document to be matched), an output process such as a copying process, a transmitting process, an editing process, or a filing process. In such an apparatus, when it is determined by use of such image matching techniques that an input image of a target document is similar to an image of a reference document, the output process is controlled.
For example, there have been known anti-counterfeit techniques for a color image forming apparatus with respect to paper currency or valuable stock certificates, in which it is determined whether or not input image data is identical with an image of a paper currency or a valuable stock certificate in accordance with a pattern detected from the input image data, and when it is determined that the input image data is identical with a reference image, (i) a specified pattern is added to an output image so that an image forming apparatus that has made a copy of the image data can be specified from the output image, (ii) a copied image is blacked out, or (iii) a copying operation is prohibited with respect to the input image data.
Patent Document 1: Japanese Unexamined Patent Publication, Tokukaihei, No. 8-255236 (published on Oct. 1, 1996)
Patent Document 2: Japanese Unexamined Patent Publication, Tokukaihei, No. 5-110815 (published on Apr. 30, 1993)
Patent Document 3: Japanese Unexamined Patent Publication, Tokukaihei, No. 7-282088 (published on Oct. 27, 1995)
Patent Document 4: International Publication No. WO 2006/092957, pamphlet (published on Sep. 8, 2006)
However, such a conventional image matching apparatus has a problem in which, in a case where a target document is an N-up document or a reduced-size document, it is difficult to precisely determine a similarity to a reference document. The following describes the problem in detail.
In a case where matching is carried out for an input image by use of features of the images so that a similarity between the input image and a reference image is determined, the determination is most precise when the two images have the same resolution. This is because, when the resolutions of the images are the same, it is possible to extract the features used for the determination from both images under the same condition.
However, in a case of an N-up document on which a plurality of document images are laid out, a size of each document image laid out on the N-up document is reduced from its original size. For this reason, even if the document images are read out at the same resolution as that of the reference documents, their substantial resolution differs.
For example, suppose that image data read out from an A4-size document at 600 dpi is compared with image data read out similarly at 600 dpi from an A4-size N-up document on which two images respectively read out from A4-size documents are reduced in size and laid out.
In such an N-up document, i.e., in a case where two A4-size documents are reduced in size and laid out on one A4-size document, the two document images laid out on the 2-up document are reduced in size to about 0.7 times their original sizes. When the 2-up document is read out at the same resolution as that of the original images that are not reduced in size, namely, at 600 dpi in this case, a substantial resolution of each document image is around 420 dpi.
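The arithmetic behind these figures can be checked with a short sketch. It assumes ISO A-series paper, on which each halving of the sheet scales linear dimensions by 1/√2, so that an N-up layout (N a power of two) on a sheet of the same size reduces each page by a factor of 1/√N. The function name `substantial_resolution` is hypothetical.

```python
import math

def substantial_resolution(scan_dpi, n_up):
    """Return (linear scale factor, substantial dpi) for an N-up document.

    Assumption: A-series paper and n_up a power of two, so each page is
    reduced by 1/sqrt(n_up) to fit on a sheet of the original size.
    """
    scale = 1.0 / math.sqrt(n_up)
    return scale, scan_dpi * scale

scale, dpi = substantial_resolution(600, 2)
print(f"{scale:.3f} {dpi:.0f}")  # 0.707 424
```

Under this assumption the 2-up case gives a scale of about 0.707 and a substantial resolution of about 424 dpi, which agrees with the rounded values of 0.7 and 420 dpi stated above.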
If resolutions are different, the following problem, for example, arises. Two components adjacent to each other in an image are recognized as two connected regions at a high resolution, whereas the same components are recognized as one connected region at a low resolution, so that the features extracted from the images differ.
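The effect can be reproduced on a toy binary image. The sketch below is illustrative only: the 4-connected labeling, the block-wise reduction (an output pixel is set if any pixel in its 2x2 block is set), and the sample image are assumptions standing in for an actual scanning and binarization pipeline.

```python
from collections import deque

def count_regions(img):
    """Count 4-connected regions of 1-pixels in a binary image (list of lists)."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if img[r][c] and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return regions

def downsample(img, factor=2):
    """Lower the resolution: an output pixel is 1 if any pixel in its block is 1."""
    rows, cols = len(img) // factor, len(img[0]) // factor
    return [[int(any(img[r * factor + dy][c * factor + dx]
                     for dy in range(factor) for dx in range(factor)))
             for c in range(cols)]
            for r in range(rows)]

# Two 2x2 components separated by a one-pixel gap (column 3).
high = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
low = downsample(high)

print(count_regions(high))  # 2
print(count_regions(low))   # 1
```

At the higher resolution the one-pixel gap keeps the components apart; after reduction the gap disappears and a single region remains, so any feature derived from connected regions changes.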
On the other hand, some conventional image matching apparatuses are arranged such that a resolution of image data is converted into a uniform default resolution before features are extracted. This is intended to reduce the processing required for extraction of features by lowering the resolution.
However, these apparatuses do not take into account cases where an image of a target document is reduced in size from its original size, and are arranged so that the resolution of any input image data is converted into the default resolution. On this account, it is difficult to compensate for a difference in the substantial resolution of such a reduced-size image, and therefore difficult to solve the problem of a decrease in determination accuracy.