There have been proposed techniques for comparing (i) image data obtained by scanning a document image with a scanner with (ii) preliminarily stored image data so as to determine a similarity between the image data and the preliminarily stored image data.
Examples of methods for determining a similarity include: a method in which a keyword is extracted from an image with OCR (Optical Character Reader) etc. and matching is carried out using the keyword; a method in which matching is carried out by extracting features of ruled lines included in an image; and a method in which text strings etc. included in image data are replaced with points (feature points) and matching is carried out using features calculated from the positional relationships between these points.
Further, Patent literature 1 discloses a technique as follows: a descriptor is generated from features of an input document in such a manner as to be invariant to distortion caused by digitization of the document and to differences between the input document and the corresponding document in a document database. Then, matching between the input document and a document in the document database is performed using the descriptor and a descriptor database, which stores each descriptor together with a list of the documents containing the features from which that descriptor was extracted. In the technique, when the descriptor database is scanned, votes for individual documents in the document database are accumulated, and the document with the largest number of votes, or a document whose number of votes exceeds a certain threshold value, is regarded as a matching document.
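The voting scheme described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of Patent literature 1: descriptors are assumed to be hashable values (plain integers here), and the names `build_descriptor_db` and `match_by_voting` are hypothetical.

```python
from collections import Counter, defaultdict


def build_descriptor_db(documents):
    """Map each descriptor to the set of document IDs that contain it.

    `documents` is a dict {doc_id: iterable of descriptors} (assumed layout).
    """
    db = defaultdict(set)
    for doc_id, descriptors in documents.items():
        for descriptor in descriptors:
            db[descriptor].add(doc_id)
    return db


def match_by_voting(query_descriptors, db, threshold=None):
    """Accumulate one vote per (query descriptor, listed document) pair.

    With threshold=None, return the document(s) with the most votes;
    otherwise return every document whose vote count exceeds the threshold.
    """
    votes = Counter()
    for descriptor in query_descriptors:
        for doc_id in db.get(descriptor, ()):
            votes[doc_id] += 1
    if not votes:
        return [], votes
    if threshold is None:
        best = max(votes.values())
        return sorted(d for d, v in votes.items() if v == best), votes
    return sorted(d for d, v in votes.items() if v > threshold), votes


# Hypothetical data: two registered documents and one input document.
db = build_descriptor_db({"A": [1, 2, 3, 4], "B": [3, 4, 5]})
matches, votes = match_by_voting([1, 2, 3], db)
print(matches, dict(votes))
```

In this sketch, the input document shares three descriptors with document A and one with document B, so A receives the most votes.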
Further, Patent literature 2 discloses a technique as follows: a centroid of a word in an English document, a centroid of a connected component of black pixels, a closed space of a kanji character, a particular portion repeatedly appearing in an image, etc. are extracted as feature points, and a set of local feature points is determined out of the extracted feature points. A partial set of feature points is then selected out of the determined set of local feature points, and invariants relative to geometric transformation, each serving as a value characterizing the selected partial set, are calculated in accordance with plural combinations of feature points in the partial set. The calculated invariants are regarded as features, and document matching is carried out based on the features thus obtained.
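To illustrate what an invariant relative to geometric transformation means, the following sketch computes a ratio of triangle areas over four feature points, a quantity known to be unchanged by affine transformations. The choice of this particular invariant, the point coordinates, and the function names are assumptions made for illustration; the passage above does not specify which invariant Patent literature 2 uses.

```python
def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r (each an (x, y) tuple)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def affine_invariant(a, b, c, d):
    """Ratio of two triangle areas; unchanged by any invertible affine map,
    because such a map scales every area by the same factor."""
    return triangle_area(a, c, d) / triangle_area(a, b, c)


def apply_affine(p, m, t):
    """Apply the affine map x -> m @ x + t to point p (m is 2x2, t is a 2-vector)."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + t[0],
            m[1][0] * x + m[1][1] * y + t[1])


# Hypothetical feature points (e.g. centroids of connected components).
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0)]

# An arbitrary invertible affine map standing in for scaling, rotation, or skew.
m = [[2.0, 0.5], [-0.3, 1.5]]
t = (10.0, -7.0)
moved = [apply_affine(p, m, t) for p in points]

print(affine_invariant(*points), affine_invariant(*moved))
```

The two printed values agree up to floating-point rounding, which is why such a value can serve as a feature that survives geometric distortion of the feature-point positions.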
However, the above conventional techniques have a problem that when resolution of a matching key image is different from resolution of a matching reference image, it is impossible to accurately determine a similarity between the matching key image and the matching reference image.
The following explains the above problem in more detail. In the present specification, a “matching key image” indicates an image serving as a key in a matching process with respect to preliminarily stored images, and a “matching reference image” indicates an image to be retrieved when carrying out the matching process. For example, in a case of determining whether preliminarily stored images include an image similar to a document image included in image data A, the document image included in the image data A is a matching key image and an image belonging to the preliminarily stored images is a matching reference image. Further, in the present specification, a “reference image” indicates any image stored in an image processing apparatus or an image processing system. Therefore, in a case where a matching key image is an image stored in the image processing apparatus or the image processing system, the matching key image is also regarded as a “reference image”.
In a case of determining a similarity between images based on features of the images, matching the images with the same resolution realizes the highest accuracy in the determination. This is because the same resolution between two images allows features serving as a standard for the determination to be extracted from the images under the same condition.
However, in a case where a matching key image is obtained by scanning an enlarged/reduced document or an N-up document or obtained from a FAX job, the size of a document image in the matching key image is different from the size of an original image of the document image. In particular, in an image job log system etc. in which read documents are serially stored, an image obtained by scanning an enlarged/reduced document or an N-up document or obtained from a FAX job is generally accumulated with the resolution at the time of obtaining the image. As a result, the size of a document image included in such an image often differs from the size of an original image of the document image (the size at the time of processing the document image in a default state).
Consequently, in a case where features of the matching reference image are calculated based on the size of the original image of the document image (the size at the time of processing the document image in a default state), even when the matching key image is read with the same resolution as that used when obtaining the matching reference image (default resolution), the substantial resolution of the matching key image differs from that of the matching reference image.
In a case where the substantial resolution of the matching reference image differs from that of the matching key image, for example, a plurality of pixel regions that are in fact not connected with each other are recognized as separate connected pixel regions in an image of high resolution, whereas they are recognized as a single connected pixel region in an image of low resolution. As a result, the features extracted from the two images differ, which lowers the accuracy of the similarity determination and may change the result of the determination.
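The merging of connected pixel regions at low resolution can be reproduced with a small sketch. It assumes a binary image stored as a list of rows (1 = black pixel), 4-connectivity, and a block-wise reduction in which an output pixel is black if any pixel in its block is black; this reduction is only a stand-in for a lower-resolution scan, and the function names are hypothetical.

```python
def count_components(img):
    """Count 4-connected regions of 1-pixels using an iterative flood fill."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


def downsample(img, factor):
    """Reduce resolution: an output pixel is 1 if any pixel in its block is 1."""
    h, w = len(img), len(img[0])
    return [
        [
            int(any(img[y][x]
                    for y in range(by, min(by + factor, h))
                    for x in range(bx, min(bx + factor, w))))
            for bx in range(0, w, factor)
        ]
        for by in range(0, h, factor)
    ]


# Two blobs separated by a one-pixel gap at the higher resolution.
high = [
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
]
low = downsample(high, 2)

print(count_components(high), count_components(low))
```

At the higher resolution the one-pixel gap keeps the two blobs apart, while after reduction the gap disappears and a single region remains, so any feature derived from the regions (for example, their centroids) changes.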
For example, the following explains a case of comparing image data (matching reference images) obtained by scanning A4 documents A and B at 600 dpi with image data (a matching key image) obtained by scanning, at 600 dpi, an A4 two-up document including images of the A4 documents A and B.
In a case of an A4 two-up document including images of two A4 documents, the two images are reduced to approximately 0.7 times their original sizes (A4 size). Therefore, the substantial resolution of the individual document images (documents A and B) included in the two-up document is approximately 420 dpi.
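The arithmetic behind the figure of approximately 420 dpi can be checked with a short sketch. It assumes that the substantial resolution is the scan resolution multiplied by the linear reduction ratio, and that a two-up layout reduces each A4 page to A5, i.e. by a linear factor of 1/√2; the function name `substantial_dpi` is hypothetical.

```python
import math


def substantial_dpi(scan_dpi, linear_scale):
    """Resolution at which the original page is effectively sampled after
    being reduced (or enlarged) by `linear_scale` and scanned at `scan_dpi`."""
    return scan_dpi * linear_scale


# Rounded ratio used in the text above.
print(substantial_dpi(600, 0.7))

# Ratio for an A4 page reduced to A5 (linear factor 1/sqrt(2), about 0.707).
print(round(substantial_dpi(600, 1 / math.sqrt(2)), 1))
```

With the rounded ratio of 0.7 the result is about 420 dpi; with the unrounded ratio it is about 424 dpi. Either way it is well below the 600 dpi at which the matching reference images were obtained.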
Therefore, features obtained by scanning each of the A4 documents A and B at 600 dpi are compared with features obtained by scanning the two-up document at 600 dpi (a combination of features equivalent to those obtained by scanning the documents A and B at approximately 420 dpi). In this case, the features extracted for the same document differ between the two scans, making it difficult to accurately determine similarities between the matching key image (the two-up image of the documents A and B) and the matching reference images (the images of the documents A and B at their original sizes). This causes erroneous determinations, such as a determination that the matching key image is similar to neither the document A nor the document B, or a determination that the matching key image is similar to only one of the documents A and B.
Citation List
Patent Literature 1
Japanese Patent Application Publication, Tokukaihei, No. 7-282088 A (Publication Date: Oct. 27, 1995)
Patent Literature 2
International Publication No. WO 2006/092957A1, pamphlet (Publication Date: Sep. 8, 2006)
Non-Patent Literature 1
Tomohiro NAKAI, Koichi KISE, and Masakazu IWAMURA: “Document Image Retrieval and Removal of Perspective Distortion Based on Voting for Cross-Ratios”, Meeting on Image Recognition and Understanding (MIRU2005) (held by Computer Vision and Image Media Workshop, Information Processing Society of Japan), proceedings, pp. 538-545