The present invention relates to an image processing method and an image processing apparatus for judging whether an obtained document image is similar to a reference image stored in advance, and to an image reading apparatus and an image forming apparatus employing this image processing apparatus.
A technique is known in which a document consisting of a plurality of pages is partitioned at desired pages and thereby classified, and the page images of each classified document are then filed separately. In one exemplary method, partition sheets bearing an identification mark are inserted in advance at the breaks of the document. The document is then read through an image reading apparatus such as a scanner, and the document is partitioned whenever the identification mark recorded on a partition sheet is detected among the obtained page images. In another exemplary method, the number of pages at which the document is to be partitioned is specified in advance, before the document is read by an image reading apparatus. The document is then partitioned each time the specified number of pages has been read.
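The two partitioning methods described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the cited techniques: each page image is represented by a plain string, detection of the identification mark is reduced to an equality test against a hypothetical `marker` value, the partition sheet itself is discarded, and any pages left over after the specified counts form a final group. The function names are likewise hypothetical.

```python
from typing import List, Sequence


def partition_by_marker(pages: Sequence[str],
                        marker: str = "<<PARTITION>>") -> List[List[str]]:
    """Split pages wherever a partition sheet is detected.

    Assumption: a page equal to `marker` stands in for a partition sheet
    whose identification mark has been detected; the sheet is discarded.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if page == marker:
            if current:              # close the group collected so far
                groups.append(current)
            current = []
        else:
            current.append(page)
    if current:                      # pages after the last partition sheet
        groups.append(current)
    return groups


def partition_by_counts(pages: Sequence[str],
                        counts: Sequence[int]) -> List[List[str]]:
    """Split pages according to page counts specified in advance.

    Assumption: pages remaining after the last count form one final group.
    """
    groups: List[List[str]] = []
    start = 0
    for count in counts:
        chunk = list(pages[start:start + count])
        if chunk:                    # skip counts that run past the last page
            groups.append(chunk)
        start += count
    if start < len(pages):
        groups.append(list(pages[start:]))
    return groups
```

Both functions return one list of pages per partitioned document. Both approaches also require preparation before reading: either partition sheets must be inserted or page counts must be specified in advance.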
Further, in order to complete filing processing for page images obtained by reading a document in a short time, an image filing apparatus has been proposed that operates as follows. Page images of a plurality of sub-documents are read successively and stored. Index information for referring to each of the page images is then generated. The index information for each page is then stored in a manner partitioned for each sub-document, on the basis of the number of pages specified for one sub-document. As a result, filing processing need not be performed each time the page images for one sub-document have been read; instead, filing processing is performed document by document on the page images of the plurality of sub-documents (see Japanese Patent Application Laid-Open No. H8-7071).
On the other hand, the following methods are known as techniques for matching a page image obtained by reading a document against a predetermined image stored in advance and thereby judging the similarity of the image. In one method, keywords in the page image are extracted by an OCR (Optical Character Reader), and the similarity of the image is then judged on the basis of the extracted keywords. In another method, the documents subject to similarity judgment are limited to sheet forms containing ruled lines, and features of the ruled lines are extracted from the page image so that the similarity of the image is judged.
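The keyword-based method above can be sketched as follows. This is only an illustration under stated assumptions: the OCR step is taken to have already produced plain text, keyword extraction is reduced to lowercasing and removing a small hypothetical stopword list, and the Jaccard coefficient is chosen here as the similarity measure because the text does not specify one. The names `extract_keywords` and `keyword_similarity` are hypothetical.

```python
import re
from typing import Set

# Assumption: a tiny illustrative stopword list, not part of any cited method.
STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is", "for"})


def extract_keywords(ocr_text: str) -> Set[str]:
    """Return the set of keywords found in text already produced by OCR."""
    words = re.findall(r"[a-z0-9]+", ocr_text.lower())
    return {w for w in words if w not in STOPWORDS and len(w) > 2}


def keyword_similarity(page_text: str, reference_text: str) -> float:
    """Jaccard similarity of the two keyword sets, in the range 0.0 to 1.0.

    Assumption: Jaccard is one possible measure; the source names none.
    """
    page_kw = extract_keywords(page_text)
    ref_kw = extract_keywords(reference_text)
    if not page_kw and not ref_kw:
        return 0.0                   # nothing to compare
    return len(page_kw & ref_kw) / len(page_kw | ref_kw)
```

A page would then be judged similar to a stored image when the score exceeds a chosen threshold. Note that the result depends entirely on the accuracy of the preceding OCR step.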
Further, a matching apparatus has been proposed in which features of an input document are extracted so that a descriptor is generated. The descriptor is then matched against descriptors stored in advance in a descriptor database, so that a document conforming entirely or partially to a descriptor stored in the descriptor database is searched for among the input documents (see Japanese Patent Application Laid-Open No. H7-282088).
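The descriptor matching described above can be sketched as follows. This is not the method of the cited publication: a descriptor is assumed here to be a set of integer feature codes, conformity is assumed to be the fraction of a stored descriptor's features that also appear in the input descriptor, and `partial_threshold` is a hypothetical parameter separating partial conformity from no conformity.

```python
from typing import Dict, FrozenSet, List, Tuple

Descriptor = FrozenSet[int]          # assumption: a set of feature codes


def match_descriptor(query: Descriptor,
                     database: Dict[str, Descriptor],
                     partial_threshold: float = 0.5
                     ) -> List[Tuple[str, str, float]]:
    """Return (document id, kind of conformity, score), best match first."""
    results: List[Tuple[str, str, float]] = []
    for doc_id, stored in database.items():
        if not stored:
            continue                 # an empty descriptor cannot be matched
        score = len(query & stored) / len(stored)
        if score == 1.0:
            results.append((doc_id, "entire", score))
        elif score >= partial_threshold:
            results.append((doc_id, "partial", score))
    results.sort(key=lambda r: r[2], reverse=True)
    return results


# Usage with made-up feature codes:
database = {
    "form_a": frozenset({1, 2, 3, 4}),
    "form_b": frozenset({1, 2, 9, 10}),
    "form_c": frozenset({7, 8}),
}
matches = match_descriptor(frozenset({1, 2, 3, 4, 5}), database)
```

In this usage, `form_a` conforms entirely, `form_b` conforms partially at the threshold, and `form_c` is not returned.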