Image search techniques have conventionally been proposed for searching a DB (database) at high speed for a document image that is identical or similar to a given document image. For example, Japanese Patent Laid-Open No. 2001-319231 (patent reference 1) speeds up search processing by extracting a plurality of partial regions from a document image, using the number of extracted partial regions as a narrowing-down condition, and calculating similarity only against document images that have the same number of regions, using the feature amounts of those images.
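The narrowing-down step described above can be illustrated with a minimal sketch. It is not the implementation of patent reference 1: the names `build_index`, `similarity`, and `search`, the use of a single scalar feature amount per partial region, and the distance-based similarity measure are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_index(db):
    """Group document ids by their number of partial regions."""
    index = defaultdict(list)
    for doc_id, regions in db.items():
        index[len(regions)].append(doc_id)
    return index


def similarity(a, b):
    """Toy similarity (assumption): regions are paired in order and
    compared by absolute difference of their scalar feature amounts."""
    dist = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 / (1.0 + dist)


def search(query, db, index):
    """Score only the documents whose region count equals the query's."""
    candidates = index.get(len(query), [])
    scored = [(similarity(query, db[doc_id]), doc_id) for doc_id in candidates]
    return sorted(scored, reverse=True)


# Hypothetical database: document id -> one feature amount per region.
db = {
    "a": [0.1, 0.5],
    "b": [0.1, 0.6],
    "c": [0.2, 0.3, 0.9],
}
index = build_index(db)
results = search([0.1, 0.5], db, index)
ranked_ids = [doc_id for _, doc_id in results]
```

Here document "c" is never scored because it has three regions while the query has two. Only the region count limits the candidate set, so every document with the same count must still be compared in full.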
However, with the search method of this prior art, only the number of partial regions is used as the narrowing-down condition, so candidates often cannot be narrowed down sufficiently. Hence, in order to narrow down to an appropriate number of candidates and to further speed up the search processing, it is desirable that the narrowing-down condition include not only the number of partial regions but also the feature amounts of the partial regions, and the like.
However, if all the feature amounts of the partial regions are held in memory while candidates are narrowed down, system cost undesirably increases. If, to avoid this problem, all the feature amounts of each document image are instead saved on an HDD (hard disk drive), file accesses take a long time, resulting in a long image search time. Likewise, even when all the feature amounts are managed by a DB and those of a desired document image are referred to directly from the DB, database transactions take a long time, and further processing time is required to read the data from the HDD, which makes a practical implementation difficult. It is therefore demanded to use optimal feature amounts so as to narrow down to an appropriate number of candidates, and to implement such narrowing-down processing at high speed and at low cost.
Furthermore, when image search is conducted as in patent reference 1, feature amounts are compared only with those of document images having the same number of regions. Because partial regions may be over- or under-extracted from a document image, an adequate search often cannot be performed. For this reason, the feature amounts and the like used in the narrowing-down condition must have flexibility.
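The effect of over-extraction on an exact region-count condition can be seen in a small sketch. The function `candidates_by_count`, its `tolerance` parameter, and the sample index are hypothetical. A count tolerance is only one possible reading of "flexibility" and is not stated in the text above.

```python
def candidates_by_count(index, n_regions, tolerance=0):
    """Return ids of documents whose region count lies within
    `tolerance` of n_regions (tolerance=0 means an exact match)."""
    ids = []
    low = max(0, n_regions - tolerance)
    for count in range(low, n_regions + tolerance + 1):
        ids.extend(index.get(count, []))
    return ids


# Hypothetical index: region count -> document ids.
index = {3: ["invoice"], 4: ["memo"], 7: ["report"]}

# The stored "invoice" has 3 regions, but suppose extraction on the
# query scan split one region in two, so the query shows 4 regions.
strict = candidates_by_count(index, 4)
relaxed = candidates_by_count(index, 4, tolerance=1)
```

With the exact condition the correct document "invoice" is dropped before any feature amounts are compared. Relaxing the condition by one region keeps it among the candidates, at the price of a larger candidate set.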