Conventionally, a large number of search devices and search systems have been proposed for searching for predetermined image data in a storage device in which image data are stored. For example, each of Patent Documents 1 to 3 listed below discloses a technique in which, when a keyword to be used for a search process has been entered, an electronic document related to the keyword is retrieved from among electronic documents (image data) stored in a storage device.
Another conventionally known search technique, different from the keyword-based technique, is a “similarity search” (one example of which is described in Paragraphs [0003] and [0004] of Patent Document 4 listed below). According to the similarity search, in cases where an electronic document A to be used as the basis of a search process has been entered or designated, an electronic document B judged to share a common attribute with (that is, to be similar to) the electronic document A is retrieved from a storage device.
The similarity search is described in more detail below. In a database system to which the similarity search is applied, each electronic document is associated with keywords indicative of the attributes of that electronic document. Then, in cases where an electronic document A to be used as the basis of a search process has been entered or designated, an electronic document B associated with a keyword identical to a keyword associated with the electronic document A is retrieved from a storage device.
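The keyword-matching retrieval described above can be illustrated by the following minimal sketch. It is not taken from any of the cited Patent Documents. The function names `build_keyword_index` and `find_similar`, the document identifiers, and the sample keywords are all hypothetical, and the sketch assumes that a single shared keyword is sufficient for two electronic documents to be judged similar.

```python
from collections import defaultdict


def build_keyword_index(documents):
    """Map each keyword to the set of document IDs associated with it.

    `documents` is assumed to be a dict of {doc_id: set_of_keywords}.
    """
    index = defaultdict(set)
    for doc_id, keywords in documents.items():
        for keyword in keywords:
            index[keyword].add(doc_id)
    return index


def find_similar(doc_id, documents, index):
    """Return IDs of documents sharing at least one keyword with `doc_id`."""
    similar = set()
    for keyword in documents[doc_id]:
        # .get avoids creating empty entries in the defaultdict.
        similar |= index.get(keyword, set())
    similar.discard(doc_id)  # The query document is not its own result.
    return sorted(similar)


# Hypothetical sample data: document A is the entered or designated document.
documents = {
    "A": {"ski", "tour", "guide"},
    "B": {"ski", "resort"},
    "C": {"budget", "report"},
}
index = build_keyword_index(documents)
similar_to_a = find_similar("A", documents, index)
```

In this sample, document B shares the keyword “ski” with document A and is therefore retrieved, whereas document C shares no keyword with document A and is not.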
In the database system to which the similarity search is applied, there are various ways of selecting a keyword to be associated with an electronic document. For example, there is a technique in which a group of relatively large font characters contained in an electronic document, a group of special font characters contained in an electronic document, or a group of characters contained in the first item and uppermost line of an electronic document is selected as a keyword to be associated with the electronic document. Alternatively, there is a technique in which, when an electronic document is generated by scanning, a group of relatively large font characters or the like is selected as a keyword from among groups of characters recognized by a character recognition technique such as OCR (optical character recognition). These techniques allow the title of an electronic document, or a word particularly emphasized in the electronic document, to be selected as a keyword associated with the electronic document. For example, in the case of an electronic document named “Guide to a Ski Tour”, words such as “ski”, “tour”, and “guide” can be selected as keywords.
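The font-size-based keyword selection described above can be sketched as follows. This is only one possible illustration under stated assumptions. The recognized text is assumed to arrive as (text, font size) pairs. A group of characters is treated as “relatively large” when its font size is at least 1.5 times the median font size. Common function words are dropped by a small stop-word list. The function name `select_keywords`, the threshold ratio, and the stop-word list are all hypothetical and do not appear in the cited documents.

```python
import re
from statistics import median

# Hypothetical stop-word list; a practical system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "to", "of", "for", "and", "in", "on"}


def select_keywords(runs, ratio=1.5):
    """Select keywords from character groups with relatively large fonts.

    `runs` is assumed to be a list of (text, font_size) tuples, for example
    as produced by a character recognition step. `ratio` is an assumed
    threshold relative to the median font size.
    """
    if not runs:
        return []
    typical_size = median(size for _, size in runs)
    keywords = []
    for text, size in runs:
        if size >= ratio * typical_size:
            for word in re.findall(r"[a-z]+", text.lower()):
                if word not in STOP_WORDS and word not in keywords:
                    keywords.append(word)
    return keywords


# Hypothetical sample: a title in a large font followed by body text.
runs = [
    ("Guide to a Ski Tour", 24.0),
    ("Departure is scheduled for early morning.", 11.0),
    ("Please bring warm clothing.", 11.0),
]
keywords = select_keywords(runs)
```

With this sample input, only the title exceeds the size threshold, so the selected keywords are “guide”, “ski”, and “tour”, in line with the example given above.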