Digital still cameras and digital video cameras capable of directly shooting digital still/video images have become widespread in recent years, and accordingly it has become possible to accumulate a large amount of digital images as data and to refer to the accumulated data with ease. For example, websites such as the still image sharing service “Flickr” (http://www.flickr.com) and the video image sharing service “YouTube” (http://www.youtube.com) provide services that allow users around the world to freely register, search for and refer to still/video images. Further, most online merchandising websites and auction websites supply ample photographic images so that users can check and compare the articles.
Furthermore, video images shot with surveillance cameras are also being accumulated as digital data. Searches for individuals, objects, etc. are being made more efficient by carrying them out by means of computer-based image analysis.
If, in a search for a digital image, a digital image of an object identical with or similar to the target of the search is at hand, similar images can be searched for and retrieved at high speed by having a computer perform image matching between pieces of image data (see Non-patent Literature 1, for example). However, it is sometimes not easy to acquire or generate such a digital image to serve as the sample for the matching. For example, when the searcher has never seen the actual object to be searched for and an image has to be retrieved by using a clue such as a story heard from a person or a description in a document, the image search has to be conducted based on a description that expresses features of the image in words.
There are relevant techniques for searching for an image based on a query text (query) described in a natural language. In one such method, the search is carried out by matching the query text against the metadata (described with words and phrases in a natural language) previously assigned to each image. In another method, the search is conducted by converting natural language expressions regarding colors and shapes included in the query text into the corresponding feature quantities of the image.
In cases where the former searching method using the metadata is employed, a process basically identical with an ordinary keyword search is carried out. For example, Non-patent Literature 2 discloses a method capable of conducting an image search in the same manner as a full-text document search by using a tool to assign metadata to each image (automatically extracted from a document (blog)). In such a method that matches the metadata with the query text, a necessary and sufficient amount of metadata has to be assigned to every image in advance. Thus, the operational cost and the comprehensiveness of the metadata become problems.
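The metadata-based approach can be illustrated with a minimal sketch that ranks images by word overlap between the query text and each image's metadata. It is only an illustration of keyword-style matching in general, not the method of Non-patent Literature 2; the image IDs, the metadata texts, and the function names `tokenize` and `search_by_metadata` are all hypothetical.

```python
import re

# Hypothetical metadata store: image ID -> natural-language metadata text.
# The IDs and texts are illustrative assumptions, not data from any cited work.
METADATA = {
    "img001": "red T-shirt with a white logo",
    "img002": "blue handkerchief with a floral pattern",
    "img003": "",  # an image to which no metadata has been assigned
}


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_by_metadata(query, metadata):
    """Return image IDs ranked by how many query words appear in their metadata."""
    query_words = tokenize(query)
    scored = []
    for image_id, text in metadata.items():
        overlap = len(query_words & tokenize(text))
        if overlap > 0:
            scored.append((overlap, image_id))
    # Highest overlap first; ties are broken by image ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for _, image_id in scored]
```

In this sketch, `search_by_metadata("blue handkerchief", METADATA)` returns only `img002`, while `img003` can never be returned by any query because it carries no metadata. This is the comprehensiveness problem described above in miniature.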
The latter searching method includes a method that conducts the search by converting a natural language expression regarding a shape into a shape feature quantity of the image. For example, Non-patent Literature 3 discloses a method that carries out the search by associating sensory words such as “angular” and “clear-cut” with shape feature quantities that use symbols and numerical values to express the lengths, angles, etc. of parts of an object (e.g. chair) included in the image.
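As a rough illustration of associating a word with shape feature quantities, the following sketch maps each word to numeric ranges over named features and returns the images whose features fall inside every range. The feature names (`corner_radius_mm`, `leg_length_mm`), the range boundaries, the catalog entries, and the function names are assumptions made up for this example; they are not taken from Non-patent Literature 3.

```python
# Hypothetical table: word -> {feature name: (lower bound, upper bound)}.
# Feature names and boundaries are illustrative assumptions only.
WORD_TO_SHAPE_RANGES = {
    "angular": {"corner_radius_mm": (0.0, 5.0)},
    "rounded": {"corner_radius_mm": (20.0, 100.0)},
}

# Hypothetical shape feature quantities previously extracted from each image.
SHAPE_FEATURES = {
    "chair_a": {"corner_radius_mm": 2.0, "leg_length_mm": 450.0},
    "chair_b": {"corner_radius_mm": 35.0, "leg_length_mm": 430.0},
}


def matches(features, ranges):
    """Return True if every constrained feature exists and lies within its range."""
    for name, (low, high) in ranges.items():
        value = features.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


def search_by_shape_word(word, table, catalog):
    """Return the sorted IDs of images whose shape features satisfy the word's ranges."""
    ranges = table.get(word)
    if ranges is None:
        return []  # the word has no associated shape feature quantity
    return sorted(
        image_id for image_id, features in catalog.items() if matches(features, ranges)
    )
```

A word that is absent from the table, such as “clear-cut” here, yields no results, so the quality of such a search depends on how completely the word-to-feature table is prepared.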
The latter searching method also includes a method that conducts the search by converting a natural language expression regarding a color into a color feature quantity of the image. For example, Patent Document 1 discloses a method that searches for a garment, a landscape photograph, etc. by associating a color-related expression such as “blue” or “yellow-like” with a color feature quantity expressed as a distribution of values in a color space (RGB, HSI, etc.). Such an image search method using the color feature quantity can also be used to search for an image to which no metadata has been assigned in advance. The image search method using the color feature quantity is also effective in cases where an image search using the shape feature quantities cannot be performed effectively (e.g., when the search is specifically for items having the same shape, such as T-shirts or handkerchiefs, or when a photographic image of a subject having an indefinite shape, such as a natural landscape, is to be retrieved).
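A color-based search of this kind might be sketched as follows. The sketch associates each color word with a hue range and scores an image by the fraction of its pixels whose hue falls in that range. It uses the HSV color space from Python's standard `colorsys` module as a stand-in for the HSI space mentioned above; the hue ranges, thresholds, sample images, and function names are assumptions for illustration and do not reproduce the method of Patent Document 1.

```python
import colorsys

# Hypothetical hue ranges in degrees; the boundaries are illustrative assumptions.
COLOR_WORD_TO_HUE_RANGE = {
    "blue": (200.0, 260.0),
    "yellow": (45.0, 70.0),
}

# Hypothetical images, each given as a flat list of (R, G, B) pixels in 0..255.
IMAGES = {
    "sky": [(30, 60, 220)] * 3 + [(255, 255, 255)],
    "lemon": [(240, 220, 30)] * 4,
}


def color_fraction(pixels, word, min_saturation=0.3, min_value=0.2):
    """Return the fraction of pixels whose hue lies in the range assigned to `word`."""
    low, high = COLOR_WORD_TO_HUE_RANGE[word]
    if not pixels:
        return 0.0
    hits = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Skip near-gray and near-black pixels, whose hue is not meaningful.
        if s >= min_saturation and v >= min_value and low <= h * 360.0 <= high:
            hits += 1
    return hits / len(pixels)


def search_by_color_word(word, images, threshold=0.5):
    """Return IDs of images whose matching-pixel fraction reaches `threshold`, best first."""
    scored = [(color_fraction(pixels, word), image_id) for image_id, pixels in images.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [image_id for score, image_id in scored if score >= threshold]
```

Because the score is computed from pixel values alone, the search works on images that carry no metadata and does not depend on the shape of the subject.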