Advances in network technology and the like have resulted in an enormous number of image files to be managed. Image search methods are available for searching such an enormous number of images to retrieve images similar to an image serving as a query (query image). One such image search technology uses local feature amounts, each indicating a local feature of an image. In general methods of this kind, a score indicating a degree of similarity to the query image is calculated in a simplified manner so that similar images can be retrieved from a large number of images with a short response time.
One of the above-mentioned methods is called the “bag of features (BoF) method”. This method is derived from a document search method called the “bag of words (BoW) method”. In the BoF method, each of the local feature amounts (hereinafter referred to as “image feature amounts”) extracted from an image to be searched is stored in advance in association with a visual word, which corresponds to a word used in the BoW method. Specifically, the image feature amounts are subjected to clustering so that each cluster corresponds to one visual word. When a search is performed, a plurality of local feature amounts (hereinafter referred to as “query feature amounts”) are extracted from an image serving as a query, and the visual word corresponding to each of the query feature amounts is obtained. Then, the appearance frequency of each visual word is statistically processed, to thereby generate a score indicating a degree of similarity between the query image and each image to be searched and to retrieve similar images.
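The flow of the BoF method described above can be illustrated with a minimal sketch. The sketch assumes that the clustering has already been performed and that its cluster centers are given as `centroids`. It also uses cosine similarity between visual-word histograms as one possible form of the statistical processing, which is not specified here. All function names and sample values are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def nearest_word(feature, centroids):
    """Return the index of the cluster center (visual word) closest to the feature."""
    return min(range(len(centroids)), key=lambda i: math.dist(feature, centroids[i]))


def word_histogram(features, centroids):
    """Count the appearance frequency of each visual word among the features."""
    return Counter(nearest_word(f, centroids) for f in features)


def cosine_score(hist_a, hist_b):
    """Cosine similarity between two visual-word histograms (an assumed scoring rule)."""
    dot = sum(count * hist_b.get(word, 0) for word, count in hist_a.items())
    norm_a = math.sqrt(sum(c * c for c in hist_a.values()))
    norm_b = math.sqrt(sum(c * c for c in hist_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_features, index, centroids):
    """Score every indexed image against the query and sort by descending score."""
    query_hist = word_histogram(query_features, centroids)
    scored = [(cosine_score(query_hist, hist), image_id) for image_id, hist in index.items()]
    return sorted(scored, reverse=True)


# Hypothetical 2-D cluster centers; real local feature amounts have many more dimensions.
centroids = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]

# Image feature amounts of the images to be searched (illustrative values).
images = {
    "a": [(0.5, 0.2), (9.8, 10.1), (0.1, 0.3)],
    "b": [(0.3, 9.8), (0.1, 10.2)],
}

# Stored in advance: the visual-word histogram of each image to be searched.
index = {image_id: word_histogram(feats, centroids) for image_id, feats in images.items()}

# Query feature amounts extracted from the query image at search time.
query_features = [(0.2, 0.1), (9.9, 9.7)]
results = search(query_features, index, centroids)
```

In this sketch, image "a" shares visual words with the query and is ranked first, while image "b" shares none and receives a score of zero. A practical system would typically use an inverted index from visual word to images rather than comparing the query against every stored histogram.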
In Patent Literature 1, there are disclosed an outline of a method for searching for and retrieving an image by using the visual word, and a technology for searching for and retrieving an image by combining an image feature amount vector extracted from a given image with a media feature amount vector extracted from sentences associated with the given image.