Image retrieval systems are important for applications that involve large collections of images. Professional applications include broadcast stations, where a piece of video may be identified through a set of shots and where a shot of video is to be retrieved on the basis of a given image. Movie producers must likewise be able to locate particular scenes among a large number of scenes. Furthermore, art museums have large collections of images of their paintings, photos and drawings, and must be able to retrieve images on the basis of some criterion relating to their contents. Consumer applications include maintaining collections of slides, photos and videos, from which the user must be able to retrieve items, e.g. on the basis of similarity with a specified query image.
An image retrieval system and a method as described above are known from the article "Tools and Techniques for Color Image Retrieval", John R. Smith and Shih-Fu Chang, Proc. SPIE--Int. Soc. Opt. Eng (USA), Vol. 2670, pp. 426-437. The image retrieval system comprises a database with a large number of images. A user searching for a particular image specifies a query image to indicate what the retrieved image or images should look like. The system then compares the stored images with the query image and ranks the stored images according to their similarity with the query image. The ranking results are presented to the user, who may retrieve one or more of the images. The comparison of the query image with a stored image to determine the similarity may be based on a number of features derived from the respective images. The image feature or features used for the comparison are called a feature vector. The article describes the use of a color histogram as such a feature vector. When using the RGB (Red, Green and Blue) representation of an image, a color histogram is computed by quantizing the colors within the image and counting the number of pixels of each quantized color. To determine the similarity, a number of techniques are described for comparing the two color histograms of the respective images. An example of such a technique is the histogram intersection, where the similarity is the sum, over all histogram bins, of the minimum value of the pair of corresponding bins of the two histograms.
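By way of illustration only, the color histogram and the histogram intersection described above may be sketched as follows. The sketch is not taken from the cited article: the function names, the representation of an image as a sequence of (R, G, B) tuples, and the choice of four quantization levels per channel are assumptions made for the example. The similarity is computed on raw pixel counts exactly as stated above, without any normalization for image size.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption for this sketch: an image is a flat sequence of (R, G, B)
# tuples with each component in the range 0..255.
Pixel = Tuple[int, int, int]


def color_histogram(pixels: Sequence[Pixel], levels: int = 4) -> List[int]:
    """Quantize each channel into `levels` levels and count pixels per color bin."""
    hist = [0] * (levels ** 3)
    for r, g, b in pixels:
        # Map each 0..255 component to a quantization level 0..levels-1.
        rq = r * levels // 256
        gq = g * levels // 256
        bq = b * levels // 256
        hist[(rq * levels + gq) * levels + bq] += 1
    return hist


def histogram_intersection(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Sum, over all bins, of the minimum of the two corresponding bin values."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_images(
    query: Sequence[Pixel],
    stored: Dict[str, Sequence[Pixel]],
    levels: int = 4,
) -> List[Tuple[str, int]]:
    """Compare the query with every stored image and rank by similarity."""
    q_hist = color_histogram(query, levels)
    scores = [
        (name, histogram_intersection(q_hist, color_histogram(pixels, levels)))
        for name, pixels in stored.items()
    ]
    # Most similar image first.
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

In this sketch, `rank_images` compares the query image with every stored image in turn, so the work performed grows in proportion to the number of images in the database.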
In a practical setup, the number of images can be very large. On the Internet, for example, the number of images can be of the order of millions and is ever growing. Even if the time needed to compare the query image with a single candidate image is very short, the cumulative time needed to compare the query image with all images in the database will be long. It is a drawback of the known system that a user searching for an image in such a large database must wait a long time after having submitted the query image to the system.