Most existing commercial image search engines retrieve images using the text associated with them, on the assumption that this associated text, including tags, captions, and surrounding text, is usually relevant to the image content. Because the results rely solely on the associated text and do not take the visual aspects of the images into account, they may be visually unsatisfactory.
Since human visual perception of images differs from the perception of text, a gap exists between a user's intention and text-query-based search techniques. This disparity leads to inconvenience and inefficiency for the user, since she has to browse through a significant number of images returned by the text-based search query to locate a desired image.
Currently, there is a lack of an intuitive overview of image search results. For example, if the user would like to get a quick overview of the returned images, she has to either click through several pages, each bearing numerous images, or drag a scroll bar to look through all the images. Moreover, even after the user has viewed all the images, it is still not easy for her to get a sense of the distinct types of images present among the large number of images returned by the text-based search query.