As people view more and more online content, especially non-textual content (which includes images, video, audio, and the like), they are often curious about that content and would like to know more. For example, as a person (a computer user) views content of an associate's online blog, the computer user may encounter an image about which he/she may wish to know more. Indeed, even when the image is captioned and/or the surrounding content describes aspects of the subject matter of the image, the viewer/computer user may still have questions that are not answered, at least not without taking specific actions to find the information.
Alternatively, a computer user may want to perform a Web search task that is difficult to formulate in words, for example, buying a fashion item by its look, finding a travel destination (or other location) based on a photo, and so forth. Searching by image (vs. searching by text) is the logical thing to do in such scenarios. Many such tasks are exploratory in nature: people search without a definitive object to acquire, and instead need a great deal of visually assisted exploration to form opinions and narrow down exactly what to pursue. Some providers have implemented a general search-by-image feature which sometimes returns a text annotation for the query image, but the search relevance and annotation coverage are typically not satisfactory. Some applications utilize human crowdsourcing for annotations and present corresponding text search results. The annotations are often too vague, and because the search results are based on the textual annotations, the results are not visually centric. In general, the relevance of the search results is not satisfactory, and an efficient interactive refinement experience is not provided. Also missing is the platform support to commoditize visual search and enable a vibrant ecosystem.