The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Search engines help a user to locate information. Using a search engine, a user may enter one or more search query terms and obtain a list of resources that contain or are associated with subject matter that matches those search query terms. To find the most relevant files, search engines typically attempt to select, from among a plurality of files, files that include many or all of the words that a user entered into a search request. This works well when the files searched contain a large quantity of text from which relevance and context may be determined. Images, however, present problems because often the only way to determine the content and context of an image is through small snippets of text associated with the image.
As photo sharing services on the Internet have become more popular, obtaining more effective search results for the images they host has become increasingly important. Photo sharing services allow users to upload and share digital images with family, friends, and the general public.
Users may be given the opportunity to provide annotations to each particular photo uploaded. These annotations, which may also be referred to as user-generated content, may define a title, description, and a set of tags for the photo. The set of tags might contain keywords to indicate the subject matter of the image. The photo annotations provided are essential to making the photos retrievable by text-based retrieval models and allow users to formulate keyword-based queries against the photo collection.
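As a minimal sketch of how such annotations make photos retrievable by keyword, the following Python example ranks photos by how many query terms appear in their title, description, or tags. The `Photo` class, the `search` function, and the sample data are hypothetical illustrations and are not taken from any particular photo sharing service.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    # Hypothetical record for one uploaded photo and its user-provided annotations.
    photo_id: str
    title: str = ""
    description: str = ""
    tags: list = field(default_factory=list)

    def terms(self):
        # Collect lowercase words from the title, description, and tags.
        text = " ".join([self.title, self.description, *self.tags])
        return set(text.lower().split())


def search(photos, query):
    # Rank photos by the number of distinct query terms found in their annotations.
    query_terms = set(query.lower().split())
    scored = []
    for photo in photos:
        overlap = len(query_terms & photo.terms())
        if overlap > 0:
            scored.append((overlap, photo.photo_id))
    # Highest overlap first; ties broken by photo_id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [photo_id for _, photo_id in scored]


photos = [
    Photo("p1", title="Red apple", tags=["fruit", "apple"]),
    Photo("p2", title="Beach sunset", description="Sunset over the sea"),
    Photo("p3"),  # no annotations, so no keyword query can retrieve it
]
```

In this sketch the unannotated photo `p3` can never be returned, which illustrates why the annotations matter for text-based retrieval.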
Due to the rich nature of image content and the limited expressiveness of keyword-based query formulation, it is often difficult for a user to formulate an information request precisely. One reason for this difficulty is that there is often little data associated with images. For example, the only data available for an image might be the annotations that a user has supplied: a title, a description, and tags. However, many more images lack even this rudimentary data. The data may also be noisy, meaning that it is not relevant to the subject of the image. For example, a user might bulk upload hundreds of photos at once and annotate all of them with the same tag, without regard to content.
In addition, users often employ words in a query that may present ambiguities. Examples of word ambiguities are shown in FIG. 1. For example, a user might enter the query word “apple.” In response, the search engine might return images 101A and 101B. Image 101A shows “apple” as a fruit. Image 101B displays “apple” as a computer company logo. Unfortunately, based upon the query word alone, it is not possible to determine whether the user's intent is to find results relating to the fruit or to the company logo. This type of ambiguity is referred to herein as word-sense ambiguity. In word-sense ambiguity, the dominant topic to which the word may relate is the primary sense, and a secondary topic of the word is a secondary sense. For the query “apple,” the primary sense might be “corporate logo,” and the secondary sense might be “fruit.”
In another example, a user might enter the query word “jaguar.” In response, the search engine might return image 103A and image 103B. Image 103A shows “jaguar” as an animal. Image 103B displays “jaguar” as a car from the automotive manufacturer, Jaguar. Here too, the intent of the user cannot be determined from the query word “jaguar” alone, without more detailed context.
Other kinds of ambiguity may occur as well. Suppose a determination is made that the user intended to find images for “apple” that are related to the computer company. Even then, the images sought may refer to any one of the computer company's products, logos, or events. This kind of ambiguity is referred to as type-specific ambiguity.
In the absence of disambiguating information, the user should be presented with a diverse set of images that embodies many possible interpretations of the user's query. When the user is presented with results reflecting multiple senses of the query, the likelihood that the user's intention will be represented is greatly increased. While traditional information retrieval models focus on finding the most relevant document without consideration for diversity, effective image search requires results that are both diverse and relevant. Thus, methods that provide image search results that are both relevant and diverse are highly desirable.
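To make the trade-off between relevance and diversity concrete, the following Python sketch interleaves results across query senses in round-robin order, taking the most relevant image of each sense first. It is only one simple illustration, not a method described in this section. The `diversify` function, the sense labels, and the relevance scores are all hypothetical, and the sketch assumes that each image already carries a sense label, which a real system would have to infer.

```python
from collections import defaultdict


def diversify(results, k):
    # results: list of (image_id, sense, relevance) tuples.
    # Assumption: the sense label of each image is already known.
    by_sense = defaultdict(list)
    for image_id, sense, relevance in results:
        by_sense[sense].append((relevance, image_id))
    # Within each sense, order images by descending relevance.
    for sense in by_sense:
        by_sense[sense].sort(key=lambda pair: (-pair[0], pair[1]))
    # Visit senses in order of their best-scoring image.
    senses = sorted(by_sense, key=lambda s: (-by_sense[s][0][0], s))
    selected = []
    rank = 0
    # Take the rank-th best image of every sense in turn until k are chosen
    # or every sense is exhausted.
    while len(selected) < k and any(rank < len(by_sense[s]) for s in senses):
        for sense in senses:
            if rank < len(by_sense[sense]) and len(selected) < k:
                selected.append(by_sense[sense][rank][1])
        rank += 1
    return selected


# Hypothetical candidates for the query "apple".
results = [
    ("logo_1", "corporate logo", 0.95),
    ("logo_2", "corporate logo", 0.90),
    ("logo_3", "corporate logo", 0.85),
    ("fruit_1", "fruit", 0.80),
    ("fruit_2", "fruit", 0.60),
]
```

With these sample scores, ranking by relevance alone would fill the top three positions with corporate logo images, whereas `diversify(results, 3)` also includes the best fruit image.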