Search engines help a user to locate information. Using a search engine, a user may enter one or more search query terms and obtain a list of resources that contain or are associated with subject matter that matches those search query terms. To find the most relevant files, search engines typically attempt to select, from among a plurality of files, files that include many or all of the words that a user entered into a search request.
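As a rough illustration of the word-matching selection described above, the following minimal sketch ranks a small set of files by how many distinct query terms each one contains. The function names (`tokenize`, `rank_by_term_overlap`), the sample file names, and the sample text are hypothetical and chosen only for illustration; an actual search engine would typically use far more elaborate indexing and scoring.

```python
import re


def tokenize(text):
    """Return the set of lowercase alphanumeric words found in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_by_term_overlap(query, files):
    """Rank files by the number of distinct query terms they contain.

    files maps a (hypothetical) file name to its text content.
    Files that match no query term are omitted from the result.
    """
    query_terms = tokenize(query)
    scored = []
    for name, text in files.items():
        matched = len(query_terms & tokenize(text))
        if matched > 0:
            scored.append((name, matched))
    # Most matched terms first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical sample data, for illustration only.
sample_files = {
    "a.html": "Ford unveils a new pickup truck",
    "b.html": "History of the Ford Mustang sports car",
    "c.html": "Recipes for apple pie",
}

results = rank_by_term_overlap("ford mustang", sample_files)
print(results)
```

In this sketch, a file that contains all of the query terms ranks ahead of a file that contains only some of them, and a file that contains none is excluded.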
The list of resources that a search engine returns for a particular query may vary. For example, a search engine might return links that are associated with items that include but are not limited to web pages, online documents, web applications, and multimedia objects. As multimedia becomes more prevalent on the Internet, the search results that are returned might include an increasing number of multimedia objects. For example, submitting a query for the term “ford” may return results in the categories of websites, images, music, and videos. One difficulty that affects a search engine's ability to return results for multimedia content (including but not limited to images, video, and music) is that multimedia content is often difficult to classify. For example, web pages often have text and outbound links that a search engine may readily analyze in order to determine the subject matter of the web page. Multimedia content, by contrast, might not have text that could be used to help classify it. This lack of text makes multimedia content items difficult to distinguish from each other.
With increasing broadband speeds and computing power, the availability of multimedia content on the Internet has greatly expanded. Users are more likely to initiate searches with the goal of locating multimedia content, and video content in particular. Sometimes, though, duplicates of the same video content item may be included within the search results that are returned for a particular search query. Duplicates in video search results may decrease the number of unique results presented to the user and lessen the effectiveness of the search for that user. With too many duplicates returned, a user may be tempted to try other search engines for the video search. Thus, enhancements that generate more effective search results by removing duplicate results for video content have become increasingly important.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.