Search engines, such as those for the World Wide Web (“web”), typically allow a user to enter a search query in the form of one or more search terms. In response to the query, a search engine returns a ranked list of results. The ranking of each result is typically based on a variety of factors, including: the number of matching query words in the result page; the proximity of matching words to one another in the result page; the location of terms within the page; the location of terms within specific tags of the page; the anchor text on pages pointing to the result page; how recently the page has been updated; link analysis of pages pointing to the result page; and click-through analysis, such as how frequently the result is clicked.
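By way of illustration only, the following sketch shows one way several of the factors listed above might be combined into a single ranking score. The weighted-sum form, the weights, the field names, and the freshness and link transforms are all assumptions made for this example; the text above does not specify how any particular search engine combines its factors.

```python
import math
from dataclasses import dataclass


@dataclass
class ResultSignals:
    """Hypothetical per-result signals; names are illustrative only."""
    matching_terms: int        # number of query words found in the page
    term_proximity: float      # 0..1, higher means matches sit closer together
    anchor_text_hits: int      # query words found in anchor text of linking pages
    days_since_update: float   # how recently the page was updated
    inbound_links: int         # crude stand-in for link analysis
    click_through_rate: float  # 0..1, fraction of impressions that were clicked


# Assumed weights, chosen arbitrarily for the example.
WEIGHTS = {
    "matching_terms": 1.0,
    "term_proximity": 0.5,
    "anchor_text_hits": 0.8,
    "freshness": 0.3,
    "inbound_links": 0.2,
    "click_through_rate": 2.0,
}


def score(s: ResultSignals) -> float:
    """Combine the signals into one number using a simple weighted sum."""
    freshness = 1.0 / (1.0 + s.days_since_update)  # newer pages score higher
    return (
        WEIGHTS["matching_terms"] * s.matching_terms
        + WEIGHTS["term_proximity"] * s.term_proximity
        + WEIGHTS["anchor_text_hits"] * s.anchor_text_hits
        + WEIGHTS["freshness"] * freshness
        + WEIGHTS["inbound_links"] * math.log1p(s.inbound_links)
        + WEIGHTS["click_through_rate"] * s.click_through_rate
    )


def rank(results: dict[str, ResultSignals]) -> list[str]:
    """Return result identifiers ordered from highest to lowest score."""
    return sorted(results, key=lambda key: score(results[key]), reverse=True)
```

Under these assumptions, a result with more matching terms and anchor-text hits outranks an otherwise identical result, which is the behavior the factor list implies.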
There is a large volume of video, audio, and other media content (“media content”) being posted to the Internet and to the web. Some media content is embedded on web pages using multimedia programs such as Flash. Some is stored on web servers and linked via HTML hyperlinks. Some is distributed over peer-to-peer networks, such as those using the BitTorrent protocol.
Search engines have been developed to search for media content. Similar to traditional search engines, media content search engines return a list of ranked results based on a user search query. However, given the particular characteristics of online media content, media content search engines that use ranking methodologies designed initially and/or primarily to find text or other non-media content may not return the most relevant ranked list.
Often there is only limited text associated with media content. For example, a web-embedded video may have only a short description consisting of the name of the show and the broadcast airdate of that particular episode. In this case, ranking methodologies that rely on matching query words, word proximity, location of terms within the result, and so forth cannot differentiate among videos, since the associated text offers few words to match.
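The limitation can be seen in a minimal sketch. The video names, descriptions, and query below are invented for illustration, and the scoring function is a deliberately simple stand-in for text-based ranking rather than any particular engine's method.

```python
def matching_terms(query: str, description: str) -> int:
    """Count distinct query words that also appear in the description."""
    query_terms = set(query.lower().split())
    description_terms = set(description.lower().split())
    return len(query_terms & description_terms)


# Hypothetical episodes of one show; each description holds only the
# show name and the broadcast airdate.
videos = {
    "video_a": "Evening News 2008-03-01",
    "video_b": "Evening News 2008-03-02",
    "video_c": "Evening News 2008-03-03",
}

scores = {
    name: matching_terms("evening news", text)
    for name, text in videos.items()
}
```

Every video receives the same score here, so a ranking based on the text alone has no basis for ordering them.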
While link analysis can typically assist in ranking media content with similar matching terms, it relies on the availability of a meaningful number of hyperlinks to the media content. However, because much web media content is generated by client-side technologies such as JavaScript and Adobe Flash, its unique URL may not be immediately apparent to end users or to standard web crawlers. Thus, the set of available hyperlinks may be smaller than optimal, making link analysis less useful.
While click-through analysis works well for older media content for which search engines have captured a large click-through history, recently added media content has little click-through data, so its resulting ranking can be inconsistent.