There is an increasingly large volume of video, audio, movie, television, music, and other media content (“asset” or “media asset”) being published to the Internet and to the World Wide Web (“web”) by asset providers. Videos can now be found on a wide variety of web sites. Videos are also found on the non-web portions of the Internet, such as on music stores, on peer-to-peer file sharing networks, and on Internet-enabled set top boxes.
Some assets are embedded on web pages using multimedia programs such as Flash. Some are stored on web servers and linked via HTML hyperlinks. Some are on a peer-to-peer network such as those using the BitTorrent protocol. Many media providers use proprietary web pages with assets classified using visible and intentionally/unintentionally obfuscated metadata.
Video search engines have been developed to search for Internet videos. Some video search engines allow searching for videos that are on web sites. Some video search engines allow searching for videos that are on peer-to-peer networks.
A common technique for web video search engines is to locate the text describing a particular video (“video description”), index the text, and subsequently return the associated video when a user's search query matches the video description. The video description may be extracted from the web page on which the video is embedded, from the web page from which it is linked, or from the metadata of the video file. The video description is often short, limited, and/or vague. Therefore, a user's search query may not necessarily return the desired search results.
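The description-indexing technique described above can be illustrated with a minimal sketch. It assumes a simple in-memory inverted index and a rule that every query term must appear in the description. The function names, video identifiers, and sample descriptions are hypothetical and are not drawn from any particular search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(descriptions):
    """Map each term to the set of video ids whose description contains it."""
    index = defaultdict(set)
    for video_id, description in descriptions.items():
        for term in tokenize(description):
            index[term].add(video_id)
    return index


def search(index, query):
    """Return the ids of videos whose description contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical video descriptions: a show name plus a broadcast airdate.
descriptions = {
    "v1": "Evening News 2006-03-14",
    "v2": "Cooking Hour 2006-03-14",
}
index = build_index(descriptions)

matches = search(index, "evening news")      # terms present in v1's description
misses = search(index, "election results")   # terms absent from every description
```

In this sketch the second query returns nothing even if the video identified as "v1" happens to cover an election, because the short description never mentions it.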
For peer-to-peer video search engines, queries may be matched against the filename or metadata of the video. The metadata may include a video description that is similar to web video descriptions in that it is short, limited, and/or vague. Often there is only limited text associated with assets. For example, a web-embedded video may only have a short description consisting of the name of the show and the broadcast airdate of that particular episode of the show. In this case, search methodologies that rely on query word matching, word proximity, location of terms within the result, and so forth are unable to differentiate the ranking of different videos since the available words in the associated text are limited.
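The ranking limitation described above can be illustrated with a short sketch. It assumes a deliberately simple score, the number of distinct query terms found in the description, as a stand-in for term-matching methodologies in general. The episode identifiers, show name, and airdates are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_match_score(query, text):
    """Count the distinct query terms that also appear in the text."""
    return len(set(tokenize(query)) & set(tokenize(text)))


# Hypothetical descriptions: show name plus airdate, and nothing else.
episodes = {
    "ep1": "Evening News 2006-03-14",
    "ep2": "Evening News 2006-03-15",
    "ep3": "Evening News 2006-03-16",
}

scores = {
    video_id: term_match_score("evening news", description)
    for video_id, description in episodes.items()
}
# Every episode receives the same score, so the scores alone cannot order them.
all_tied = len(set(scores.values())) == 1
```

Because the descriptions differ only in the airdate, a query on the show name scores every episode identically, and the text offers no further signal for ordering the results.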
Publishers, media providers, and media aggregators/portals would be better served by an improved ability to search for and/or identify assets.