1. Field of Art
The present disclosure relates generally to web-based video display and specifically to software tools and methods for spam detection for online user-generated videos.
2. Description of the Related Art
Sharing of video content on websites has become a worldwide phenomenon, supported by dozens of websites. On average, hundreds of thousands of new videos are posted every day to various video hosting websites, and this number is increasing as the tools and opportunities for capturing video become easier to use and more widespread. Many of these video hosting websites also provide viewers with the ability to search for a video of interest. It is estimated that in 2006, there were over 30 billion views of user-generated video content worldwide.
Users who upload videos onto the video hosting websites are able to add descriptions and keywords (also called tags) related to their videos. These descriptions and keywords are stored as metadata associated with the video. The metadata is indexed, and thus allows viewers to search for videos of interest by entering keywords and phrases into a search engine on the video hosting website. Some users attempt to intentionally misrepresent the content of their videos, so that their videos appear more often in the search results, and thus are seen by more viewers. These users employ various methods, sometimes called “spamdexing” or “keyword stuffing,” to manipulate the relevancy or prominence of their videos in the search results, for example, by stuffing their descriptions with popular words or phrases in order to target popular queries. This makes it more difficult for viewers to find videos that are actually related to the viewers' interests, as expressed in their keyword searches.