Techniques for searching a body of text documents are well-established. In general, such techniques compare the words in a query with the words in a document. While there are many different algorithms to perform the comparison, the comparison is simplified by the fact that both the query and the text documents are represented in the same medium—i.e., words.
Non-text context, such as video, is normally searched using a text query. In order to perform the search, textual tags are often applied to the non-text content. For example, various sites for sharing still images or videos allow one who posts the content to tag the images or video with names, keywords, categories, etc. Some image-sharing applications allow a user to tag specific regions of an image—e.g., a user might be able to tag a specific region of a photo with the name of the person who appears in that region.
These tagging techniques are dependent on human effort. Thus, in general, the only content that gets tagged is content that interests someone enough to apply a tag, or content with sufficient commercial value to make it worth it to pay for the human labor to tag the content. Image and video sharing sites, and social networking sites, are fairly adept at leveraging people's interest in certain types of content, in order to get people to spend the effort to tag that content. For example, users of social networking sites often tag images or videos that contain pictures of themselves or their friends, because the images are interesting to the person doing the tagging. However, there is a vast body of content that will not get tagged under this paradigm, and is therefore relatively unsearchable. For example, one might want to search video news footage for stories about a certain person or topic. However, most news footage either does not get tagged at all, or gets tagged only with high level concepts. Moreover, if tags are applied at all, the tags are typically applied to the video as a whole, rather than to specific segments of the video
Searches on video could be more effective if individual videos could be tagged with detailed information about the content of the videos.