Recent years have witnessed explosive growth in multimedia data, including images and videos, readily available on the Internet. With the exponential growth of video sharing websites (e.g., YouTube™, Google Videos™, Yahoo™ Video, etc.), the number of videos searchable on the Web has increased tremendously. However, much of the available media content is redundant or overlapping, or contains duplicate material. Organizing videos on the Internet, whether to avoid duplicates or to perform research regarding duplicates, remains a challenge to researchers in the multimedia community.
Much of the difficulty in organizing (e.g., indexing, cataloging, annotating, ranking, etc.) videos on the Internet involves problems of efficiency and scalability. Most work in the area of near-duplicate video detection focuses on handling various photometric or geometric transformations. Techniques of this kind are not well equipped to handle a Web-scale video database and return search results in real time. Additionally, manual annotation and organization of videos is a very labor-intensive and time-consuming task.