1. Field of the Invention
The present invention relates to a method for recognizing content in image sequences.
2. Description of the Background Art
With increasing traffic on video sharing websites, there is a growing demand for techniques to classify an image sequence in order to give the flood of information a structure that eases its use and searchability. At the same time, providers of such video sharing websites are under increasing pressure from copyright holders to make sure that their copyrights are not violated by the distribution of copyrighted video footage. Framewise comparison of an image sequence that a user wants to upload with copyrighted footage is impracticable because of the huge amount of computing power and memory required. Furthermore, the provider would have to own a copy of every copyrighted movie. An approach for achieving the object needs to extract metadata describing the image sequence and compare them to sets of metadata assigned to individual movies stored in a database, thus tremendously reducing the memory required.
Such an approach has recently been described by Mark Everingham, Josef Sivic and Andrew Zisserman, Department of Engineering Science, University of Oxford, in “Hello! My name is . . . Buffy” - Automatic Naming of Characters in TV Video. This publication presents a method for automatically labelling appearances of characters in TV or film material, which combines multiple sources of information: (i) automatic generation of time stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying when characters are speaking; and (iii) using complementary cues of face matching and clothing matching to propose common annotations for face tracks.
The drawback of this approach is that subtitles are available only in image sequences on DVDs and can easily be removed, which makes content recognition impossible. Transcripts are normally publicly available for only a fraction of all copyrighted videos and need to be tediously collected from a huge number of sources distributed over the internet. Consequently, this approach may ease content-based search within a video but is less suitable for preventing copyright violations.