With the huge amount of video data uploaded to the Internet every day, analyzing users' interests and recommending videos that users are likely to be interested in is a major challenge. Most content-based recommendation systems limit the content to metadata associated with videos, which can lead to poor recommendation results because the metadata is not always available or correct. For such videos, either considerable effort must be spent annotating them manually or automatic tagging methods must be applied; otherwise, these systems would fail to recommend personalized videos.
On the other hand, the visual content of videos, which contains information at different levels of granularity, from the whole video to portions of a video to an individual object in a video, is not fully explored.
The disclosed methods and systems are directed to solving one or more of the problems set forth above, as well as other problems.