Consuming video content involves a significant time commitment. Unlike a photo, which may be consumed almost instantly, a video must be viewed in its entirety before a user can identify sections of desirable content. For example, if a user takes one hundred videos while on vacation and subsequently wants to show friends and family some of the best videos from the vacation, the user will need to watch the entire video collection to identify which videos to share.
Current techniques for optimizing video consumption extract some features from a video (e.g., histograms, faces, audio power, speech, etc.) and apply rules to the extracted features. Generally, the rules are applied locally and/or temporally. The rules, however, are not generalized for application to diverse types of video content and, accordingly, many rules become contradictory when optimized for a specific type of content.
In addition to contradictory rules, another problem with current techniques is that individuals have varying perceptions of what makes a portion of video content desirable, and current techniques do not account for the subjectivity involved in identifying desirable sections of video data.
Furthermore, current techniques are directed to video content having multiple scenes within a single video file. Windows Movie Maker, for example, is geared towards longer-duration video content having multiple scenes within a single video file. Windows Movie Maker detects low-level features from video content and uses these features to create a summary of the video content by selecting important segments from all parts of a video file. With modern technologies, however, users create video content differently than in the past. For example, current video content may be recorded in a digital format such that each video file is typically a short scene. Accordingly, current techniques, which are directed to multiple scenes within a single video file, are insufficient for identifying desirable portions of video content in such video data.