The present disclosure relates to the identification of important video frames and segments. For certain multimedia content that is divisible into shorter video segments, it is often the case that some segments of the video are more important to potential viewers than others. Recorded television programs, news broadcasts, or video outputs from a security camera, to name just a few examples, may have certain segments particularly relevant to users. Thus, several prior attempts have been made to automatically identify video frames or segments that may be relevant to potential viewers.
In one prior method, a video sequence is divided into segments of different lengths. For example, a television broadcast may be divided so that each segment corresponds to a scene. The longer video segments are then assumed to be the most relevant ones. Thus, segment length is measured for all segments, and portions of the longest segments are selected and displayed to the user. However, the assumption underlying such methods, that longer video segments tend to contain important scenes or stories, often proves unreliable. Further, since these methods are not based on the content of the video, the selected video segments become little more than blind guesses when the segment length assumption breaks down.
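By way of illustration only, the length-based selection described above may be sketched as follows. The sketch assumes that scene boundaries are already known and expressed in seconds. The names `Segment` and `longest_segments` and the sample boundaries are hypothetical and are not taken from any particular prior system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A hypothetical scene, described only by its boundaries in seconds."""
    start: float
    end: float

    @property
    def length(self) -> float:
        return self.end - self.start


def longest_segments(segments, k):
    """Pick the k longest segments and return them in playback order."""
    ranked = sorted(segments, key=lambda s: s.length, reverse=True)[:k]
    return sorted(ranked, key=lambda s: s.start)


# Illustrative scene boundaries (assumed values, not measured data).
scenes = [Segment(0, 12), Segment(12, 95), Segment(95, 110), Segment(110, 240)]
selected = longest_segments(scenes, 2)
```

The selection depends only on the segment boundaries and never examines the frames themselves, which is why it reduces to a guess whenever segment length fails to track importance.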
In another prior method, an object-based approach is used to analyze the individual video frames to identify relevant shapes, for example, a human head in a news broadcast. When a relevant shape is found, the segment is determined to be more important than other segments without relevant shapes. Such methods, while based on the video content, may be computationally expensive to implement. They may require first decompressing the data, then executing expensive algorithms to identify the relevant shapes. Such object-based approaches are therefore unavailable to many systems with less processing power, such as mobile terminals.
Accordingly, there remains a need for methods and systems of identifying relevant segments in video and multimedia content, such as compressed-domain video streams.