Techniques for identifying highlight portions of a recorded video stream can be classified into two general categories: (1) automatic video highlight detection; and (2) manual home video editing.
Regarding the first category, one known technique detects highlights of a domain-specific recorded video, such as a sports video or news video. Prior knowledge of the domain can be used to identify highlights of the video. For example, a specific event, such as the scoring of a touchdown in football, can be identified from the video characteristics that segments containing this event are expected, a priori, to exhibit. Predefined portions of a video can be detected as a highlight, with desired highlights being associated with various types of scene shots which can be modeled and computed. By detecting the occurrence of a scene shot, the user can estimate the occurrence of a desired highlight.
Representative portions of a video sequence can be related to a highlight to compose a skimmed view. Predefined audio cues, such as noun phrases, can be used as indicators of desired video portions.
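For illustration only, the use of predefined audio cues can be sketched as follows. The sketch assumes that a time-stamped transcript of the audio track is already available (for example, from a speech recognizer); the function name `find_cue_windows`, the transcript layout, and the `padding` parameter are hypothetical and are not taken from any particular known system.

```python
def find_cue_windows(transcript, cue_phrases, padding=5.0):
    """Return (start, end) time windows around each cue phrase.

    transcript:  list of (time_sec, word) pairs in playback order (assumed input).
    cue_phrases: predefined phrases, e.g. noun phrases such as "home run".
    padding:     seconds of video kept before and after each match (assumed).
    """
    words = [word.lower() for _, word in transcript]
    windows = []
    for phrase in cue_phrases:
        target = phrase.lower().split()
        n = len(target)
        # Slide over the transcript looking for the exact word sequence.
        for i in range(len(words) - n + 1):
            if words[i:i + n] == target:
                start = max(0.0, transcript[i][0] - padding)
                end = transcript[i + n - 1][0] + padding
                windows.append((start, end))
    return sorted(windows)


transcript = [(10.0, "the"), (10.5, "home"), (11.0, "run"),
              (30.0, "nice"), (30.5, "catch")]
print(find_cue_windows(transcript, ["home run"]))
```

Each returned window marks a candidate video portion surrounding a detected cue; how such windows are ranked or merged is left open here.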
Regarding the second category, one technique for identifying specific segments of a domain-generic video is available with the Adobe Premiere product. With this product, specified portions of a video stream considered of interest are identified manually. In contrast to domain-specific video, generic videos (e.g., home videos) do not contain a specific set of known events. That is, little or no a priori knowledge exists regarding the characteristics of those portions of a generic video that may constitute a highlight.
Home video annotation/management systems are known which create videos from raw video data by computing, based on erratic camera motion, an unsuitability “score” for segments that are candidates for inclusion in a final cut. A set of editing rules combined with user input can be used to generate a resultant video.
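By way of a minimal sketch, an unsuitability score of this kind could be computed from per-frame estimates of global camera motion. The particular measure used here (mean frame-to-frame change in the camera displacement vector), the function names, and the threshold are assumptions made for illustration; the known systems may compute the score differently.

```python
from math import hypot


def unsuitability_score(motion):
    """Score a segment by how erratic its camera motion is.

    motion: list of (dx, dy) global camera displacements, one per frame,
            in pixels (assumed to come from a separate motion estimator).
    A steady pan yields a score near 0; shaky footage yields a high score.
    """
    if len(motion) < 2:
        return 0.0
    # Magnitude of the change in camera motion between consecutive frames.
    changes = [hypot(bx - ax, by - ay)
               for (ax, ay), (bx, by) in zip(motion, motion[1:])]
    return sum(changes) / len(changes)


def select_segments(segments, max_score):
    """Keep the names of segments whose score does not exceed max_score."""
    return [name for name, motion in segments.items()
            if unsuitability_score(motion) <= max_score]


segments = {
    "pan": [(2, 0)] * 5,                          # smooth, constant pan
    "shaky": [(5, -4), (-6, 3), (4, 5), (-5, -4)],  # abrupt direction changes
}
print(select_segments(segments, max_score=1.0))
```

In such a scheme, the editing rules and user input mentioned above would then operate on the segments that survive the threshold.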
Time-stamps can be used to create time-scale clusters at different levels for home video browsing. Video browsing and indexing can also be performed using key frame, shot and scene information. Face tracking/recognition functionality can be used to index videos.
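As an illustrative sketch of time-scale clustering, recording time-stamps can be grouped wherever the gap between consecutive recordings stays within a chosen limit, and the grouping can be repeated with several limits to obtain clusters at different levels. The gap-based rule, the function names, and the sample time-stamps below are assumptions for illustration rather than a description of any specific known system.

```python
from datetime import datetime, timedelta


def cluster_by_gap(timestamps, max_gap):
    """Group time-stamps; a new cluster starts when the gap exceeds max_gap."""
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)
        else:
            clusters.append([ts])
    return clusters


def multi_level_clusters(timestamps, gaps):
    """Cluster the same time-stamps once per gap, giving one level per gap."""
    return {gap: cluster_by_gap(timestamps, gap) for gap in gaps}


shots = [
    datetime(2004, 7, 4, 10, 0),
    datetime(2004, 7, 4, 10, 5),
    datetime(2004, 7, 4, 15, 0),
    datetime(2004, 7, 6, 9, 0),
]
levels = multi_level_clusters(shots, [timedelta(hours=1), timedelta(days=1)])
for gap, clusters in levels.items():
    print(gap, len(clusters))
```

A small gap separates individual recording sessions, while a larger gap merges them into coarser, event-level groups that a browser could present hierarchically.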
Exemplary video editing software requires the user to manually identify a specified portion of a video stream by first browsing the video to find an interesting video segment. The user then plays the video forward and backward, using a trial and error process to define the start and end of the desired video segment. Upon subjectively locating a desired portion of the video stream, the user can manually zoom in, frame-by-frame, to identify start and end frames of the desired video segment.