As large multimedia databases have grown and communication and digital media processing techniques have advanced, a wide variety of moving pictures has become available, and attempts have been made to increase users' convenience and satisfaction through search services based on summary information of abridged video. However, most video is currently abridged manually, by a person who directly sorts and extracts appropriate scenes or images.
Demand for automatically analyzing large amounts of video has increased as various video-related businesses have developed, and accordingly, many studies aimed at solving the above-noted problem have been actively conducted.
Video abridging methods are classified into video skimming, highlighting, and video summarization.
The video skimming scheme extracts the parts of the video and audio data that carry important meaning and connects them consecutively to generate a short video synopsis. The highlighting scheme selects the interesting parts of the video on the basis of predetermined events and abridges them. Video summarization extracts meaningful content and structural information from the video. Video summary results are generally represented as a sequence of key frames (still images), and studies on video abridgment aim at generating such video summary information.
A video summary represented by key frames allows a user to grasp the entire video content at a glance, and it serves as an entry point to the scenes or shots that contain the corresponding key frames. Hence, the video summary task can be regarded as the task of selecting the optimal key frame, or of selecting the segment in which the optimal key frame is located, and visual characteristics such as color and motion are used as important factors for selecting key frames.
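As an illustration of key frame selection driven by a color feature, the following is a minimal sketch and not a method taken from this disclosure. It assumes each frame of a shot has already been reduced to a normalized color histogram, and it picks the frame whose histogram is closest to the shot's mean histogram. The names `select_key_frame`, `mean_histogram`, and `l1_distance`, the L1 distance measure, and the sample values are all hypothetical.

```python
from typing import List, Sequence

# Assumption: one normalized color histogram per frame, computed elsewhere.
Histogram = Sequence[float]


def l1_distance(a: Histogram, b: Histogram) -> float:
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_histogram(frames: Sequence[Histogram]) -> List[float]:
    """Bin-wise average of all frame histograms in a shot."""
    n = len(frames)
    return [sum(bins) / n for bins in zip(*frames)]


def select_key_frame(frames: Sequence[Histogram]) -> int:
    """Return the index of the frame closest to the shot's mean histogram."""
    if not frames:
        raise ValueError("a shot must contain at least one frame")
    center = mean_histogram(frames)
    return min(range(len(frames)), key=lambda i: l1_distance(frames[i], center))


# A toy shot of four frames with 3-bin histograms (made-up values).
shot = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
]
key_index = select_key_frame(shot)
print(key_index)  # index of the most "typical" frame in the shot
```

A motion feature could be substituted for, or combined with, the color histogram without changing the structure of the selection step.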
The video summary is divided into shot-based summary and segment-based summary according to its application range.
The shot-based summary is a method for representing short videos, that is, video clips, with several key frames, and the segment-based summary is a technique for abridging an entire long video.
More studies have recently focused on the segment-based summary because segment-based abridgment techniques have a wider range of application. The disclosed invention is also directed to the segment-based summary.
Methods for abridging video that has been divided into segments include (a) a shot grouping method that analyzes the correlation between shots within a temporal window and groups highly related shots into a story unit (or scene), and (b) a method that analyzes the characteristics of clusters obtained by conventional clustering and selects the important clusters.
These methods can be further subdivided according to which visual characteristics are used and which shot is selected as the representative.
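The shot grouping approach in (a) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of any particular prior-art system: each shot is assumed to be represented by one normalized color histogram, correlation is measured by histogram intersection, and a shot is linked to any sufficiently similar shot among the preceding `window` shots, with all shots in between merged into the same scene. The function names, the `window` and `min_similarity` parameters, and the sample data are hypothetical.

```python
from typing import List, Sequence

# Assumption: one normalized color histogram per shot, computed elsewhere.
Feature = Sequence[float]


def histogram_intersection(a: Feature, b: Feature) -> float:
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(x, y) for x, y in zip(a, b))


def group_shots(shots: Sequence[Feature], window: int = 3,
                min_similarity: float = 0.7) -> List[List[int]]:
    """Group shot indices into scenes using links inside a temporal window."""
    scenes: List[List[int]] = []
    for i, feature in enumerate(shots):
        start = max(0, i - window)
        # Earlier shots inside the window that are similar enough to shot i.
        linked = [j for j in range(start, i)
                  if histogram_intersection(feature, shots[j]) >= min_similarity]
        if linked:
            earliest = linked[0]
            # Merge trailing scenes until the last scene contains `earliest`.
            while scenes[-1][0] > earliest:
                tail = scenes.pop()
                scenes[-1].extend(tail)
            scenes[-1].append(i)
        else:
            scenes.append([i])  # no related shot nearby: start a new scene
    return scenes


# Six toy shots: shot 2 is a cutaway inside a scene, shots 4-5 form a new scene.
shots = [
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.75, 0.15, 0.10],
    [0.10, 0.10, 0.80],
    [0.10, 0.20, 0.70],
]
scenes = group_shots(shots, window=3, min_similarity=0.7)
print(scenes)
```

With a window of three shots, the cutaway at index 2 is absorbed into the surrounding scene because shot 3 links back to shots 0 and 1. With `window=1` the same data splits into more scenes, so the result depends on the chosen window size and similarity cutoff.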
An important problem common to the above-mentioned methods is that the selection of representatives depends excessively on threshold values. That is, the representative shots are determined on the basis of a specific preset threshold value. For example, shots whose importance exceeds the threshold value, or shots whose importance ranks within the top 10%, are selected. The threshold values are determined experimentally. The problem with video abridgment algorithms that depend heavily on experimentally determined threshold values is that the video abridgment system may be very effective for some specific video but is difficult to apply to various types of video.
This problem can also be a fatal defect in application fields that process various categories of video information, and experimentally setting an optimized threshold value is costly.
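The threshold dependence described above can be illustrated with a small sketch. It assumes that each shot already has a scalar importance score. The two score lists are made-up values standing in for two different types of video, and the functions `select_by_threshold` and `select_top_fraction` are hypothetical names for the two selection rules mentioned in the text.

```python
from typing import List, Sequence


def select_by_threshold(importance: Sequence[float], threshold: float) -> List[int]:
    """Indices of shots whose importance is greater than a fixed threshold."""
    return [i for i, score in enumerate(importance) if score > threshold]


def select_top_fraction(importance: Sequence[float], fraction: float = 0.10) -> List[int]:
    """Indices of the shots ranked within the top `fraction` by importance."""
    count = max(1, int(len(importance) * fraction))
    ranked = sorted(range(len(importance)),
                    key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:count])


# Made-up importance scores for two videos with different score ranges.
video_a = [0.91, 0.85, 0.40, 0.88, 0.35, 0.93, 0.30, 0.87, 0.45, 0.90]
video_b = [0.55, 0.48, 0.20, 0.52, 0.15, 0.58, 0.10, 0.50, 0.25, 0.57]

# A threshold tuned on video_a selects six shots there but none in video_b.
print(select_by_threshold(video_a, 0.8))
print(select_by_threshold(video_b, 0.8))

# The top-10% rule always returns one shot here, whatever the content.
print(select_top_fraction(video_a))
print(select_top_fraction(video_b))
```

In this toy data, a threshold of 0.8 works for the first video and yields an empty summary for the second, while the fixed top-10% rule returns one shot for each video even though the first contains several shots of comparable importance. Either rule would have to be retuned per video type, which is the cost the text refers to.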
Subjective human judgment, as well as visual features, may be an important factor in selecting key frames for a video summary.
When a user abridges video manually, the user can create a video summary that moves other people by applying subjective judgment. Therefore, a study on applying such subjective judgment to the video abridging process is needed in order to generate an effective video summary.
In addition, to generate a more effective video summary, it is necessary to generate scalable video summary information that takes the user's environment into consideration.