In daily life, a long article is usually accompanied by a brief abstract, and a book generally has a table of contents. With the continuing development of information technology, video has become an indispensable medium in modern life. Accordingly, creating an abstract of video content, so as to facilitate browsing and searching by a user, has become an important task. For some video content (such as movies), the video abstract can be created manually. However, for much video content (such as videos shared over a network), manual creation is impractical because of the time and cost it consumes. For such applications, a technology by which a computer automatically generates an abstract of a video is very important.
The visual part of a video consists of a series of frames arranged in temporal order. An intuitive and effective way to abstract the video is to extract the most representative key frame or frames from this series. Existing technologies generally extract candidate key frames from the shots or sub-shots of a video. How to select the key frames of the entire video from these candidate frames, so that a user can effectively grasp the outline of the video by browsing as few key frames as possible, is an important issue for automatic video abstract generation.
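To make the selection problem concrete, the following is a minimal sketch of one generic way to pick a small set of key frames from candidate frames. It is only an illustration under stated assumptions and is not the method of any particular existing technology: each candidate frame is assumed to be already described by a numeric feature vector (for example, a normalized color histogram), and the function name `select_key_frames`, the Euclidean distance, and the greedy farthest-point strategy are all hypothetical choices made for this example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_key_frames(features, k):
    """Return indices of up to k candidate frames, chosen greedily.

    Assumption: features[i] is a feature vector for candidate frame i.
    The first pick is the candidate closest to the mean feature vector
    (the most "typical" frame). Each later pick is the candidate that is
    farthest from every frame already chosen, so that a few key frames
    cover visually distinct content.
    """
    if not features or k <= 0:
        return []
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    first = min(range(n), key=lambda i: distance(features[i], mean))
    selected = [first]
    # nearest[i]: distance from candidate i to its closest selected frame
    nearest = [distance(f, features[first]) for f in features]
    while len(selected) < min(k, n):
        nxt = max(range(n), key=lambda i: nearest[i])
        if nearest[nxt] == 0.0:
            break  # every remaining candidate duplicates a chosen frame
        selected.append(nxt)
        nearest = [
            min(nearest[i], distance(features[i], features[nxt]))
            for i in range(n)
        ]
    return selected


# Hypothetical feature vectors for five candidate key frames.
candidates = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.0, 0.1], [0.5, 1.0]]
key_frame_indices = select_key_frames(candidates, 3)
```

In this sketch, the parameter `k` stands for the number of key frames the user is willing to browse; a smaller `k` gives a shorter abstract at the cost of covering less of the video content.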