With the rapid progress of multimedia and internet technology, the amount of digital multimedia content on the internet, such as digital movies and videos, has increased rapidly in recent years. Although users can choose movies of interest by browsing the internet, they cannot readily confirm that a downloaded movie is what they expect, owing to the large data size of videos and the limits on time and network bandwidth. Users are therefore easily confused and inconvenienced. It has thus become an important issue to provide a fast and effective video summarization system and method that let users skim a whole video quickly by browsing its abstract, especially when retrieving and searching for digital videos.
Furthermore, owing to the progress of computer technology and the popularity of the internet, an increasing amount of video information can be retrieved, such as the video data stored in libraries. Hence, there is a demand for a fast and effective retrieval technique that uses the key sentences and key frames of a video to help users find the desired videos.
However, conventional techniques for distinguishing similar frames analyze either the distribution of the color histogram in the video or the similarity of the action in the video, and the frames so determined cannot be guaranteed to be the most representative frames of the video. Likewise, the main drawbacks of conventional techniques for extracting key sentences are redundancy among the extracted sentences and failure to cover all of the content.