1. Field
The present application relates generally to digital media and more specifically to the quick, efficient, and accurate retrieval of similar videos based on comparison of extracted features.
2. Description of the Related Art
In content-based video retrieval, one of two main approaches is usually employed. The first matches specific key frames extracted from one video against those of another. The second models the entire clip and performs a model-based comparison during retrieval.
Key Frame Comparison
Key frames are often extracted at regular intervals, or sometimes selected by scene change detection algorithms. A popular approach is to simply compare key frames of videos using new or existing content-based image retrieval (CBIR). However, this analysis suffers from two large shortcomings.
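By way of illustration only, key frame comparison at regular intervals may be sketched as follows. The sketch assumes each frame is a flat list of 8-bit grayscale pixel values and uses a coarse intensity histogram as a stand-in for a real CBIR feature. The function names, the sampling interval, and the bin count are hypothetical and are not taken from any cited reference.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of 8-bit grayscale pixel values.
Frame = Sequence[int]


def sample_key_frames(frames: Sequence, interval: int) -> List:
    """Pick every `interval`-th frame, starting with the first."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return [frames[i] for i in range(0, len(frames), interval)]


def gray_histogram(frame: Frame, bins: int = 4) -> List[float]:
    """Normalized intensity histogram (a simple placeholder image feature)."""
    if not frame:
        return [0.0] * bins
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def key_frame_similarity(video_a, video_b, interval: int = 2, bins: int = 4) -> float:
    """Average, over key frames of A, of the best match among key frames of B."""
    keys_a = [gray_histogram(f, bins) for f in sample_key_frames(video_a, interval)]
    keys_b = [gray_histogram(f, bins) for f in sample_key_frames(video_b, interval)]
    if not keys_a or not keys_b:
        return 0.0
    best = [max(histogram_intersection(a, b) for b in keys_b) for a in keys_a]
    return sum(best) / len(best)
```

In this sketch only the sampled frames contribute to the score, so any content falling between sampling points is never examined.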
Some specific examples of existing technology that utilizes key frame comparison for video retrieval are as follows.
1) U.S. Pat. No. 5,982,979
The video retrieving method provides a video retrieval man-machine interface which visually specifies a desired video out of many stored videos by using previously linked picture data corresponding to the videos. Also, a video reproduction operating man-machine interface visually designates the position of reproduction out of the picture group indicative of the contents. The video retrieving method employs video data, character information linked to the video data, picture data linked to the videos, and time information corresponding to the picture data in the video data. The character information is composed of a title of each video and a creation date thereof. The picture data include, as retrieval information, one picture data representing the content of the relevant video (one picture expressing the video, i.e., a leaflet or the like), and a plurality of picture data adapted to grasp the contents of the entire video. The time information indicates the temporal position of the picture data in the video data.
Hauptmann, A. G., Christel, M. G., and Papernick, N. D., Video Retrieval with Multiple Image Search Strategies, Joint Conference on Digital Libraries (JCDL, '02), Portland, Oreg., pp. 376, Jul. 13-17, 2002 describes the Informedia digital video library which provides automatic analysis of video streams, as well as interactive display and retrieval mechanisms for video data through various multimedia surrogates including titles, storyboards, and skims.
Another existing video retrieval technique is to model entire video clips in some manner, and then perform a model comparison during retrieval. While other models are available, the main model used is a temporal model.
One example of existing technology that utilizes temporal modeling for video retrieval is described in Chen, L. and Stentiford, F. W. M., Video sequence matching based on temporal ordinal measurement, Pattern Recognition Letters, Volume 29, Issue 13, 1 Oct. 2008, Pages 1824-1831. That paper proposes a video sequence matching method based on temporal ordinal measurements. Each frame is divided into a grid, and corresponding grid cells along a time series are sorted into an ordinal ranking sequence, which gives a global and local description of temporal variation. Video sequence matching means not only finding which video a query belongs to, but also locating the query precisely in time. Robustness and discriminability are two important issues in video sequence matching. A quantitative method is also presented to measure the robustness and discriminability of the matching methods. Experiments are conducted on a BBC open news archive with a comparison of several methods.
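The temporal ordinal measurement summarized above may be sketched, under stated assumptions, as follows. The sketch assumes frames are two-dimensional lists of grayscale values at least as large as the grid, uses the mean intensity of each grid cell as the ranked quantity, and compares two equal-length clips by a normalized sum of rank differences. The function names and the distance normalization are illustrative choices and are not asserted to reproduce the cited paper.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # assumption: 2-D grayscale frame


def block_means(frame: Frame, rows: int, cols: int) -> List[float]:
    """Mean intensity of each cell of a rows x cols grid, in row-major order."""
    h, w = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            pixels = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(pixels) / len(pixels))
    return means


def ranks(values: Sequence[float]) -> List[int]:
    """Ordinal rank of each value (0 = smallest); ties keep time order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def temporal_ordinal_signature(frames: Sequence[Frame], rows: int = 2, cols: int = 2):
    """For each grid cell, rank its mean intensity across the frames of the clip."""
    per_frame = [block_means(f, rows, cols) for f in frames]
    return [ranks([m[b] for m in per_frame]) for b in range(rows * cols)]


def ordinal_distance(sig_a, sig_b) -> float:
    """Mean normalized rank difference over grid cells, in [0, 1]."""
    n = len(sig_a[0])
    if n != len(sig_b[0]) or len(sig_a) != len(sig_b):
        raise ValueError("signatures must cover the same grid and clip length")
    max_diff = (n * n) // 2  # largest possible sum of |rank differences|
    if max_diff == 0:
        return 0.0
    per_cell = [sum(abs(a - b) for a, b in zip(ra, rb)) / max_diff
                for ra, rb in zip(sig_a, sig_b)]
    return sum(per_cell) / len(per_cell)
```

Because only the ordering of each cell over time is retained, a uniform brightness change leaves the signature unchanged in this sketch. Temporal localization of a query could then be approximated by sliding a query-length window over a longer video and keeping the offset with the smallest distance.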
Another approach using temporal modeling is described in Chen, L., Chin, K. and Liao, H., An integrated approach to video retrieval, ACM International Conference Proceeding Series Vol. 313, Proceedings of the nineteenth conference on Australasian database - Volume 75, 2008, Pages 49-55. That paper observes that the usefulness of a video database depends on whether the video of interest can be easily located, and proposes a video retrieval algorithm based on the integration of several visual cues. In contrast to a key-frame based representation of a shot, the approach analyzes all frames within a shot to construct a compact representation of the shot. In the video matching step, a similarity measure that integrates color and motion features is defined to locate occurrences of similar video clips in the database.
U.S. Pat. No. 7,486,827 describes a two-step matching technique embodied in a video-copy-detection algorithm that detects copies of video sequences. The two-step matching technique uses ordinal signatures of frame partitions and their differences from partition mean values. The algorithm is not only robust to intensity/color variations but can also effectively handle various format conversions, thereby providing robustness regardless of the video dynamics of the frame shots.
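One possible reading of a two-step comparison built on partition ordinal signatures and deviations from the mean is sketched below for a single pair of frames. This is an illustrative assumption only and is not asserted to be the algorithm of the cited patent. The partition grid, the tolerance value, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[float]]  # assumption: 2-D grayscale frame


def partition_means(frame: Frame, rows: int, cols: int) -> List[float]:
    """Mean intensity of each partition of a rows x cols grid."""
    h, w = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            pixels = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(pixels) / len(pixels))
    return means


def frame_signature(frame: Frame, rows: int = 2, cols: int = 2) -> Tuple[List[int], List[float]]:
    """Return (ordinal ranks of partitions, deviation of each partition from the mean)."""
    means = partition_means(frame, rows, cols)
    overall = sum(means) / len(means)
    order = sorted(range(len(means)), key=lambda i: means[i])
    rank = [0] * len(means)
    for position, index in enumerate(order):
        rank[index] = position
    return rank, [m - overall for m in means]


def two_step_match(frame_a: Frame, frame_b: Frame, rows: int = 2, cols: int = 2,
                   tolerance: float = 10.0) -> bool:
    """Step 1: coarse filter on ordinal ranks. Step 2: finer check on deviations."""
    rank_a, dev_a = frame_signature(frame_a, rows, cols)
    rank_b, dev_b = frame_signature(frame_b, rows, cols)
    if rank_a != rank_b:  # step 1 rejects frames with a different brightness ordering
        return False
    return all(abs(a - b) <= tolerance for a, b in zip(dev_a, dev_b))
```

In this sketch a uniform intensity shift changes neither the ranks nor the deviations, so a brightened copy of a frame still matches, while a frame with the same ordering but a markedly different contrast is rejected in the second step.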