Many content-based video copy identification systems have been developed to compare a broadcast video program with a reference video database.
For example, U.S. Pat. No. 6,469,749 discloses a method to identify video segments that are likely to be associated with a commercial or other particular type of video content. A signature is extracted from each of the segments so identified, and the extracted signatures are used, possibly in conjunction with additional temporal and contextual information, to determine which of the identified segments are in fact associated with the particular video content. One or more of the extracted signatures may be, e.g., a visual frame signature based at least in part on a visual characteristic of a frame of the video segment, as determined using information based on DC and motion coefficients of the frame, or DC and AC coefficients of the frame. A given extracted signature may alternatively be an audio signature based at least in part on a characteristic of an audio signal associated with a portion of the video segment. Other types of signatures can also be used. That method allows the identification and extraction of particular video content to be implemented with significantly reduced amounts of memory and computational resources.
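By way of illustration only, a visual frame signature derived from DC information might be sketched as follows. This is not the method of the cited patent: the function names, the 8x8 block size, the median thresholding and the Hamming comparison are all assumptions made for the example. The sketch relies only on the fact that the DC coefficient of a block DCT is proportional to the mean intensity of that block.

```python
from statistics import median


def block_dc_values(frame, block=8):
    """Mean intensity of each block x block tile of a grayscale frame.

    The DC coefficient of a block DCT is proportional to this mean, so the
    means stand in for DC coefficients here (assumption for the sketch).
    """
    height, width = len(frame), len(frame[0])
    values = []
    for y in range(0, height - block + 1, block):
        for x in range(0, width - block + 1, block):
            total = sum(frame[y + dy][x + dx]
                        for dy in range(block) for dx in range(block))
            values.append(total / (block * block))
    return values


def frame_signature(frame, block=8):
    """Hypothetical signature: one bit per block, set when the block's
    DC-like value is above the median of all blocks in the frame."""
    values = block_dc_values(frame, block)
    threshold = median(values)
    return tuple(1 if v > threshold else 0 for v in values)


def hamming(sig_a, sig_b):
    """Number of differing bits between two signatures of equal length."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a != b for a, b in zip(sig_a, sig_b))
```

Because each bit is defined relative to the frame's own median, a uniform brightness shift leaves such a signature unchanged, which is one reason compact signatures of this kind need little memory to store and compare.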
Another system is described in U.S. Pat. No. 6,587,637, in which video images are retrieved as follows. Images are sequentially inputted for each frame, features are sequentially extracted from the inputted frame images, and the extracted features are converted into a feature series corresponding to the inputted frame image series. The feature series is compressed in the direction of the time axis and stored in the storage. Features are then sequentially extracted, separately, from the images to be retrieved for each inputted frame, and sequentially compared with the stored compressed feature series. The progress state of this comparison is stored, and is updated on the basis of a comparison result with the frame features of the succeeding images to be retrieved. Finally, image scenes matching the updated progress state are retrieved from the images to be retrieved, on the basis of the comparison result between the updated progress state and the features of the images to be retrieved for each frame.
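A minimal sketch of this kind of frame-by-frame retrieval is given below, under simplifying assumptions that are not taken from the cited patent: each frame is reduced to a single quantized feature value, compression along the time axis is plain run-length encoding, run durations are ignored during matching, and the names `compress_time_axis` and `SequentialMatcher` are hypothetical. The stored progress state is a list of partially matched candidates that is updated as each new frame arrives.

```python
from itertools import groupby


def compress_time_axis(features):
    """Run-length encode a per-frame feature series: [(feature, run_length), ...]."""
    return [(f, sum(1 for _ in run)) for f, run in groupby(features)]


class SequentialMatcher:
    """Matches a stream of frame features against one compressed reference."""

    def __init__(self, compressed_reference):
        if not compressed_reference:
            raise ValueError("reference must contain at least one segment")
        # Only the order of segments is used; run lengths are ignored here.
        self.ref = [feature for feature, _ in compressed_reference]
        self.candidates = []  # progress state: (start_frame, segment_index)
        self.t = 0            # index of the next incoming frame

    def push(self, feature):
        """Feed one frame feature; return (start, end) for matches ending now."""
        last = len(self.ref) - 1
        survivors = []
        for start, idx in self.candidates:
            if feature == self.ref[idx]:
                survivors.append((start, idx))        # still in the same segment
            elif idx < last and feature == self.ref[idx + 1]:
                survivors.append((start, idx + 1))    # advanced to next segment
            # otherwise the candidate is dropped
        # Open a new candidate when the frame matches the first segment.
        if feature == self.ref[0] and not any(i == 0 for _, i in survivors):
            survivors.append((self.t, 0))
        matches, self.candidates = [], []
        for start, idx in survivors:
            if idx == last:
                matches.append((start, self.t))       # reached final segment
            else:
                self.candidates.append((start, idx))
        self.t += 1
        return matches
```

In use, a reference series is compressed once with `compress_time_axis`, and each frame feature of the monitored stream is passed to `push`, so that the cost per frame depends on the number of live candidates rather than on the length of the reference.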
In a public document entitled “Robust Content-Based Video Copy Identification in a Large Reference Database”, disclosed in 2003 during the International Conference on Image and Video Retrieval (CIVR), a novel scheme for content-based video copy identification dedicated to TV broadcast, with a reference video database exceeding 1000 hours of video, was described. It enables the monitoring of a TV channel in soft real-time with a good tolerance to the strong transformations that one can meet in any TV post-production process, such as clipping, cropping, shifting, resizing, object encrusting or color variations. Contrary to most of the existing schemes, the recognition is not based on global features but on local features extracted around interest points. This allows the selection and the localization of fully discriminant local patterns, which can be compared according to a distance measure. In that document, retrieval is performed using an efficient approximate Nearest Neighbors search, and the final decision is based on several matches cumulated in time.
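The general idea of matching local descriptors and cumulating the matches in time can be sketched as follows. This is a simplified illustration and not the scheme of the cited document: an exact brute-force nearest-neighbor search stands in for the approximate search, descriptors are short numeric tuples, and the function names, the distance threshold and the minimum vote count are assumptions made for the example.

```python
from collections import Counter
from math import dist


def nearest(descriptor, reference, max_distance):
    """Exact nearest neighbor among (descriptor, video_id, timecode) entries.

    Returns None when the closest entry is farther than max_distance.
    """
    best = min(reference, key=lambda entry: dist(descriptor, entry[0]))
    return best if dist(descriptor, best[0]) <= max_distance else None


def identify(query, reference, max_distance=0.5, min_votes=3):
    """Vote over (video_id, time offset) pairs and keep the dominant one.

    query is a list of (descriptor, timecode) pairs taken from the monitored
    stream. A copy of a reference video yields many matches that share the
    same offset between reference and query timecodes.
    """
    votes = Counter()
    for descriptor, t_query in query:
        hit = nearest(descriptor, reference, max_distance)
        if hit is not None:
            _, video_id, t_ref = hit
            votes[(video_id, t_ref - t_query)] += 1
    if not votes:
        return None
    (video_id, offset), count = votes.most_common(1)[0]
    return (video_id, offset, count) if count >= min_votes else None
```

The brute-force search in `nearest` visits every reference entry for every query descriptor, so its cost grows linearly with the size of the reference database.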
As with many content-based retrieval systems, and as seen in the above-mentioned prior art, one of the main difficulties is the cost of searching for similar objects in a large database (DB).