The convergence of networks, devices, and services, combined with technological advances in digital storage, multimedia compression, and the miniaturization of digital cameras, has led to explosive growth in online video content. In addition to professionally produced video content, user-generated content and content produced by hardcore amateurs are also on the rise. Videos can easily be shared over the Internet using popular video sharing sites such as YouTube and Yahoo! Video. The increasing volume of online digital video content and the large amount of information contained within each video make it a challenge to search for and retrieve relevant video files from a large collection. Video data management systems aim to reduce this complexity by indexing the video files.
Indexing of video content, as well as many digital watermarking algorithms, requires the video to be split into scenes. Scene change detection (SCD) is used to segment videos into contiguous scenes. A human viewer detects scene changes instantly, but automating the process requires substantial computational resources and efficient, often complex, algorithms. Scene change detection is a primary requirement of video processing applications that generate the data needed by video data management systems and digital rights management (DRM) systems. It is a fundamental step in content-based video retrieval systems, video watermarking systems, video fingerprinting systems, video annotation systems, video indexing methods and video data management systems. Scene change data can be used in DRM systems for effective protection of intellectual property rights by watermarking and fingerprinting selected scenes.
A video is a sequence of scenes, and a scene is a sequence of images called frames. Scene changes in videos can be either gradual or abrupt. Abrupt scene changes can result from editing cuts. Gradual scene changes result from spatial effects such as zoom, camera pan and tilt, dissolve, fade in, fade out, etc. Effective detection of scene changes depends on finding the similarity or the difference between adjacent frames, so SCD usually involves measuring some difference between successive frame images. Several metrics are used to compute the difference between two frames. Template matching, histogram comparison, and χ² color histogram comparison are some of the techniques used to measure the inter-frame difference.
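As an illustration of the histogram-based inter-frame difference measures mentioned above, the following is a minimal sketch rather than any particular published method. It assumes grayscale frames given as non-empty rows of 0..255 intensities, and it uses one common symmetric form of the χ² distance; the function names, the bin count and the choice of χ² variant are assumptions made for this example.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # assumed: grayscale rows of 0..255 intensities


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram with `bins` equal-width bins."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            counts[pixel * bins // 256] += 1
            total += 1
    if total == 0:
        raise ValueError("frame must contain at least one pixel")
    return [c / total for c in counts]


def histogram_difference(a: Frame, b: Frame, bins: int = 16) -> float:
    """Sum of absolute bin-by-bin differences; 0 for identical histograms."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    return sum(abs(x - y) for x, y in zip(ha, hb))


def chi_square_difference(a: Frame, b: Frame, bins: int = 16) -> float:
    """Symmetric chi-square distance (one of several variants in use)."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    # Bins that are empty in both histograms contribute nothing.
    return sum((x - y) ** 2 / (x + y) for x, y in zip(ha, hb) if x + y > 0)


if __name__ == "__main__":
    dark = [[10] * 4 for _ in range(4)]
    bright = [[240] * 4 for _ in range(4)]
    print(histogram_difference(dark, bright))
    print(chi_square_difference(dark, bright))
```

A detector built on such a measure would typically compare the value against a threshold for each pair of successive frames; how that threshold is chosen is a separate design decision not covered by this sketch.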
The existing scene change detection algorithms can be classified into two groups. One group is the compressed domain, which consists of algorithms that operate on compressed data. The other group is the uncompressed domain (pixel domain), which consists of algorithms that operate on pixel data.
Algorithms in the compressed domain operate on compressed data. Examples include algorithms based on macroblocks in MPEG compressed video, algorithms based on motion characterization and segmentation for detecting scene changes in MPEG compressed video, algorithms based on statistical sequential analysis of compressed bit streams, and algorithms based on feature extraction from motion information and vectors, edges, or luminance information.
Algorithms in the uncompressed domain (pixel domain) operate directly on pixel data. Examples include algorithms based on color diagrams, algorithms based on color histograms and fuzzy color histograms, algorithms based on edge detection and examination of edge differences, and algorithms based on background difference, background tracking and object tracking.
U.S. Pat. No. 7,110,454 discloses a system and method for detecting scene changes in a sequence of video frames utilizing a combination of a plurality of difference metrics including an interframe difference metric, a histogram difference metric and an interframe variance difference metric, as well as adaptive threshold level selection methods to dynamically select appropriate threshold levels for each of the difference metrics. The interframe and histogram difference metrics are used to identify abrupt scene changes and the interframe variance difference metric is used to identify gradual scene changes. The identified gradual and abrupt scene changes are validated by applying a plurality of conditions.
U.S. Pat. No. 5,099,322 discloses a system which detects scene changes in a sequence of video images by analyzing the sequence for abrupt frame-to-frame changes in certain image features. The system accepts the signal into a quantizer, which digitizes the image and stores it in a frame buffer. An image processor, a component of the system, analyzes the digitized images and determines certain features which a decision processor can use to detect a scene change.
US 2003228056 discloses a process and apparatus for identifying abrupt cuts or scene changes in any ordered sequence of images. In one specific embodiment, two or more consecutive images from a sequence are introduced to a segmenter as digital frames. The segmenter independently divides each of these frames into pixel regions or segments according to some common characteristic so that every pixel belongs to exactly one segment. A segment analysis unit then performs some statistical analysis on the segment data for each of the frames and generates composite statistics for each frame. A frame comparison unit then examines these composite statistics to determine whether these frames belong to a consistent scene of images. If the composite statistics for these frames differ sufficiently, the comparison unit declares the latter frame in the sequence to belong to a new scene. This information may then be transmitted back to the data source for the purpose of marking the scene change or for any other purpose.
WO/2007/142646 discloses an apparatus and method for detecting scene change by using a sum of absolute histogram difference (SAHD) and a sum of absolute display frame difference (SADFD). The apparatus and method use the temporal information in the same scene to smooth out the variations and accurately detect scene changes. The apparatus and method can be used for both real-time (e.g., real-time video compression) and non-real-time (e.g., film post-production) applications.
WO/2007/078801 discloses a system and method for scene change detection in a video sequence employing a randomly sub-sampled partition voting (RSPV) algorithm. In the video sequence, a current frame is divided into a number of partitions. Each partition is randomly sub-sampled and a histogram of the pixel intensity values is built to determine whether the current partition differs from the corresponding partition in a reference frame. A bin-by-bin absolute histogram difference between a partition in the current frame and a co-located partition in the reference frame is calculated. The histogram difference is compared to an adaptive threshold. If the majority of the examined partitions have significant changes, a scene change is detected. The RSPV algorithm is motion-independent and characterized by a significantly reduced cost of memory access and computations.
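The partition-voting idea described above can be sketched as follows. This is a simplified illustration and not the method of the cited publication: it assumes grayscale frames of equal size, square partitions, the same random sample positions in both frames, and a fixed threshold in place of the adaptive threshold the publication describes. All names and default values are hypothetical.

```python
import random
from typing import Sequence

Frame = Sequence[Sequence[int]]  # assumed: grayscale rows of 0..255 intensities


def partition_changed(cur: Frame, ref: Frame, top: int, left: int, size: int,
                      samples: int, bins: int, threshold: float,
                      rng: random.Random) -> bool:
    """Compare sub-sampled histograms of one pair of co-located partitions."""
    h_cur = [0] * bins
    h_ref = [0] * bins
    for _ in range(samples):
        # Assumption: the same random positions are sampled in both frames.
        y = top + rng.randrange(size)
        x = left + rng.randrange(size)
        h_cur[cur[y][x] * bins // 256] += 1
        h_ref[ref[y][x] * bins // 256] += 1
    # Bin-by-bin absolute histogram difference, normalized to [0, 2].
    diff = sum(abs(a - b) for a, b in zip(h_cur, h_ref)) / samples
    return diff > threshold  # fixed threshold here; the source uses an adaptive one


def is_scene_change(cur: Frame, ref: Frame, size: int = 4, samples: int = 8,
                    bins: int = 16, threshold: float = 1.0, seed: int = 0) -> bool:
    """Declare a scene change if a majority of partitions changed."""
    rng = random.Random(seed)
    height, width = len(cur), len(cur[0])
    votes = total = 0
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            total += 1
            if partition_changed(cur, ref, top, left, size, samples,
                                 bins, threshold, rng):
                votes += 1
    return total > 0 and votes * 2 > total


if __name__ == "__main__":
    dark = [[10] * 8 for _ in range(8)]
    bright = [[240] * 8 for _ in range(8)]
    print(is_scene_change(bright, dark))
    print(is_scene_change(dark, dark))
```

Because only `samples` pixels are read per partition, the number of memory accesses per frame is independent of the partition size, which is the kind of cost reduction the passage above attributes to sub-sampling.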
US 20110051809 discloses scene change detection in encoding digital pictures. A statistical quantity μ_M is calculated for a given section in a current picture. A window of one or more sections is defined around a co-located section in a previous picture. A statistical sum E is calculated over the sections in the window. A difference between the statistical sum E and the statistical quantity μ_M is calculated. The difference between E and μ_M is used to determine whether the given section is a scene-change section. Whether the current picture is a scene-change picture may be determined from the number of scene-change sections. Information indicating whether or not the current picture is a scene-change picture may be stored or transferred.
US 20060239347 discloses a method and system for rate estimation in a video encoder. The method and system use a motion estimation metric to determine the position of a scene change. The average of the motion estimation metric is computed for a set of pictures. When the change in the average of the motion estimation metric exceeds a threshold, a scene change is declared. Declaring a scene change prior to video encoding enables a corresponding bit allocation that can preserve perceptual quality.
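The rule described above, in which a scene change is declared when the average of a per-picture metric changes by more than a threshold, might be sketched as follows. The cited publication does not specify the details used here: the comparison of two adjacent fixed-size windows, the window length and the threshold value are all assumptions, and the input is simply a list of metric values, one per picture.

```python
from typing import List, Sequence


def scene_changes_from_metric(metric: Sequence[float], window: int = 3,
                              threshold: float = 10.0) -> List[int]:
    """Return indices i where the mean of metric[i:i+window] differs from
    the mean of metric[i-window:i] by more than `threshold`."""
    changes: List[int] = []
    for i in range(window, len(metric) - window + 1):
        before = sum(metric[i - window:i]) / window
        after = sum(metric[i:i + window]) / window
        if abs(after - before) > threshold:
            changes.append(i)
    return changes


if __name__ == "__main__":
    # Hypothetical per-picture metric values with a jump at index 4.
    values = [1, 1, 1, 1, 50, 50, 50, 50]
    print(scene_changes_from_metric(values, window=2, threshold=30.0))  # [4]
```

Averaging over a window makes the decision less sensitive to a single noisy picture, at the cost of needing `window` pictures of look-ahead before a change can be declared.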
The existing technologies have various limitations. They do not identify scene changes with high precision and recall, and their efficiency is low because of high false positive and false negative rates. For most algorithms, recall and precision values for scene change detection range from 70% to 90%, depending on the content of the video. Many algorithms are sensitive to object and camera motion, such as zooming and panning. Luminance variation, as in cases of excessive brightness change or flickering, causes scenes to be segmented incorrectly. Some algorithms fail when a scene change is surrounded by frames of high motion. Algorithms do not perform consistently across transition types such as a cut, a fade, a dissolve or a wipe. A cut is a hard boundary. A fade is a scene transition that lasts for a few frames. Fade in and fade out are two different kinds of fade. A dissolve is a simultaneous occurrence of a fade in and a fade out. A wipe is a scene transition in which a virtual line moving across the screen clears the old scene and displays the new scene.
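For reference, the precision and recall figures discussed above can be computed from a set of detected scene-change frame indices and a set of true ones, as in the following minimal sketch. It assumes exact index matches count as correct detections and returns 0.0 when a ratio is undefined; the function name and both conventions are choices made for this example.

```python
from typing import Iterable, Tuple


def precision_recall(detected: Iterable[int],
                     actual: Iterable[int]) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    detected_set, actual_set = set(detected), set(actual)
    true_positives = len(detected_set & actual_set)
    # Convention assumed here: an undefined ratio is reported as 0.0.
    precision = true_positives / len(detected_set) if detected_set else 0.0
    recall = true_positives / len(actual_set) if actual_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical frame indices: one false positive (120), two misses (75, 130).
    print(precision_recall([10, 50, 90, 120], [10, 50, 75, 90, 130]))  # (0.75, 0.6)
```

A false positive (a detected change that is not a real one) lowers precision, while a false negative (a missed change) lowers recall, which is why both rates are cited together when comparing detectors.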
Thus, there is a need to overcome the problems of the existing technology. Therefore, the present inventors have developed computer-implemented methods, systems and computer-readable media for detecting scene changes in a video, which propose an efficient 2-Pass Abrupt Scene Change Detection (2PASCD) algorithm. The algorithm identifies abrupt scene changes in the video efficiently, and also identifies scenes that have been incorrectly segmented as two different scenes and combines them.