U.S. Pat. No. 5,956,076, granted Sep. 21, 1999, for “System and Method for Hierarchical Summarization and Browsing,” of Krishna Ratakonda.
This invention relates to detecting transitions in digital video sequences, wherein the transitions include dissolves, fades, and dissolves including fades.
A transition is a special visual effect that softens an abrupt scene change in a video sequence. Dissolve transitions are periods of time during which the content of a video sequence gradually changes from one scene to another, usually according to a pre-defined pattern in time. They occur commonly in professional footage and in consumer video shot by advanced users of high-end cameras. Within a dissolve region, the next scene appears (fades in) while the first scene fades out, and by the end of the dissolve region the next scene has replaced the first. Dissolves may also occur as a combination of a fade-out to a blank screen followed by a fade-in to the second scene. Dissolve regions vary in duration, from 1 sec. in professional videos to about 6 sec. in videos shot by recently available digital camcorders, such as the Sharp® ViewCam®, model VL-DC1.
The purpose of detecting dissolve and fade transitions is to pre-process uncompressed or compressed digital video sequences, either prior to preparing a video summary, which enables a user to review a number of video sequences quickly, or during editing. Such pre-processing avoids undesirable effects such as the detection of spurious keyframes by video editing/indexing systems during a dissolve or a fade. Such spurious keyframes have little value in video summaries.
Fades are generally transitions between a scene and a given color, usually black or white. A fade region is called a fade-in when the transition goes from a fixed color to a video scene, and a fade-out when the transition goes from a video scene to a fixed color. Color, as used herein, includes black and white. Dissolve transitions may include an intermediary fade-to-white, fade-to-black or fade-to-gray phase. The first and the last image in a dissolve or a fade transition are called “anchor frames.” The video scenes in the dissolve or fade transition may feature either static or moving content.
The known prior art is generally concerned with detecting dissolve and fade regions in a statistical setting. Furthermore, the prior art cited below makes no reference to the capability to work directly in a compressed video domain. Previous work in this area typically assumes a model for the dissolve, i.e., a model for the variation in the image intensities. Such work assumes that a dissolve results in a linear change of intensities between the anchor frames, which are the first and last frames in the dissolve event. It may then be shown that the intermediary frames between the anchor frames have a parabolic profile in terms of standard deviation of intra-frame intensities, i.e., the plot of frame number vs. intra-frame standard deviation for intermediary frames has a parabolic profile. This profile is used as a signature to parse for a dissolve in the video sequence. However, this profile may also occur in other parts of the sequence that are not associated with a dissolve. In order to remove such spurious dissolve detections, known techniques limit the maximum duration of a dissolve to under one second. This artificial limitation may not be satisfied in practice. Additionally, the linear model may not always hold. A dissolve obtained from a camcorder, such as the Sharp® ViewCam®, model VL-DC1, is typically piece-wise linear. Analog camcorders, which use capacitive circuitry, may produce a different dissolve profile altogether because the voltage across a capacitor changes exponentially.
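The parabolic signature assumed by this prior work can be illustrated with a short Python sketch. It assumes the linear model, in which each intermediary frame is a pixel-wise weighted average of the two anchor frames, and it uses synthetic random frames rather than real video. The function name and the frame representation (flat lists of intensities) are hypothetical. Under this model the intra-frame variance is exactly quadratic in the blend weight, so the standard deviation dips between the anchor frames.

```python
import math
import random
import statistics


def linear_dissolve_variances(anchor_a, anchor_b, n_frames):
    """Intra-frame intensity variance of each frame in a linear dissolve.

    Frame k is the pixel-wise blend (1 - alpha) * anchor_a + alpha * anchor_b,
    with alpha rising linearly from 0 to 1 (the linear model assumed here).
    """
    variances = []
    for k in range(n_frames):
        alpha = k / (n_frames - 1)
        blended = [(1 - alpha) * a + alpha * b
                   for a, b in zip(anchor_a, anchor_b)]
        variances.append(statistics.pvariance(blended))
    return variances


# Synthetic anchor frames: flat lists of pixel intensities, not real video.
rng = random.Random(0)
anchor_a = [rng.uniform(0, 255) for _ in range(2000)]
anchor_b = [rng.uniform(0, 255) for _ in range(2000)]

variances = linear_dissolve_variances(anchor_a, anchor_b, 11)
std_profile = [math.sqrt(v) for v in variances]

# A quadratic sequence has constant second differences.
second_diffs = [variances[k + 1] - 2 * variances[k] + variances[k - 1]
                for k in range(1, len(variances) - 1)]
```

For these uncorrelated synthetic frames, `std_profile` is lowest near the middle of the dissolve, and `second_diffs` is constant up to rounding. If the blend weight were piece-wise linear or exponential in the frame number instead, the variance would no longer be quadratic in the frame number, which is the limitation noted above.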
Aigrain et al., “The Automatic Real-Time Analysis of Film Editing and Transition Effects and its Applications,” Computer and Graphics, Vol. 18, No. 1, pp. 93-103 (1994), propose statistical models for detecting cross-dissolves, fade-ins and fade-outs. These models are built on the assumption that the transitions are linear. The case where a fade-to-gray transition is part of a dissolve transition is not considered.
U.S. Pat. No. 4,319,286, to Hanpachern, describes circuitry which detects a temporary loss of video and audio signals. The patent describes a “commercial killer” which captures the rapid fade-to-black transition that occurs before a commercial in continuous, non-sampled, digital video signals.
U.S. Pat. Nos. 5,245,436 and 5,283,645, both to Alattar, describe sampled digital video inputs. U.S. Pat. No. 5,283,645 describes a statistical framework for detecting dissolves. The proposed method assumes that the dissolve transition is linear in time. U.S. Pat. No. 5,245,436 describes a mechanism for detecting fade-ins (transitions from a solid color, such as black, to a moving video scene) or fade-outs (transitions from a moving video scene to a solid color, such as black), based on measuring the mean difference and the relative mean change between consecutive video frames. The decision whether or not a fade occurs is made on a frame-by-frame basis and relies on comparing the variation of the overall image mean value against a set of pre-defined values.
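A frame-by-frame test of this general kind can be sketched as follows. This is only an illustration and not the procedure of the cited patent: the function names, both threshold values, and the flat-list frame representation are assumptions made for the example.

```python
def frame_mean(frame):
    """Mean intensity of one frame, given as a flat list of pixel values."""
    return sum(frame) / len(frame)


def flag_fade_steps(frames, min_abs_change=1.0, min_rel_change=0.02):
    """Flag each consecutive frame pair whose mean changes enough.

    A pair is flagged when both the absolute mean difference and the
    relative mean change exceed their thresholds. Both thresholds are
    hypothetical values chosen for illustration.
    """
    means = [frame_mean(f) for f in frames]
    flags = []
    for prev, cur in zip(means, means[1:]):
        diff = abs(cur - prev)
        rel = diff / max(abs(prev), 1e-9)  # guard against a black frame
        flags.append(diff >= min_abs_change and rel >= min_rel_change)
    return flags


# Synthetic fade-out: a flat gray scene scaled down to black in five steps.
scene = [100.0] * 64
fade_out = [[p * (5 - k) / 5 for p in scene] for k in range(6)]
fade_flags = flag_fade_steps(fade_out)

# A static scene: the mean never changes, so nothing is flagged.
static_flags = flag_fade_steps([scene] * 6)
```

Because each decision uses only two consecutive frame means, a test like this depends heavily on the chosen thresholds, which is the reliance on pre-defined values described above.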
A method of detecting transitions in a video sequence includes inputting a digital video sequence into a video processor; detecting a monotonically varying image intensity profile of the digital video sequence; and tagging the digital video sequence associated with such an intensity profile as a transition event.
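One way to picture the detecting and tagging steps is the Python sketch below. The text above states only that a monotonically varying intensity profile is detected and the associated sequence is tagged. The per-pixel vote, the sliding window, the threshold values, and all function names here are assumptions made for illustration, and the frames are synthetic.

```python
import random


def is_monotonic(values, min_total_change=5.0):
    """True if the values never reverse direction and change enough overall."""
    steps = [b - a for a, b in zip(values, values[1:])]
    never_falls = all(s >= 0 for s in steps)
    never_rises = all(s <= 0 for s in steps)
    total_change = abs(values[-1] - values[0])
    return (never_falls or never_rises) and total_change >= min_total_change


def monotonic_fraction(frames):
    """Fraction of pixel positions whose intensity varies monotonically."""
    n_pixels = len(frames[0])
    hits = sum(is_monotonic([frame[i] for frame in frames])
               for i in range(n_pixels))
    return hits / n_pixels


def tag_transition_windows(frames, window=5, min_fraction=0.8):
    """Return start indices of windows tagged as part of a transition."""
    tagged = []
    for start in range(len(frames) - window + 1):
        if monotonic_fraction(frames[start:start + window]) >= min_fraction:
            tagged.append(start)
    return tagged


# Synthetic sequence: 8 static frames of scene A, a 6-frame dissolve,
# then 8 static frames of scene B. Frames are flat lists of intensities.
rng = random.Random(1)
scene_a = [rng.uniform(0, 255) for _ in range(500)]
scene_b = [rng.uniform(0, 255) for _ in range(500)]
dissolve = [[(1 - k / 7) * a + (k / 7) * b for a, b in zip(scene_a, scene_b)]
            for k in range(1, 7)]
frames = [scene_a] * 8 + dissolve + [scene_b] * 8

tagged = tag_transition_windows(frames)
static_tagged = tag_transition_windows([scene_a] * 10)
```

The monotonicity test does not depend on whether the blend is linear, piece-wise linear, or exponential, since all of these change each pixel in one direction only. The strict comparison used here would need a tolerance to cope with noisy video, which this sketch does not attempt.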
The invention is a method for detecting a dissolve which overcomes most of the difficulties in the known prior art. The method is independent of the model adopted for dissolve creation and is more resistant to spurious dissolves. The method is also resistant to limited motion within the dissolve sequence.
An object of the invention is to provide a new technique for detecting a dissolve event in a video sequence.
Another object of the invention is to provide a new technique for detecting a dissolve event in a video sequence that is functional with both uncompressed digital video and DCT-based compressed video, such as JPEG and MPEG.
Another object of the invention is to provide a new technique for detecting a dissolve event in a video sequence that is functional with MPEG-2 compressed video, and wherein dissolve event detection is performed with minimal decoding of the MPEG-2 compressed bitstream.
Another object of the invention is to provide a new technique for detecting a dissolve event in a video sequence that provides accurate dissolve/fade detection in the presence of noise.
Another object of the invention is to provide a new technique for detecting a dissolve event in a video sequence that is insensitive to scene motion.
A further object of the invention is to provide a unified, fast, and yet robust method for detecting dissolve events, including fade-ins and fade-outs, in sampled digital video sequences.
Yet another object of the invention is to provide such detection capability independently from the mechanisms or the models used to generate the dissolve events.