1. Field of the Invention
The present invention relates to a method for detecting a gradual shot transition used for editing a video stream, and more particularly to a wipe and special effect detection method based on the spatio-temporal distribution and variance characteristics of macro blocks.
2. Background of the Related Art
[Video Indexing and Non-linear Browsing Method]
Advances in digital video and image/video recognition technologies allow a user to search, filter, and browse only the desired part of a video. The basic techniques for non-linear video browsing and search are shot segmentation and shot clustering, which are at the core of video analysis.
Accordingly, shot segmentation has been studied extensively, whereas research on shot clustering has only just begun.
FIG. 1 shows an example of a non-linear video browsing interface.
Numerals 100, 101, 102, and 103 designate a table of contents, a display screen, a frame representing a specific interval, and a graphic of function keys for browsing, respectively.
Using an interface (table of contents interface: TOC interface) such as that shown in FIG. 1, a user can access a part of interest in a video without viewing the entire video. Such an interface is very useful for digital video browsing. Shot segmentation and shot clustering are known to be very important for this kind of video browsing.
[Relations Between Shot Segmentation And Shot Clustering]
FIG. 2 shows a diagram illustrating the relation between shot segmentation and shot clustering.
A video stream is divided into scene units (each of which may be further divided into sub scenes). Each scene is divided into shots, each of which consists of a sequence of video frames.
Shot segmentation is a technique for dividing a video into individual shots. Shot clustering is a technique for building a video structure of logical scene units by combining similar shots among the individual shots on the basis of time/image/motion/audio similarity.
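The video, scene, shot, and frame hierarchy described above may be pictured with a small data structure. The following is only an illustrative sketch: the names `Shot`, `Scene`, and `shots_from_boundaries` are hypothetical, and the grouping of shots into a scene is done by hand here rather than by any similarity measure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    start_frame: int  # index of the first frame of the shot (inclusive)
    end_frame: int    # index of the last frame of the shot (inclusive)


@dataclass
class Scene:
    # A scene is a logical unit made of one or more shots.
    shots: List[Shot] = field(default_factory=list)


def shots_from_boundaries(num_frames: int, boundaries: List[int]) -> List[Shot]:
    """Split frames 0..num_frames-1 into shots.

    Each entry of `boundaries` is assumed to be the index of the first
    frame of a new shot (a hypothetical output of shot segmentation).
    """
    starts = [0] + sorted(b for b in set(boundaries) if 0 < b < num_frames)
    ends = [s - 1 for s in starts[1:]] + [num_frames - 1]
    return [Shot(s, e) for s, e in zip(starts, ends)]


# Shot segmentation result: new shots begin at frames 4 and 7.
shots = shots_from_boundaries(10, [4, 7])
# Shot clustering result (chosen by hand for illustration):
# the first two shots form one scene, the last shot another.
scenes = [Scene(shots[:2]), Scene(shots[2:])]
```

In this sketch, shot segmentation supplies the boundaries and shot clustering decides which consecutive shots belong to the same scene.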
[Shot Transition and Scene]
A shot is a sequence of video frames obtained from a camera without interruption and is the basic unit for the analysis and construction of a video.
Generally, a video is constructed by connecting many shots, and the shots are connected using various editing effects. The video editing effects are broadly divided into abrupt shot transition (hard cut) and gradual shot transition. The hard cut is the most widely used shot transition. Gradual shot transitions include fade, dissolve, wipe, and other special effects.
Gradual shot transition is used far less often than abrupt shot transition. Yet gradual shot transition carries special meanings that abrupt shot transition does not.
For instance, a fade-in/out mostly indicates a recollection of the past or the beginning of a new scene. The dissolve editing effect is mostly used for scene transitions, that is, shot transitions of a large unit, as well as for other shot transitions of a small unit.
Here, a scene, which is constructed with a plurality of shots or sub scenes, is a construction unit of a logical story. Wipe and other special effects, which are widely applied to shot transitions between scene units, are also editing effects of this kind.
When information about a program genre or content characteristics is known, detection of the gradual shot transition may be used as a very important clue for the segmentation of a video stream into units of logical story construction.
Therefore, the detection of gradual shot transition is important for the development of a shot clustering method as well as shot segmentation.
[Related Art 1: Twin Comparison Method]
Various studies have reported that, among shot segmentation methods, the color histogram based method using global color distribution performs best.
However, the color histogram based shot segmentation method using global color distribution is very effective at detecting abrupt shot transition but very poor at detecting gradual shot transition.
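The behavior described above may be illustrated with a minimal sketch. It assumes, for simplicity, that each frame is a flat list of 8-bit gray levels rather than a color image, that all frames have the same number of pixels, and that a single threshold is applied to a normalized histogram difference. The function names, the bin count, and the threshold value are illustrative assumptions, not part of any cited method.

```python
def histogram(frame, bins=8, max_value=256):
    """Count how many pixel values of `frame` fall into each of `bins` bins."""
    counts = [0] * bins
    for v in frame:
        counts[v * bins // max_value] += 1
    return counts


def histogram_difference(a, b, bins=8):
    """Normalized L1 distance between two histograms, in the range 0..1.

    Assumes frames `a` and `b` contain the same number of pixels.
    """
    ha, hb = histogram(a, bins), histogram(b, bins)
    return sum(abs(x - y) for x, y in zip(ha, hb)) / (2 * len(a))


def detect_hard_cuts(frames, threshold=0.5):
    """Return indices i where the difference between frame i-1 and frame i
    exceeds `threshold`."""
    return [i for i in range(1, len(frames))
            if histogram_difference(frames[i - 1], frames[i]) > threshold]


dark, bright = [10] * 100, [200] * 100

# Abrupt transition: the global distribution changes in one step.
abrupt = [dark, dark, bright, bright]

# Gradual transition: bright pixels replace dark ones 20 at a time,
# so each frame-to-frame difference is only 0.2.
gradual = [[200] * k + [10] * (100 - k) for k in range(0, 101, 20)]

cuts_in_abrupt = detect_hard_cuts(abrupt)    # [2]
cuts_in_gradual = detect_hard_cuts(gradual)  # []
```

The abrupt sequence is detected at frame 2, while the gradual sequence produces no detection at all, because no single frame-to-frame difference reaches the threshold even though the first and last frames are entirely different.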
Accordingly, many efforts have been made to detect gradual shot transition.
Zhang et al. proposed a twin comparison method for distinguishing and detecting abrupt shot transition and gradual shot transition. The twin comparison method establishes two threshold values and compares the magnitude of the picture difference between frames with the two threshold values, thereby distinguishing abrupt shot transition from gradual shot transition.
Unfortunately, this method cannot distinguish among the various types of gradual shot transition and produces many false alarms and missed detections because it is sensitive to camera motion and object movement. In addition, the method is slow since it must continuously compute picture differences between neighboring frames.
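The two-threshold idea may be sketched as follows. This is a simplified illustration and not the implementation of Zhang et al.: it assumes the frame-to-frame picture differences have already been computed, and it accumulates consecutive differences as a stand-in for comparing the candidate start frame with later frames. The names `twin_comparison`, `t_high`, and `t_low` are hypothetical.

```python
def twin_comparison(diffs, t_high, t_low):
    """Classify transitions from a list of picture differences.

    diffs[i] is assumed to be the difference between frame i and frame i+1.
    Returns tuples ("cut", i, i + 1) or ("gradual", start_frame, end_frame).
    """
    events = []
    start = None        # first frame of a candidate gradual transition
    accumulated = 0.0   # sum of differences since the candidate started
    for i, d in enumerate(diffs):
        if d >= t_high:
            # A single large difference is treated as an abrupt transition.
            events.append(("cut", i, i + 1))
            start, accumulated = None, 0.0
        elif d >= t_low:
            # A moderate difference opens or extends a candidate.
            if start is None:
                start = i
            accumulated += d
        else:
            # The candidate ends; accept it only if the accumulated
            # change is as large as that of an abrupt transition.
            if start is not None and accumulated >= t_high:
                events.append(("gradual", start, i))
            start, accumulated = None, 0.0
    if start is not None and accumulated >= t_high:
        events.append(("gradual", start, len(diffs)))
    return events


diffs = [0.05, 0.9, 0.04, 0.2, 0.25, 0.2, 0.03]
events = twin_comparison(diffs, t_high=0.5, t_low=0.15)
# events == [("cut", 1, 2), ("gradual", 3, 6)]
```

The sketch also shows the limitations noted above: it reports only that a gradual transition occurred, not whether it was a fade, dissolve, or wipe, and any sustained camera motion that keeps the differences above the lower threshold would be accepted in the same way.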
[Related Art 2: Method Using Edge Image Picture Difference]
W. Wolf et al. proposed a multi-step wipe shot transition detection method based on edge variance statistics obtained by pixel-unit processing and on pixel-unit picture differences between frames.
The method of W. Wolf et al. is characterized in that the area where movement is detected is compared with a model of the characteristics of the wipe shot transition technique.
However, this method requires images decoded frame by frame to detect edge transitions and must scan all the frame data to detect edges, thereby requiring a large amount of processing.
[Related Art 3: Edge Change Fraction and Method Using The Same Transition]
R. Zabih et al. proposed a method that measures the ratio between entering edges and exiting edges and detects and classifies wipes and other shot transitions on the basis of the variation in that ratio.
Unfortunately, this method requires every frame to be decoded to the picture level for edge detection and needs image-unit matching to judge whether an edge is a newly entering edge, an exiting edge, or a previously existing edge, thereby reducing the processing speed.
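The classification of edges into entering and exiting edges may be sketched as follows. The sketch assumes that each frame has already been decoded and reduced to a set of (row, column) edge pixel coordinates, and that an edge pixel counts as "previous" when it lies within a small square neighborhood of an edge pixel in the other frame. The function names and the `radius` parameter are illustrative assumptions, and no motion compensation is performed.

```python
def dilate(edges, radius):
    """Grow every edge pixel into a (2*radius+1) x (2*radius+1) square."""
    return {(r + dr, c + dc)
            for (r, c) in edges
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)}


def edge_change_fractions(prev_edges, cur_edges, radius=1):
    """Return (entering fraction, exiting fraction) between two edge maps.

    Entering edges: edge pixels of the current frame that are not near
    any edge pixel of the previous frame.
    Exiting edges: edge pixels of the previous frame that are not near
    any edge pixel of the current frame.
    """
    entering = cur_edges - dilate(prev_edges, radius)
    exiting = prev_edges - dilate(cur_edges, radius)
    rho_in = len(entering) / len(cur_edges) if cur_edges else 0.0
    rho_out = len(exiting) / len(prev_edges) if prev_edges else 0.0
    return rho_in, rho_out


prev_edges = {(0, 0), (0, 1), (0, 2), (0, 3)}
cur_edges = {(0, 1), (0, 2), (5, 5), (5, 6)}

strict = edge_change_fractions(prev_edges, cur_edges, radius=0)    # (0.5, 0.5)
tolerant = edge_change_fractions(prev_edges, cur_edges, radius=1)  # (0.5, 0.0)
```

Even in this reduced form, every edge pixel of both frames has to be visited and matched against the other frame, which illustrates why processing of this kind at the picture level is slow.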
[Reasons for the Requirement of Processing in Compressed Domain: Execution Speed Matter]
In general, moving picture data are difficult to store and transmit because of their large size. To overcome this problem, the data are compressed by various image processing techniques and the images are later restored. MPEG is the most widely used representative compression method.
Recently, methods have been developed for performing shot segmentation directly in the compressed domain, using the characteristics of the MPEG compression technique, without decoding an MPEG-compressed video to the picture level. A major reason for carrying out shot segmentation in the compressed domain is application to real time systems or fast indexing of large-capacity multimedia databases. The detection performance of shot segmentation in the compressed domain is similar to that of methods in the non-compressed domain, while its execution speed is relatively fast.
[Disadvantage Summary of the Conventional Methods and Task of the Present Invention: Method Having Excellent Performance and Fast Performance Speed]
In brief, the previously studied methods run too slowly for a real time environment and therefore cannot be applied to a real time video indexing system, and their detection performance is also poor.