The present invention relates to a method and apparatus for video cut detection and, more particularly, to a method and apparatus for detecting cut points (points at which scenes are switched) from a plurality of image data sequences.
A point at which a scene in video is switched by a camera ON-OFF operation or by a cut break in video editing (including fade, wipe and the like) is commonly referred to as a cut point. Camera operations (pan, zoom and so forth) and object movements in video are not regarded as cuts. The detection of such video cut points is also called scene change detection, and various methods have been proposed.
A typical method proposed so far calculates the intensity differences between two temporally successive images I_t and I_{t-1} in a sequence of captured images at their respectively corresponding pixels; when the sum of the absolute values of the intensity differences over the entire frame (commonly called the inter-frame difference), represented by D(t), is larger than a given threshold value, t is regarded as a cut point (Otsuji, Tonomura and Ohba, "Motion Picture Browsing Using Intensity information," Technical Report of Institute of Electronics, Information and Communication Engineers of Japan, IE90-103, 1991). In this instance, a pixel changing area, intensity histogram difference, block-wise color correlation, or χ² test quantity may sometimes be used as D(t) in place of the inter-frame difference (Otsuji and Tonomura, "Studies of Automatic Video Cut Detection Method," Technical Report of Institute of Television Engineers of Japan, Vol. 16, No. 43, pp. 7-12). This method has the shortcoming of erroneously detecting a rapid object motion or flashlight in video as a cut.
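The thresholding scheme described above can be sketched as follows. This is only a minimal illustration, assuming grayscale frames given as equally sized lists of pixel rows; the names `inter_frame_difference` and `detect_cuts`, the pixel values and the threshold are hypothetical and are not taken from the cited reports.

```python
from typing import List, Sequence

# Assumed layout: a grayscale frame is a sequence of rows of pixel intensities.
Frame = Sequence[Sequence[int]]


def inter_frame_difference(prev: Frame, curr: Frame) -> int:
    """D(t): sum over all pixels of |I_t(x, y) - I_{t-1}(x, y)|."""
    return sum(
        abs(c - p)
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )


def detect_cuts(frames: Sequence[Frame], threshold: int) -> List[int]:
    """Return every frame index t (t >= 1) whose D(t) exceeds the threshold."""
    return [
        t
        for t in range(1, len(frames))
        if inter_frame_difference(frames[t - 1], frames[t]) > threshold
    ]


# Two dark frames followed by two bright frames: one abrupt change at t = 2.
dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
cuts = detect_cuts([dark, dark, bright, bright], threshold=100)
```

Because only two successive frames enter each D(t), the result depends entirely on how large a single frame-to-frame change is relative to the chosen threshold.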
There has also been proposed a method in which the threshold is applied not to the inter-frame difference D(t) itself but to a value obtained by processing D(t) with various time filters (K. Otsuji and Y. Tonomura, "Projection Detecting Filter for Video Cut Detection," Proc. of ACM Multimedia 93, 1993, pp. 251-257). This method is almost free from the problem of falsely detecting a rapid object motion or flashlight in video as a cut.
The prior art possesses such problems as listed below.
A first problem is the inability to detect temporally slow scene changes. The reason for this is that, according to the conventional cut point detection method, the quantity representing the rate of scene change is computed from two temporally successive frames alone, and hence does not substantially reflect a scene change that extends over a long time.
Typical slow scene changes or transitions are special effects that are inserted in videos at the editing stage, such as fade-in, fade-out, dissolve and wipe. The fade-in is a technique which gradually increases the video signal level to cause an image to appear little by little. The fade-out is a technique which gradually decreases the video signal level to cause an image to disappear little by little. The dissolve is a technique which, at the time of transition from a scene A to a scene B, decreases the video signal level of the scene A while at the same time increasing the video signal level of the scene B, thereby causing the scene A to dissolve into the scene B. The wipe is a technique which wipes off the image of the scene A little by little while at the same time causing the image of the scene B to gradually emerge. Since scenes change slowly with these special effects (complete switching of a scene takes, for example, one second or so), the change cannot be detected by the conventional method of comparing two temporally successive images or frames (spaced around 1/30 sec apart). The reason for this is that the difference between two such temporally successive images in a slow scene change is so slight that it is hard to tell whether the difference is attributable to noise or to the scene change.
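The effect of a slow transition on the inter-frame difference can be illustrated numerically. The sketch below assumes a linear cross-fade as the model of a dissolve and a one-second transition at about 30 frames per second; the helper names `dissolve` and `frame_difference` and the pixel values are hypothetical.

```python
from typing import List


def dissolve(a: List[float], b: List[float], steps: int) -> List[List[float]]:
    """Linear cross-fade from frame a to frame b (assumed model), steps + 1 frames."""
    return [
        [(1 - k / steps) * pa + (k / steps) * pb for pa, pb in zip(a, b)]
        for k in range(steps + 1)
    ]


def frame_difference(prev: List[float], curr: List[float]) -> float:
    """Sum of absolute intensity differences between two flattened frames."""
    return sum(abs(c - p) for p, c in zip(prev, curr))


scene_a = [10.0] * 4    # flattened frame of scene A (illustrative values)
scene_b = [200.0] * 4   # flattened frame of scene B

frames = dissolve(scene_a, scene_b, steps=30)

# Difference an abrupt cut from A to B would produce in a single step.
abrupt = frame_difference(scene_a, scene_b)

# Largest difference between any two successive frames of the dissolve.
largest_step = max(
    frame_difference(frames[k - 1], frames[k]) for k in range(1, len(frames))
)
```

Under this model each successive-frame difference is roughly one thirtieth of the difference an abrupt cut would produce, so a threshold high enough to reject noise can easily lie above every step of the dissolve.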
A second problem is the false detection of flashlight in video as a cut. When a flash is used, the intensity level of the image rises instantaneously, and consequently the intensity difference between the images before and after the flash increases abruptly. Hence, with the conventional cut point detection scheme, which regards a sudden increase in the intensity difference as a cut point, the flashlight is erroneously detected as a cut point.
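The false detection caused by a flash can be reproduced with a small example. This is a sketch under assumed values: the flash is modeled as a uniform brightening of a single frame, and the pixel values, the brightening amount and the threshold are all hypothetical.

```python
from typing import List, Sequence


def frame_difference(prev: Sequence[int], curr: Sequence[int]) -> int:
    """Sum of absolute intensity differences between two flattened frames."""
    return sum(abs(c - p) for p, c in zip(prev, curr))


scene = [60, 62, 58, 61]                       # flattened grayscale frame
flash = [min(255, p + 150) for p in scene]     # same scene, one frame lit by a flash
frames: List[List[int]] = [scene, scene, flash, scene, scene]

threshold = 200
differences = [
    frame_difference(frames[t - 1], frames[t]) for t in range(1, len(frames))
]

# Frame indices reported as cuts although the scene never changes.
false_cuts = [t for t, d in enumerate(differences, start=1) if d > threshold]
```

The single flash frame yields two large successive differences, one on entering the flash and one on leaving it, and both exceed the threshold even though the scene before and after is identical.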
A third problem is misdetection of cut points in telecine-transformed video. The telecine transformation is a film-to-video conversion which, for televising video stored on film, converts the frame rate of the film (24 frames/s) to the frame rate of the television signal (30 frames/s). According to some telecine transformation schemes, one frame is added for every four frames. Since the added frame is produced by combining fields of the images before and after it, the image change between successive frames becomes small, and the traditional cut point detection method is liable to fail to detect the cut, for the same reasons given above with respect to the dissolve.
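How the added frame weakens a cut can be sketched as follows. The code models only the simplified scheme described above (one mixed frame inserted per four film frames), not any particular broadcast pulldown cadence; the position of the inserted frame, the name `telecine_group` and the pixel values are assumptions made for illustration.

```python
from typing import List, Tuple

Field = List[int]
Frame = Tuple[Field, Field]  # (top field, bottom field), an assumed representation


def telecine_group(film: List[Frame]) -> List[Frame]:
    """Turn four film frames into five video frames by inserting one mixed frame."""
    if len(film) != 4:
        raise ValueError("expected a group of exactly four film frames")
    a, b, c, d = film
    mixed = (b[0], c[1])  # top field from the frame before, bottom field from the frame after
    return [a, b, mixed, c, d]


def frame_difference(prev: Frame, curr: Frame) -> int:
    """Sum of absolute intensity differences over both fields."""
    return sum(
        abs(x - y)
        for prev_field, curr_field in zip(prev, curr)
        for y, x in zip(prev_field, curr_field)
    )


old = ([10, 10], [10, 10])          # frame of the scene before the cut
new = ([200, 200], [200, 200])      # frame of the scene after the cut

# The cut falls exactly where the mixed frame is inserted.
video = telecine_group([old, old, new, new])
diffs = [frame_difference(video[t - 1], video[t]) for t in range(1, len(video))]
```

In this arrangement the change is split across two steps, each half the size of the original cut, so a threshold tuned to the full-size difference would pass over both.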
To overcome the second and third problems mentioned above, there has been proposed a method for preventing the false detection and misdetection of cut points through the use of a time filter. In designing this time filter, however, it is necessary to check in advance what kind of video is to be handled; hence, this method is not suitable for real-time processing and is prone to false detection of flashlight in telecine-transformed video.