1. Field of the Invention
This invention pertains to video image sequence editing, and more particularly to automatic scene change detection.
2. Description of Related Art
Scene change detection in video systems has wide implications for visual information products. In the typical editing process for a video production, the director relies on a written log which describes the contents of each scene in the raw video footage and connects that information with the location on a tape by means of a time code number. An operator must manually produce the log by viewing the tape and making notations whenever a scene change occurs.
Another operation frequently performed in both viewing and editing video tapes is fast-forward and rewind. The user may fast-forward or rewind for the purpose of finding a particular scene. In prior art systems this may be done either at a very high speed, which does not allow viewing the images on the tape, or at an intermediate speed, which is not much faster than normal viewing speed. At high speed the user must guess the location of a scene and stop the tape at that location. Usually the desired scene is missed. The intermediate search speed is also unsatisfactory, however, because the user must continuously view the tape in an attempt to locate the desired scene. Furthermore, the intermediate speed is limited by the rate at which images can be comprehended by the human eye and mind. It would therefore be useful for users to be able to fast-forward and rewind on a scene-by-scene basis.
Scene change detection relies in part on detecting motion in a sequence of images. This area has seen much research and product development, primarily for defense applications of Automatic Target Recognition (ATR). Reliable, real-time motion extraction over a wide range of noise and scene conditions, however, has not been attained (Waxman, A. et al., Convected Activation Profiles and the Measurement of Visual Motion, CH2605-4/88/0000/0717, IEEE 1988 and Verri, A., and T. Poggio, Against Quantitative Optical Flow, CH2465-3/87/0000/0171, IEEE 1987).
Merely detecting motion is quite simple when there is neither camera motion nor a change in illumination. A pixel-to-pixel difference of two successive images followed by a threshold operation yields a motion gradient. If an object moves, the resulting large difference indicates that motion has occurred. Unfortunately, if the camera moves, or if the sensed illumination of a pixel changes, as when induced by shadows or clouds, a significant false motion gradient results. If one knows that the camera has moved or that noise is present, these effects can be removed (Thompson, W. and T. Pong, Detecting Moving Objects, CH2465-3/87/0000/0201, IEEE 1987). However, in video scenes of unknown origin, such as those on a video tape, this information is not available.
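By way of illustration only, the pixel-to-pixel difference and threshold operation described above may be sketched as follows. This is a minimal sketch and not a method taken from the cited references. The function names motion_mask and changed_fraction, the threshold value of 30, and the small synthetic frames are all assumptions introduced for this example. The sketch also reproduces the failure noted above: a uniform illumination change with no object motion marks every pixel as changed.

```python
from typing import List

# A grayscale frame as rows of 0-255 intensities (an assumed representation).
Frame = List[List[int]]


def motion_mask(prev: Frame, curr: Frame, threshold: int = 30) -> List[List[bool]]:
    """Mark each pixel whose absolute intensity change exceeds `threshold`.

    The default threshold of 30 is an arbitrary illustrative value.
    """
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have identical dimensions")
    return [
        [abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def changed_fraction(mask: List[List[bool]]) -> float:
    """Return the fraction of pixels flagged as changed."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(cell for row in mask for cell in row) / total


# Case 1: static camera, a bright 2x2 object shifts one pixel to the right.
BACKGROUND = 10
frame_a = [[BACKGROUND] * 6 for _ in range(4)]
frame_b = [[BACKGROUND] * 6 for _ in range(4)]
for r in (1, 2):
    for c in (1, 2):
        frame_a[r][c] = 200
    for c in (2, 3):
        frame_b[r][c] = 200

object_motion = changed_fraction(motion_mask(frame_a, frame_b))

# Case 2: nothing moves, but every pixel brightens by 50 (e.g. a cloud passing).
frame_c = [[min(255, v + 50) for v in row] for row in frame_a]
illumination_only = changed_fraction(motion_mask(frame_a, frame_c))

print(f"object motion:     {object_motion:.3f} of pixels flagged")
print(f"illumination only: {illumination_only:.3f} of pixels flagged")
```

In the first case only the leading and trailing edges of the object are flagged, whereas in the second case the entire frame is flagged although nothing in the scene has moved, which is the false motion gradient described above.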