The present invention generally relates to a method for describing the incidental data of motion picture image scenes necessary at the time of editing video or movie picture images and, more particularly, to a camera work detecting method for retrieving a target motion picture image frame from a motion picture image stored in video tape or video disk to realize efficient picture image editing.
U.S. Pat. No. 5,083,860, by the same applicant as the present application, discloses a prior art editing technique that automatically detects a change point in a motion picture image stored on video tape or the like, thereby detecting a scene transition and facilitating the tape-head searching operation during editing of the motion picture image. With the use of incidental data on the motions of the video camera itself, i.e., on such camera works as zooming and panning motions of the video camera for each detected scene, this editing system can support motion-picture image retrieving work such as "want to see a scene in which panning is carried out from right to left" or "want to see a scene immediately after zooming up", and motion-picture image editing work such as "want to correct the irregular panning speed" or "want a higher panning speed". To this end, a technique is required for detecting such camera works as the zooming and panning motions of the video camera and for automatically describing incidental data indicative of such motions.
Examples of such known techniques for detecting the motions of a video camera itself are disclosed in an article titled "DIFFERENTIAL ESTIMATION OF THE GLOBAL MOTION PARAMETERS ZOOM AND PAN" in a magazine "SIGNAL PROCESSING" 16 (1989) and in JP-A-2-157980.
The former is directed to presentation of basic analysis models for zooming and panning motions, while the latter is directed to correction of an unintentional picture image movement. In the above JP-A-2-157980, a plurality of relatively large detection zones (e.g., about four) are set on a display screen, a motion vector between frames is found for each zone on the basis of a correlation value based on a predetermined displacement, and the motion vector on the display screen is determined on the basis of the state of the correlation value and the reliability judgement result of a previous motion vector. A motion vector based on the vibration of the video camera vibrates around a zero vector. When a moving object appears on the display screen, the motion vector is found by adding together the motion of the moving object and the motion of the video camera, so that the motion vector at that time corresponds to the sum of the motion vector based on the vibration and a constant value. Accordingly, the motion vector based on the vibration can be separated, and the unintentional, undesirable picture image movement caused by the vibration of the video camera can be corrected on the basis of only the motion vector based on the vibration.
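The separation described above can be illustrated with a minimal sketch. It assumes that the constant value is estimated as the mean of the observed motion vectors over a short run of frames; this estimator, the function name split_vibration and the sample data are hypothetical and are not taken from JP-A-2-157980.

```python
from statistics import mean

def split_vibration(vectors):
    """Split inter-frame motion vectors into a constant offset and a residual.

    vectors: list of (dx, dy) motion vectors, one per frame pair.
    Returns (offset, residuals). The offset is the mean vector (an assumed
    estimate of the constant value); the residuals are the per-frame
    deviations from it, i.e. the component that vibrates around zero.
    """
    offset = (mean(v[0] for v in vectors), mean(v[1] for v in vectors))
    residuals = [(dx - offset[0], dy - offset[1]) for dx, dy in vectors]
    return offset, residuals

# Hypothetical data: a vibration of +/-1 pixel superimposed on a constant (3, 0).
observed = [(4, 1), (2, -1), (4, -1), (2, 1)]
offset, vibration = split_vibration(observed)
```

In this sketch only the residual component would be used for correcting the unintentional movement, while the constant offset is left untouched.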
According to the prior art techniques, a motion vector indicative of a slow motion of the camera itself cannot be accurately detected. As a result, such camera works as panning and zooming motions cannot be automatically detected, which prevents efficient picture editing operation.
More specifically, the system disclosed in JP-A-2-157980 is intended to detect the vibration of the video camera and gives no consideration to the detection of such camera works as panning and zooming motions. For example, a picture image obtained when the video camera is panned from right to left is judged as a moving object appearing on the display screen and cannot be regarded as a camera motion. Further, a picture image obtained when the camera is zoomed up or down is detected as motion vectors pointing radially with respect to the center of the screen. However, the system provides no means for converting the detected motion vectors into a zooming magnification value. Furthermore, the panning or zooming motion of the video camera is slower in speed than the undesirable hand movement of the camera operator. For this reason, the aforementioned system, which determines a motion on the screen through a single motion detection, cannot provide sufficient detection accuracy.
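To make the relation between radial motion vectors and a zooming magnification value concrete, the following is a minimal sketch under stated assumptions. It assumes a simple model in which a point at position p, measured from the screen center, moves to zoom * p + pan between two frames, and it fits zoom and pan to the measured vectors by least squares. The model, the function name fit_zoom_pan and the sample data are illustrative assumptions; they are not the method of either cited reference.

```python
def fit_zoom_pan(points, vectors):
    """Least-squares fit of the assumed model v = (zoom - 1) * p + pan.

    points:  list of (x, y) positions measured from the screen center.
    vectors: list of (dx, dy) motion vectors measured at those positions.
    Returns (zoom, (pan_x, pan_y)).
    """
    n = len(points)
    px = sum(p[0] for p in points) / n
    py = sum(p[1] for p in points) / n
    vx = sum(v[0] for v in vectors) / n
    vy = sum(v[1] for v in vectors) / n
    # Slope of the motion vectors against position, about the centroids.
    num = sum((p[0] - px) * (v[0] - vx) + (p[1] - py) * (v[1] - vy)
              for p, v in zip(points, vectors))
    den = sum((p[0] - px) ** 2 + (p[1] - py) ** 2 for p in points)
    a = num / den  # a = zoom - 1
    return 1.0 + a, (vx - a * px, vy - a * py)

# Hypothetical data: four zones, magnification 1.1 combined with a pan of (2, 0).
points = [(-100, -50), (100, -50), (-100, 50), (100, 50)]
vectors = [(-8, -5), (12, -5), (-8, 5), (12, 5)]
zoom, pan = fit_zoom_pan(points, vectors)
```

A fit of this kind needs at least two measurement positions that do not coincide, since otherwise the denominator is zero and the magnification cannot be determined.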