A well-known technique for locating a single moving object (undergoing coherent motion) that appears in each of the successive frames of a motion picture of an imaged scene is to subtract the level value of each pixel in one of two successive image frames from that of the spatially corresponding pixel in the other frame. This subtraction removes those pixels defining stationary objects in the given scene and leaves, in the difference image data, only those pixels defining the single moving object. Further, by knowing the frame rate and the displacement of corresponding pixels of the single moving object in the difference image data, the velocity of the single moving object can be computed. In order to facilitate such processing of the image data in each of the successive frames, it is usual to first convert the image data to digital form.
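The frame-subtraction technique described above can be sketched as follows. This is only an illustrative sketch, not an implementation taken from any cited work: the function names, the tiny synthetic frames, and the use of centroids to measure displacement are all assumptions made for the example. It further assumes a stationary camera, an object brighter than the background, and an object that moves far enough between frames that its two positions do not overlap.

```python
def difference_image(frame_a, frame_b):
    """Signed pixel-by-pixel difference frame_b - frame_a (hypothetical helper)."""
    return [[b - a for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def centroid(diff, keep):
    """Centroid (x, y) of the difference pixels for which keep(value) is true."""
    points = [(x, y) for y, row in enumerate(diff)
              for x, value in enumerate(row) if keep(value)]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def estimate_velocity(frame_a, frame_b, frame_rate_hz):
    """Velocity in pixels per second of a single bright moving object.

    Assumes the object is brighter than the background and that its
    positions in the two frames do not overlap.
    """
    diff = difference_image(frame_a, frame_b)
    old_pos = centroid(diff, lambda v: v < 0)  # pixels the object left
    new_pos = centroid(diff, lambda v: v > 0)  # pixels the object entered
    if old_pos is None or new_pos is None:
        return None  # nothing moved between the two frames
    return ((new_pos[0] - old_pos[0]) * frame_rate_hz,
            (new_pos[1] - old_pos[1]) * frame_rate_hz)


def make_frame(obj_x, obj_y, width=8, height=6):
    """Synthetic frame: flat background, one stationary feature, one 2x2 object."""
    frame = [[10] * width for _ in range(height)]
    frame[0][7] = 50  # stationary feature, cancelled by the subtraction
    for dy in range(2):
        for dx in range(2):
            frame[obj_y + dy][obj_x + dx] = 200
    return frame


frame_a = make_frame(1, 1)
frame_b = make_frame(4, 2)
velocity = estimate_velocity(frame_a, frame_b, frame_rate_hz=30.0)
print(velocity)  # (90.0, 30.0): 3 pixels right and 1 pixel down per frame at 30 frames/s
```

The stationary feature and the flat background both subtract to zero, so only the pixels vacated and newly occupied by the object survive in the difference image.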
However, when the image data of the successive frames of the motion picture define two motions, the problem is more difficult. Consider an imaged scene comprising a background region which moves with a certain global velocity pattern in accordance with the movement (e.g., translation, rotation and zoom) of the motion-picture imaging camera recording the scene. In this case, a scene-region occupied by a foreground object that is locally moving with respect to the background region will move in the motion picture with a velocity which is a function of both its own velocity with respect to the background region and the global velocity pattern of the background region itself.
Assuming that a video camera is being used to continuously derive such a motion picture in real time, the problem is to employ, in real time, the image data in the series of successive frames of the motion picture to (1) remove the effects (including those due to parallax) of the global motion and (2) detect and then track the locally-moving foreground object to the exclusion of this global motion.
A conventional general image-motion analysis technique is to compute a separate displacement vector for each image pixel of each frame of a video sequence. This is a computationally challenging task, because it requires pattern matching between frames in which each pixel can move differently from every other pixel.
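To give a sense of why per-pixel displacement estimation is costly, the following sketch finds the displacement of one pixel by exhaustive pattern matching. The matching criterion (sum of absolute differences over a small patch), the function names, and the parameters `radius` and `search` are assumptions chosen for illustration; the passage above does not specify a particular matching method. The pixel at (x, y) is assumed to lie at least `radius` pixels inside the frame.

```python
def patch_cost(frame_a, frame_b, x, y, dx, dy, radius):
    """Sum of absolute differences between the patch centred on (x, y) in
    frame_a and the patch centred on (x + dx, y + dy) in frame_b."""
    cost = 0
    for j in range(-radius, radius + 1):
        for i in range(-radius, radius + 1):
            cost += abs(frame_a[y + j][x + i] - frame_b[y + dy + j][x + dx + i])
    return cost


def pixel_displacement(frame_a, frame_b, x, y, radius=1, search=2):
    """Best-matching displacement (dx, dy) for the pixel at (x, y).

    Every candidate within +/- search pixels is tried, which is why doing
    this for every pixel of every frame is expensive.
    """
    height, width = len(frame_a), len(frame_a[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose patch would fall outside frame_b.
            if not (radius <= x + dx < width - radius
                    and radius <= y + dy < height - radius):
                continue
            cost = patch_cost(frame_a, frame_b, x, y, dx, dy, radius)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return (best[1], best[2])


def frame_with_patch(cx, cy, size=8):
    """Synthetic frame: zeros plus a distinctive 3x3 pattern centred on (cx, cy)."""
    frame = [[0] * size for _ in range(size)]
    for j in range(-1, 2):
        for i in range(-1, 2):
            frame[cy + j][cx + i] = 1 + (j + 1) * 3 + (i + 1)
    return frame


first = frame_with_patch(3, 3)
second = frame_with_patch(5, 4)  # the same pattern moved by (2, 1)
print(pixel_displacement(first, second, 3, 3))  # (2, 1)
```

With these example settings each pixel requires up to 25 candidate displacements, each compared over 9 patch pixels, and the whole search must be repeated independently for every pixel of the frame.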
More recently, a so-called "majority-motion" approach has been developed for solving the aforesaid problem in real time. This "majority-motion" approach and its implementation are disclosed in detail in the article "Object Tracking with a Moving Camera-an Application of Dynamic Motion Analysis," by Burt et al., appearing in Proceedings of the Workshop on Visual Motion, Irvine, Calif., Mar. 20-22, 1989, which is published by The Computer Society of the IEEE. Further, certain improvements of this "majority-motion" approach are disclosed in detail in the article "A Practical, Real-Time Motion Analysis System for Navigation and Target Tracking," by Burt et al., Pattern Recognition for Advanced Missile Systems Conference, Huntsville, Nov. 14-15, 1988.
All the specific approaches disclosed in these two Burt et al. articles rely on segmenting the image data contained in substantially the entire area of each frame into a large number of separate contiguous small local-analysis window areas. This segmentation is desirable to the extent that it permits the motion in each local-analysis window to be assumed to have only its own computed single translational-motion velocity. The closer the size of each local-analysis window approaches that occupied by a single pixel (i.e., the greater the segmentation), the closer this assumption is to the truth. However, in practice, the size of each local-analysis window is substantially larger than that occupied by a single image pixel, so that the computed single translational-motion velocity of a local-analysis window is actually an average velocity of all the image pixels within that window. This segmentation approach is very artificial in that the periphery of a locally-moving imaged object in each successive frame is unrelated to the borders of those local-analysis windows it occupies in that frame. If the object happens to occupy the entire area of a particular window, the computed single translational-motion velocity for that window will be correct. However, if the object happens to occupy only some unresolved part of the area of a particular window, the computed single translational-motion velocity for that window will be incorrect. Nevertheless, despite their problems, the "majority-motion" approach and the other segmentation-based approaches disclosed in the aforesaid Burt et al. articles are useful in certain kinds of dynamic two-motion image analysis, such as removing the effects of the global motion so that a locally-moving foreground object can be detected and then tracked to the exclusion of this global motion.
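The effect of a partly occupied window can be illustrated numerically. The sketch below models the window estimate as a plain arithmetic mean of per-pixel velocities, following the characterization given above; the function name, the 4x4 window size, and the velocity values are hypothetical and chosen only for the example.

```python
def window_velocity(pixel_velocities):
    """Single velocity reported for a local-analysis window, modelled here
    as the arithmetic mean of the (vx, vy) velocities of its pixels."""
    n = len(pixel_velocities)
    return (sum(v[0] for v in pixel_velocities) / n,
            sum(v[1] for v in pixel_velocities) / n)


object_velocity = (8.0, 0.0)      # hypothetical foreground object velocity
background_velocity = (0.0, 0.0)  # hypothetical background velocity

# Case 1: the object fills the whole 4x4 window (16 pixels).
full_window = [object_velocity] * 16

# Case 2: the object covers only 4 of the 16 pixels in the window.
partial_window = [object_velocity] * 4 + [background_velocity] * 12

print(window_velocity(full_window))     # (8.0, 0.0): matches the object
print(window_velocity(partial_window))  # (2.0, 0.0): matches neither motion
```

In the second case the window reports a velocity that belongs to neither the object nor the background, which is the error described above for an object occupying only part of a window.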