Techniques for recognizing pattern shapes of objects graphically represented in image data are known in the art. Further, techniques for discriminating between moving and stationary objects having a preselected angular orientation, or objects having any other predetermined feature of interest, are also known in the art.
A well known technique for locating a single moving object (undergoing coherent motion), contained in each of successive frames of a motion picture of an imaged scene, is to subtract the level value of each of the spatially corresponding image data pixels in one of two successive image frames from the other to remove those pixels defining stationary objects in the given scene and leave only those pixels defining the single moving object in the given scene in the difference image data. Further, by knowing the frame rate and the displacement of corresponding pixels of the single moving object in the difference image data, the velocity of the single moving object can be computed. However, when the image data of the successive frames define two motions, for example a background region which moves with a certain global velocity pattern in accordance with the movement (e.g., translation, rotation and zoom) of the camera recording the scene, the problem is more difficult. In this case, a scene-region occupied by a foreground object that is locally moving with respect to the background region will move in the motion picture with a velocity which is a function of both its own velocity with respect to the background region and the global velocity pattern of the background region itself. The global velocity pattern due to motion of the image sensor can be very complex since it depends upon the structure of the scene.
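The single-motion case described above can be illustrated with a minimal sketch. It is not taken from any cited reference and rests on assumptions: frames are small lists of integer pixel levels, the object is brighter than a stationary background, and the object moves far enough between frames that its old and new positions do not overlap. The names `make_frame`, `signed_difference`, `weighted_centroid` and `estimate_velocity` are hypothetical.

```python
def make_frame(width, height, background, obj_x, obj_y, obj_size=2, obj_level=200):
    """Build a test frame: uniform background plus one bright square object."""
    frame = [[background] * width for _ in range(height)]
    for y in range(obj_y, obj_y + obj_size):
        for x in range(obj_x, obj_x + obj_size):
            frame[y][x] = obj_level
    return frame


def signed_difference(frame_a, frame_b):
    """Subtract frame_a from frame_b pixel by pixel; stationary pixels become 0."""
    return [[b - a for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def weighted_centroid(weights):
    """Return the weighted centroid (x, y) of a 2-D weight map, or None if empty."""
    total = sum_x = sum_y = 0.0
    for y, row in enumerate(weights):
        for x, w in enumerate(row):
            total += w
            sum_x += w * x
            sum_y += w * y
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


def estimate_velocity(frame_a, frame_b, frame_rate_hz):
    """Estimate object velocity in pixels per second from two successive frames.

    Assumption: a bright object on a darker stationary background, so negative
    differences mark where the object was and positive ones where it now is.
    """
    diff = signed_difference(frame_a, frame_b)
    departed = [[max(-d, 0) for d in row] for row in diff]
    arrived = [[max(d, 0) for d in row] for row in diff]
    old_pos = weighted_centroid(departed)
    new_pos = weighted_centroid(arrived)
    if old_pos is None or new_pos is None:
        return None  # no moving pixels survived the subtraction
    # Displacement per frame times frames per second gives pixels per second.
    return ((new_pos[0] - old_pos[0]) * frame_rate_hz,
            (new_pos[1] - old_pos[1]) * frame_rate_hz)


if __name__ == "__main__":
    first = make_frame(8, 5, background=10, obj_x=1, obj_y=1)
    second = make_frame(8, 5, background=10, obj_x=4, obj_y=2)
    print(estimate_velocity(first, second, frame_rate_hz=30.0))
```

When the background itself moves between frames, the subtraction no longer cancels the background pixels, which is the two-motion difficulty that the rest of this section addresses.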
A problem is to employ, in real time, the image data in the series of successive frames of the motion picture to (1) measure and remove the effects (including those due to parallax) of the global motion and (2) detect and then track the locally-moving foreground object to the exclusion of this global motion. A conventional general image-motion analysis technique is to compute a separate displacement vector for each image pixel of each frame of a video sequence. This is a computationally challenging task, because it requires pattern matching between frames in which every pixel can move differently from every other pixel. More recently, a so-called "majority-motion" approach has been developed for solving the aforesaid problem in real time. This "majority-motion" approach and its implementation are disclosed in detail in the article "Object Tracking with a Moving Camera-an Application of Dynamic Motion Analysis," by Burt et al., appearing in Proceedings of the Workshop on Visual Motion, Irvine, Calif., Mar. 20-22, 1989, which is published by The Computer Society of the IEEE. Further, certain improvements of this "majority-motion" approach are disclosed in detail in the article "A Practical, Real-Time Motion Analysis System for Navigation and Target Tracking," by Burt et al., Pattern Recognition for Advanced Missile Systems Conference, Huntsville, Nov. 14-15, 1988.
The specific approaches disclosed in these two Burt et al. articles rely on segmenting the image data contained in substantially the entire area of each frame into a large number of separate contiguous small local-analysis window areas. This segmentation is desirable to the extent that it permits the motion in each local-analysis window to be assumed to have only its own computed single translational-motion velocity. The closer the size of each local-analysis window approaches that occupied by a single pixel (i.e., the greater the segmentation), the closer this assumption is to the truth. However, in practice, the size of each local-analysis window is substantially larger than that occupied by a single image pixel, so that the computed single translational-motion velocity of a local-analysis window is actually an average velocity of all the image pixels within that window. This segmentation approach is artificial in that the periphery of a locally-moving imaged object in each successive frame is unrelated to the boundaries of those local-analysis windows it occupies in that frame. If the object happens to occupy the entire area of a particular window, the computed single translational-motion velocity for that window will be correct. However, if the object happens to occupy only some unresolved part of a particular window, the computed single translational-motion velocity for that window will be incorrect. Nevertheless, despite their problems, the "majority-motion" and other approaches employing segmentation disclosed in the aforesaid Burt et al. articles are useful in certain kinds of dynamic two-motion image analysis, such as removing the effects of the global motion so that a locally-moving foreground object can be detected and then tracked to the exclusion of this global motion.
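The window-averaging effect described above can be made concrete with a small sketch. It is an illustration under assumptions rather than the method of the Burt et al. articles: each pixel is assigned a known true velocity, and the window estimate is taken to be the plain arithmetic mean of those velocities. The names `make_window` and `window_velocity` are hypothetical.

```python
def make_window(n_pixels, object_fraction, v_object, v_background):
    """Per-pixel true velocities for a window partly covered by a moving object."""
    n_object = round(n_pixels * object_fraction)
    return [v_object] * n_object + [v_background] * (n_pixels - n_object)


def window_velocity(pixel_velocities):
    """Single velocity reported for a window: the mean over all of its pixels."""
    n = len(pixel_velocities)
    mean_vx = sum(v[0] for v in pixel_velocities) / n
    mean_vy = sum(v[1] for v in pixel_velocities) / n
    return (mean_vx, mean_vy)


if __name__ == "__main__":
    v_object = (4.0, 0.0)      # assumed object velocity, pixels per frame
    v_background = (0.0, 0.0)  # assumed stationary background
    for fraction in (1.0, 0.25):
        window = make_window(64, fraction, v_object, v_background)
        print(fraction, window_velocity(window))
```

Under these assumptions, a fully covered window reports the object velocity, whereas a window that is only one-quarter covered reports a value that matches neither the object nor the background.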
For many problems in computer vision, it is important to determine the motion of an image sensor using two or more images recorded from different viewpoints or recorded at different times. The motion of an image sensor moving through an environment provides useful information for tasks like moving-obstacle detection and navigation. For moving-obstacle detection, local inconsistencies in the image sensor motion model can pinpoint some potential obstacles. For navigation, the image sensor motion can be used to estimate the surface orientation of an approaching object like a road or a wall.
Prior art techniques have recovered image sensor motion and scene structure by fitting models of the image sensor motion and scene depth to a predetermined flow-field between two images of a scene. There are many techniques for computing a flow-field, and each technique aims to recover corresponding points in the images. The problem of flow-field recovery is not fully constrained, so that the computed flow-fields are not accurate. As a result, the subsequent estimates of image sensor motion and three-dimensional structure are also inaccurate.
One approach to recovering image sensor motion is to fit a global image sensor motion model to a flow field computed from an image pair. An image sensor motion recovery scheme that used both image flow information and local image gradient information has been proposed. The contribution of each flow vector to the image sensor motion model was weighted by the local image gradient to reduce errors in the recovered image sensor motion estimate that can arise from local ambiguities in image flow caused by the aperture problem.
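The idea of weighting each flow vector by the local image gradient can be sketched as follows. This is a deliberately simplified illustration and not the proposed scheme itself: the global motion model is assumed to be a pure image translation, and each weight is assumed to be a non-negative gradient magnitude supplied by the caller. Under those assumptions, the weighted least-squares fit reduces to a weighted mean of the flow vectors. The name `fit_global_translation` is hypothetical.

```python
def fit_global_translation(flow_vectors, gradient_weights):
    """Weighted least-squares fit of a single translation (tx, ty) to a flow field.

    Minimizes sum_i w_i * ||flow_i - t||^2, whose closed-form solution is the
    weighted mean of the flow vectors. Assumption: the weights are local image
    gradient magnitudes, so low-texture pixels contribute little to the fit.
    """
    if len(flow_vectors) != len(gradient_weights):
        raise ValueError("one weight is required per flow vector")
    total_weight = sum(gradient_weights)
    if total_weight <= 0:
        raise ValueError("at least one flow vector needs a positive weight")
    tx = sum(w * f[0] for f, w in zip(flow_vectors, gradient_weights)) / total_weight
    ty = sum(w * f[1] for f, w in zip(flow_vectors, gradient_weights)) / total_weight
    return (tx, ty)


if __name__ == "__main__":
    # Two well-textured pixels agree on the motion; one flat pixel has bad flow.
    flows = [(1.0, 0.0), (1.0, 0.0), (5.0, 3.0)]
    weights = [2.0, 2.0, 0.0]
    print(fit_global_translation(flows, weights))
```

A realistic image sensor motion model would also include rotation and scene depth; the translation-only model is used here only to keep the weighting step visible.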
There is, however, a need in the art for a method and apparatus to accurately determine the motion of an image sensor when the motion in the scene, relative to the image sensor, is non-uniform. There is also a need in the art for a method and apparatus to accurately determine the structure of the scene from images provided by the image sensor. A system possessing these two capabilities can then automatically navigate through an environment containing obstacles.