This invention relates to motion estimation within a sequence of data frames, and more particularly to an improved optical flow method of deriving motion estimation.
Extracting motion information from a sequence of data frames is desirable for many applications in the fields of computer vision and data compression. In the context of computer vision, the data frames are frames of image data. Motion estimation is beneficial for many computer vision applications, including but not limited to (i) the estimation of three-dimensional scene properties, (ii) visual sensor motion estimation, (iii) motion segmentation, (iv) computing focus of expansion and time to collision of objects, (v) performing motion compensated image encoding, (vi) computing stereo disparity, (vii) measuring blood flow and heart wall motion in medical imagery, and (viii) even the measurement of minute amounts of growth in seedlings.
Typically, motion analysis involves a first stage in which optical flow is measured in a sequence of two-dimensional image frames. A subsequent second stage derives the actual motion of image objects in three-dimensional space, or infers some other higher-level computer vision result, from the computed optical flow. Optical flow is a measure of the apparent motion of a brightness pattern. More specifically, optical flow is a distribution function of the apparent velocities of movement of brightness patterns within a sequence of images. Each image frame is a two-dimensional array of pixels, typically representing a projection of a three-dimensional scene. The image may include objects or components which move at differing velocities and in differing three-dimensional directions. The measures of optical flow for the differing portions of the image approximate the projection of three-dimensional surface-point velocities onto the two-dimensional viewable image plane of a display.
In "Determining Optical Flow," by B. K. P. Horn and B. G. Schunck, Artificial Intelligence, Vol. 17, pp. 185-204, 1981, a method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. Their computation is based on the observation that the flow velocity has two components and that the basic equation for the rate of change of image brightness provides only one constraint. Smoothness of the flow was introduced as a second constraint to solve for optical flow. Such a smoothness constraint presumes there are no spatial discontinuities. As a result, Horn and Schunck excluded situations where objects occlude one another, because discontinuities in reflectance are found at the object boundaries of an occlusion.
The Horn and Schunck method is described in more detail below. Consider the image brightness at pixel (x,y) in the image plane at time t to be represented as the function I(x,y,t). Based on the initial assumption that the intensity structures of local time-varying image regions are approximately constant under motion for at least a short duration, the brightness of a particular point is constant, so that dI/dt = 0. Based on the chain rule of differentiation, an optical flow constraint equation (I) can be represented as follows:
Ix(x,y,t)·u + Iy(x,y,t)·v + It(x,y,t) = 0    (I)
where
Ix=∂I(x,y,t)/∂x=horizontal spatial gradient of the image intensity;
Iy=∂I(x,y,t)/∂y=vertical spatial gradient of the image intensity;
It=∂I(x,y,t)/∂t=temporal image gradient of the image intensity;
u=dx/dt=horizontal image velocity (or displacement); and
v=dy/dt=vertical image velocity (or displacement).
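As a concrete illustration (not taken from the patent text), the gradients Ix, Iy, and It defined above can be estimated from a pair of frames by finite differences. The function name and the particular difference stencil below are illustrative choices; other discretizations are equally valid:

```python
import numpy as np

def image_gradients(frame0, frame1):
    """Estimate Ix, Iy, It from two consecutive frames using simple
    finite differences averaged over the pair.  This is one common
    discretization, not necessarily the one used by the method."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    # horizontal spatial gradient Ix = dI/dx (axis 1 = columns = x)
    Ix = 0.5 * (np.gradient(f0, axis=1) + np.gradient(f1, axis=1))
    # vertical spatial gradient Iy = dI/dy (axis 0 = rows = y)
    Iy = 0.5 * (np.gradient(f0, axis=0) + np.gradient(f1, axis=0))
    # temporal gradient It = dI/dt, approximated by the frame difference
    It = f1 - f0
    return Ix, Iy, It
```

For a horizontal brightness ramp that shifts one pixel to the right between frames, this yields Ix = 1 and It = −1, so the constraint equation (I) is satisfied by u = 1, v arbitrary along the ramp.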
The optical flow equation (I) is a linear equation having two unknowns (i.e., u and v). The component of motion in the direction of the brightness gradient is known to be −It/(Ix² + Iy²)^(1/2). However, one cannot determine the component of movement in the direction of the iso-brightness contours, at right angles to the brightness gradient. As a consequence, the optical flow velocity (u,v) cannot be computed locally without introducing additional constraints. Horn and Schunck introduce the smoothness constraint. They argue that if every point of the brightness pattern can move independently, then there is little hope of recovering the velocities. However, if opaque objects of finite size are undergoing rigid motion or deformation, neighboring points on the objects should have similar velocities. Correspondingly, the velocity field of the brightness patterns in the image will vary smoothly almost everywhere. They admit, however, that such a smoothness constraint is likely to result in difficulties in deriving optical flow along occluding edges.
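The locally recoverable component, often called the normal flow, can be computed directly from the gradients. The sketch below is an illustrative helper (the function name and the epsilon guard against zero gradients are assumptions, not part of the patent):

```python
import numpy as np

def normal_flow(Ix, Iy, It, eps=1e-8):
    """Magnitude of the motion component along the brightness
    gradient, -It / sqrt(Ix^2 + Iy^2) -- the only component that the
    constraint equation (I) determines locally (aperture problem).
    eps guards against division by zero in flat regions."""
    mag = np.sqrt(Ix ** 2 + Iy ** 2)
    return -It / np.maximum(mag, eps)
```

The component tangent to the iso-brightness contours remains undetermined, which is exactly why the smoothness constraint is needed.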
Given such a smoothness constraint, the optical flow equation is solved by minimizing the sum of errors for the rate of change of image brightness. The total error to be minimized is:

min_(u,v) ∫∫_D [ (Ix·u + Iy·v + It)² + α²·(ux² + uy² + vx² + vy²) ] dx dy    (II)
where D represents the image plane; ux, uy, vx, and vy are the velocity spatial gradients; and α is a parameter controlling the strength of the smoothness constraint. The parameter α typically is selected heuristically, where a larger value increases the influence of the smoothness constraint.
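The minimization (II) is classically solved by an iterative scheme in which u and v are repeatedly corrected from their local averages. The sketch below, assuming a simple 4-neighbour average with edge replication (one standard discretization; not the patent's claimed contribution), shows the structure of such an iteration:

```python
import numpy as np

def neighbor_avg(f):
    """Average of the four axial neighbours, with edge replication."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck(Ix, Iy, It, alpha=1.0, iters=100):
    """Iterative minimization of (II).  At each step the flow is set
    to its local average minus a correction along the brightness
    gradient, scaled by alpha^2 + Ix^2 + Iy^2."""
    u = np.zeros_like(Ix, dtype=float)
    v = np.zeros_like(Ix, dtype=float)
    for _ in range(iters):
        ub, vb = neighbor_avg(u), neighbor_avg(v)
        num = Ix * ub + Iy * vb + It       # residual of constraint (I)
        den = alpha ** 2 + Ix ** 2 + Iy ** 2
        u = ub - Ix * num / den
        v = vb - Iy * num / den
    return u, v
```

For a uniform field with Ix = 1, Iy = 0, It = −1, the iteration converges to the expected translation u = 1, v = 0.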
The difficulty in handling incidents of occlusion arises because image surfaces may appear or disappear over time, complicating and misleading tracking processes and causing numerical artifacts. Accordingly, there is a need for a method of estimating optical flow which is reliable even in the vicinity of occlusions.
According to the invention, the optical flow of an array of pixels in an image field is determined using adaptive temporal gradients, and in some embodiments adaptive spatial gradients, so as to avoid artifacts at occlusions. In particular, artifacts are avoided for occlusions at image objects which are moving smoothly relative to the image field background (e.g., generally constant velocity over the time period from image frame k−1 to frame k+1).
According to one aspect of the invention, data from three image frames are used to determine optical flow. A parameter, S, is defined and determined frame by frame, and is used to decide whether to consider the data looking forward from frame k to frame k+1 or looking backward from frame k−1 to frame k when initializing the spatial and/or temporal gradients for frame k. In particular, the parameter S signifies the areas of occlusion, so that the gradients looking backward from frame k−1 to frame k can be used for such pixel regions. The forward-looking gradients are used in the other areas.
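The per-pixel forward/backward selection can be sketched as follows. Note that the patent does not define S in this passage, so the indicator and threshold below are purely hypothetical placeholders illustrating the selection mechanism, not the claimed computation:

```python
import numpy as np

def adaptive_temporal_gradient(prev, cur, nxt, thresh=10.0):
    """Sketch of the adaptive idea using frames k-1, k, k+1.
    Per pixel, use the forward difference (k -> k+1) except where an
    occlusion indicator S is large, where the backward difference
    (k-1 -> k) is used instead.  S here is a hypothetical stand-in
    (absolute forward difference); the patent's S differs."""
    fwd = nxt.astype(float) - cur.astype(float)   # forward-looking It
    bwd = cur.astype(float) - prev.astype(float)  # backward-looking It
    S = np.abs(fwd)  # hypothetical occlusion indicator
    It = np.where(S > thresh, bwd, fwd)
    return It, S
```

Where a pixel suddenly disappears or appears (large S), the backward gradient avoids the corrupted forward difference.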
According to another aspect of the invention, the temporal gradients are determined by convolving the image data with a symmetric matrix, so as to avoid one-half time-interval shifts in data between the backward-looking and forward-looking temporal gradients.
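To illustrate the half-interval issue (the exact symmetric kernel coefficients are not reproduced in this passage, so the sketch below uses the simplest symmetric case, a central difference): a one-sided difference I(k+1) − I(k) estimates the gradient at time k+½, whereas a symmetric stencil centres the estimate on frame k.

```python
import numpy as np

def central_temporal_gradient(prev, cur, nxt):
    """Central (symmetric) temporal difference centred on frame k,
    avoiding the half-interval offset of one-sided differences.
    Minimal illustrative stencil; the patent convolves a symmetric
    matrix whose coefficients are not given here."""
    return 0.5 * (nxt.astype(float) - prev.astype(float))
```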
According to another aspect of the invention, in some embodiments the parameter S is convolved with a smoothing function to define a more generalized parameter (e.g., Sm).
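As a sketch of that smoothing step, assuming a simple box kernel as the smoothing function (an illustrative choice; the patent does not specify the kernel here):

```python
import numpy as np

def smooth_indicator(S, k=3):
    """Convolve the occlusion indicator S with a k-by-k box kernel
    (edge-replicated) to obtain a smoothed, more generalized Sm.
    The box kernel is an assumed stand-in for the patent's
    unspecified smoothing function."""
    p = k // 2
    Sp = np.pad(S.astype(float), p, mode="edge")
    out = np.zeros_like(S, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += Sp[dy:dy + S.shape[0], dx:dx + S.shape[1]]
    return out / (k * k)
```

Smoothing spreads the occlusion indication over a small neighbourhood, so the backward/forward gradient choice does not flip pixel by pixel.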
According to another aspect of the invention, an embodiment of the motion estimation method is implemented in a system for image object tracking and segmentation. In a described embodiment the system includes (i) a modified adaptive resonance theory-2 (M-ART2) model for detecting changes of scenes, (ii) a two-dimensional correlative autopredictive search (2D CAPS) method for object tracking, (iii) an edge energy derivation method, and (iv) an active contour model with global relaxation for defining optimal image object boundaries. The motion estimation method allows edge energy to be estimated based not just on the color components, but also on the motion vectors (Vx, Vy). The motion estimation derived for a previous frame of image data provides guidance for the CAPS object tracking analysis in a current frame. For example, the motion estimation is used in one embodiment to reduce the search area when looking for a template match during CAPS processing. Also, the motion estimation during an initial frame simplifies user interaction. For example, when a manual process is used to identify an initial object to be tracked, rather than having a user identify edge points, the user can simply click on any one or more moving points on the object to be tracked.
According to one advantage of this invention, optical flow is determined without substantial artifacts, even in the presence of occlusions, where optical flow changes smoothly. According to another advantage of this invention, optical flow motion estimation within the image object tracking and segmentation system improves, for example, MPEG-4 video encoding and content based video editing. These and other aspects and advantages of the invention will be better understood by reference to the following detailed description taken in conjunction with the accompanying drawings.