In motion estimation, also referred to as optical flow estimation and displacement estimation, the correspondences between areas in different video frames, also referred to as images, in a video sequence may be determined. The motion of objects in the actual scene captured in the video sequence, in addition to camera motion, may result in moving visual patterns in the video frames. A goal of true motion estimation may be to estimate the two-dimensional (2D) motion of a visual pattern from one frame to another such that the estimated 2D motion may be the projection of the actual three-dimensional (3D) scene motion. The estimated motion field may be used in applications in many areas, for example, video processing, video coding, computer vision and other video and imaging areas. Exemplary applications may include motion-compensated video coding, motion-compensated video filtering and motion-compensated frame interpolation.
Gradient-based motion estimation may be one important class of motion estimation methods. Another important class of motion estimation methods may be block matching. In gradient-based motion estimation, local motion may be modeled as substantially constant in a neighborhood proximate to a pixel location where a motion vector may be estimated. The neighborhood may be referred to as a local analysis window, analysis window or window. Spatial and temporal derivative values, also referred to as spatio-temporal gradients, of the pixel data in the window may be determined and used to compute a motion vector, a displacement vector or other parameters corresponding to the associated motion.
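The gradient-based approach described above may be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes brightness constancy, central-difference spatial gradients, a frame-difference temporal gradient and an unweighted least-squares fit over a square window (in the style of the well-known Lucas-Kanade method, which the text does not name). The function name `estimate_motion` and its parameters are hypothetical.

```python
def estimate_motion(prev, curr, cx, cy, radius):
    """Estimate one motion vector (u, v) at pixel (cx, cy).

    prev, curr: frames given as lists of rows of numeric pixel values.
    radius: half-width of the square analysis window (hypothetical parameter).
    Returns None when the gradients in the window do not constrain the motion.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Spatial gradients by central differences in the previous frame.
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            # Temporal gradient by a simple frame difference.
            it = curr[y][x] - prev[y][x]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Normal equations: [sxx sxy; sxy syy] [u v]^T = -[sxt syt]^T.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None  # e.g. a flat or purely one-directional pattern
    u = (-sxt * syy + syt * sxy) / det
    v = (-syt * sxx + sxt * sxy) / det
    return u, v


# Synthetic test pattern shifted one pixel to the right between frames.
prev_frame = [[x * x + 2 * y * y for x in range(20)] for y in range(20)]
curr_frame = [[(x - 1) ** 2 + 2 * y * y for x in range(20)] for y in range(20)]
print(estimate_motion(prev_frame, curr_frame, 10, 10, 2))
```

For this synthetic pattern the estimate is close to, but not exactly, a one-pixel horizontal displacement, because the gradient constraint is only a first-order approximation of the true shift.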
The potential presence of multiple objects within an analysis window may generate problems with a gradient-based motion estimation approach, wherein local motion may be modeled to be substantially constant in a neighborhood, because each of the multiple objects may be associated with a different motion within the captured scene. The presence of multiple motions within the analysis window may lead to inaccurate estimates of the motion vector or other motion parameters.
Additionally, the data within an analysis window may comprise one or more noise components due to, for example, camera noise, compression noise or other noise. The noisy data within an analysis window may lead to inaccurate estimates of the motion vector or other motion parameters. This problem may be especially apparent when the analysis window is not sufficiently large to ensure accurate motion estimation.
Typically, the size and shape of a local analysis window is held constant. In a few techniques, the window size may be varied in an adaptive manner. However, in these techniques, motion estimation is performed for all candidate window sizes with a resulting motion vector, or other motion parameters, being selected, according to some criterion, from the results associated with the candidate windows. Thus, for these techniques, there may be a considerable increase in the required processing time or resources.
Samples, also referred to as pixels, within a local analysis window may typically be weighted equally or weighted based on their distance from the center sample in the window. Weighting according to these methods may be referred to as weighting by fixed window functions, and may be considered non-data-adaptive weighting.
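The two fixed window functions mentioned above may be illustrated with a short sketch. The choice of a Gaussian as the distance-based weighting, and the names `uniform_weights`, `gaussian_weights`, `radius` and `sigma`, are assumptions made for illustration only; the text does not specify a particular distance-based function. In both cases the weights depend only on position within the window and not on the pixel data, which is why such weighting may be considered non-data-adaptive.

```python
import math


def uniform_weights(radius):
    """Equal weight for every sample in a (2*radius+1) x (2*radius+1) window."""
    size = 2 * radius + 1
    w = 1.0 / (size * size)
    return [[w] * size for _ in range(size)]


def gaussian_weights(radius, sigma):
    """Weights that decay with distance from the center sample (assumed Gaussian)."""
    raw = [
        [math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
         for dx in range(-radius, radius + 1)]
        for dy in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in raw)
    # Normalize so that the weights sum to one.
    return [[value / total for value in row] for row in raw]
```

Either weight table could multiply the per-sample gradient products before they are accumulated in a least-squares motion estimate.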
Systems and methods for motion estimation that provide solutions to the above-mentioned problems associated with gradient-based motion estimation, in particular, solutions incorporating adaptive window size, solutions robust to noise and solutions that account for the presence of multiple objects and multiple motions, may be desirable for many important video processing applications.