Image processing applications may estimate motion associated with various features in the image frames of a moving image sequence. The applications may estimate motion associated with blocks, segments, or other regions which may contain features. The estimated motion may describe a spatial transformation of a feature or region from one frame to another, such as a translation, rotation, warp, or other spatial transformation. The estimated motion may also describe a non-spatial transformation of a feature or region, such as an intensity or color change, blurring, or other non-spatial transformation. As used herein, the terms ‘motion estimates’ and ‘motion estimation’ may refer to such estimates of motion. As used herein, the terms ‘features’, ‘blocks’, ‘segments’ and ‘regions’ may refer to that characteristic of the moving images with which the motion estimate is associated. As used herein, the term ‘region’ may refer to a block, segment, or other distinct area of an image frame, which may contain one or more features of an image.
Motion estimates are used in applications that relate to video, computer imaging, medical imaging and other, more specialized image processing applications. Motion estimates are used with two-dimensional (2D) imaging applications, as well as with three-dimensional (3D) applications. Herein, the terms 2D and 3D refer to spatial dimensions.
Applications may include or involve video compression, which relates to reducing the amount of data with which visual information is stored and conveyed (e.g., encoded, transmitted, received and decoded). Motion estimates are used with video compression applications to achieve significant reduction in the data needed to represent image frames in moving image sequences. A video compression application may attempt to map, from one frame to another, translational or other motion of image regions. For instance, given a reference frame A and a motion map that describes image motion from frame A to a subsequent frame B, a motion-predicted frame Bm can be formed by projecting the motion map from frame A. A difference frame Bd can be formed by subtracting the motion-predicted frame Bm from frame B. Compression is achieved when the amount of data needed to encode both the motion map and the difference frame Bd is less than the amount needed for encoding frame B directly. Thus, an application may seek a motion map that yields a motion-predicted frame Bm that differs relatively little from frame B. For compression-related purposes, the accuracy with which the motion map represents the actual motion of image features in the moving image sequence may not be a primary consideration. In other words, from the perspective of achieving compression, it may suffice that a given motion map simply reduces the total amount of data needed to encode the motion map and the difference frame Bd.
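The relationship between frames A, B, Bm and Bd described above may be illustrated with a minimal Python sketch. The sketch assumes purely translational, block-based motion, frames stored as lists of rows of integer intensities, and frame dimensions that are multiples of the block size. The function names, the layout of the motion map (keyed by the top-left corner of each block in frame B, with a (dy, dx) displacement from A to B), and the clamping at frame borders are illustrative assumptions, not details taken from any particular codec.

```python
def predict_frame(frame_a, motion_map, block):
    """Form the motion-predicted frame Bm by projecting blocks of frame A.

    motion_map (assumed layout): {(top, left): (dy, dx)}, where (top, left)
    is a block corner in the predicted frame and (dy, dx) is the displacement
    of that block's content from frame A to frame B.
    """
    height, width = len(frame_a), len(frame_a[0])
    predicted = [[0] * width for _ in range(height)]
    for (top, left), (dy, dx) in motion_map.items():
        for y in range(top, top + block):
            for x in range(left, left + block):
                # Source pixel in A, clamped to the frame (an assumption).
                src_y = min(max(y - dy, 0), height - 1)
                src_x = min(max(x - dx, 0), width - 1)
                predicted[y][x] = frame_a[src_y][src_x]
    return predicted


def difference_frame(frame_b, predicted):
    """Form Bd by subtracting the motion-predicted frame Bm from frame B."""
    return [[b - p for b, p in zip(row_b, row_p)]
            for row_b, row_p in zip(frame_b, predicted)]


def reconstruct(predicted, diff):
    """Recover frame B from Bm and Bd, as a decoder might."""
    return [[p + d for p, d in zip(row_p, row_d)]
            for row_p, row_d in zip(predicted, diff)]


# Frame A holds a bright 2x2 feature at the top-left corner.
frame_a = [[9, 9, 0, 0],
           [9, 9, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
# In frame B the feature has moved two pixels right; one pixel also changed.
frame_b = [[0, 0, 10, 9],
           [0, 0, 9, 9],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
motion_map = {(0, 0): (0, -2), (0, 2): (0, 2), (2, 0): (0, 0), (2, 2): (0, 0)}

frame_bm = predict_frame(frame_a, motion_map, block=2)
frame_bd = difference_frame(frame_b, frame_bm)
```

In this toy case the difference frame Bd has a single non-zero sample, so the motion map plus Bd may need less data than frame B itself. The vector chosen for the top-left block simply points at empty background, which illustrates that a motion map useful for compression need not describe the actual motion.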
Applications that use motion estimation may align an image feature which appears in one or more frames of a moving image sequence, to a reference. A motion estimate may describe the motion of a region containing a feature as it moves over a set of frames temporally proximate to a reference frame. The motion estimate may describe the transformation of the region from the reference frame to each of the other frames in the set. To align the feature from each frame, an inverse transformation may be applied to the related region of each other frame to reverse or undo the motion described by the motion estimate. The resulting set of image feature-aligned regions may then be blended or combined with each other, according to a formula. As used herein, the term ‘align’ may relate to inverting or undoing motion described by a motion estimate for a region containing a feature, persisting in one or more frames, to align the feature in each frame with the same feature in a reference frame. For applications that align image features, the accuracy with which the motion estimate represents the actual motion of image features in the moving image sequence is a primary consideration. Accurate motion estimation can be significant for accurately aligning features between various frames of a moving image sequence.
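The alignment and blending described above may be sketched as follows for the simple case of translational motion. The sketch assumes that each motion vector is an (dx, dy) displacement of the region from the reference frame to another frame, that undoing the motion amounts to reading the region at its displaced position, and that the blending formula is a plain average. The function names and the averaging formula are illustrative assumptions; an application may use a different inverse transformation and a different blending formula.

```python
def extract_region(frame, top, left, size):
    """Copy a size x size region whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def align_and_blend(reference, frames, vectors, top, left, size):
    """Align a region across frames and blend the aligned copies.

    vectors[i] is the assumed (dx, dy) displacement of the region from the
    reference frame to frames[i]. Reading each frame at the displaced
    position undoes that motion, so every copy lines up with the reference.
    """
    aligned = [extract_region(reference, top, left, size)]
    for frame, (dx, dy) in zip(frames, vectors):
        aligned.append(extract_region(frame, top + dy, left + dx, size))
    count = len(aligned)
    # Blend by simple averaging (an assumption; other formulas are possible).
    return [[sum(region[y][x] for region in aligned) / count
             for x in range(size)]
            for y in range(size)]


def place(patch, top, left, height=5, width=5):
    """Hypothetical helper: build a blank frame containing one patch."""
    frame = [[0] * width for _ in range(height)]
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            frame[top + y][left + x] = value
    return frame


# The same feature seen in three frames, with small intensity variations.
reference = place([[4, 8], [2, 6]], top=1, left=1)
frame_1 = place([[6, 8], [2, 6]], top=1, left=2)   # moved by dx=1, dy=0
frame_2 = place([[2, 8], [2, 6]], top=2, left=1)   # moved by dx=0, dy=1

blended = align_and_blend(reference, [frame_1, frame_2],
                          vectors=[(1, 0), (0, 1)], top=1, left=1, size=2)
```

Because the three copies are aligned before averaging, the variations in the top-left sample average out while the feature stays sharp. With inaccurate vectors, the average would instead mix feature and background samples, which is why accuracy matters for this class of applications.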
The importance of motion estimation accuracy becomes apparent with applications for which precise inter-frame alignment of image features is significant. These applications may include super-resolution, frame-rate conversion, motion-compensated de-interlacing and motion-compensated noise reduction. For super-resolution applications, relatively precise alignment, and thus accurate motion estimation, may become particularly significant. The significance of accurate motion estimation is not limited to such applications, however. It should be appreciated that accurate motion estimation may be significant with virtually any 2D or 3D video, medical imaging and computer imaging application.
In an example motion estimation approach, a motion estimate describing frame-to-frame translational motion of image regions may be found for an image sequence. To find the motion estimate for the sequence, a motion estimate is found for each region of each frame of the sequence. For a particular region, the frame containing the region is the reference frame and the motion estimate describes motion of the region over a set of frames temporally proximate to the reference frame. A region C may be selected in the reference frame and a likely position for a related region C′, having the same image features, may be sought in another frame of the frame set. In seeking the C′ region, region C may be translated over some range to many different positions and compared to regions at those positions in the other frame. The region found to be most like the translated region C is taken to be region C′. As this process is repeated for each other frame of the set, the motion estimate for the region is found. For a 2D image, a translation may be represented by two components, e.g., an “x” component and a “y” component. The two components taken together may be referred to as a motion vector. One motion vector may be found for each other frame of the set. By ordering the motion vectors for each frame of the set according to each frame's temporal position within the set, two discrete functions of the variable “t” (time) can be formed, e.g., x(t) and y(t). The two functions correspond to the two components of motion. The two functions, taken together, are the motion estimate for the region and they detail the translational motion of the region over the set of frames. It should be understood that this example describes only translational motion, represented by two components.
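The search for region C′ and the formation of x(t) and y(t) may be illustrated with a minimal Python sketch. The sketch assumes an exhaustive search over a square range of integer translations and uses the sum of absolute differences (SAD) as the measure of how alike two regions are. The text above does not specify a similarity measure, so SAD, the function names, and the parameters are illustrative assumptions.

```python
def sad(reference, other, top, left, size, dy, dx):
    """Sum of absolute differences between region C and a translated position."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(reference[top + y][left + x]
                         - other[top + dy + y][left + dx + x])
    return total


def find_motion_vector(reference, other, top, left, size, search):
    """Translate region C over +/- search pixels and return the best (dx, dy)."""
    height, width = len(other), len(other[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip translations that would move the region outside the frame.
            if not (0 <= top + dy and top + dy + size <= height
                    and 0 <= left + dx and left + dx + size <= width):
                continue
            cost = sad(reference, other, top, left, size, dy, dx)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def motion_estimate(reference, frames, top, left, size, search):
    """Return x(t) and y(t): one motion vector per frame, in temporal order."""
    vectors = [find_motion_vector(reference, frame, top, left, size, search)
               for frame in frames]
    x_t = [dx for dx, _ in vectors]
    y_t = [dy for _, dy in vectors]
    return x_t, y_t


def make_frame(height, width, top, left, patch):
    """Hypothetical helper: build a blank frame containing one patch."""
    frame = [[0] * width for _ in range(height)]
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            frame[top + y][left + x] = value
    return frame


patch = [[5, 7], [3, 9]]
reference = make_frame(6, 6, 2, 2, patch)          # region C at (2, 2)
others = [make_frame(6, 6, 2, 3, patch),           # t = 1: moved right by 1
          make_frame(6, 6, 3, 4, patch)]           # t = 2: right by 2, down by 1

x_t, y_t = motion_estimate(reference, others, top=2, left=2, size=2, search=2)
print(x_t, y_t)  # prints [1, 2] [0, 1]
```

An exhaustive search is the simplest way to realize the comparison "over some range" described above; practical systems may restrict or reorder the candidate positions, and ties in this sketch are resolved in favor of the first candidate visited.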
For an example of more complicated motion, describing, e.g., translation, rotation, warp, and intensity change, regions are transformed from one frame to another and more than two components are needed to represent the motion. Again using regions C and C′ to represent related regions in two frames, region C of the reference frame may be transformed over some range of all of its components and compared to regions in the other frame. Again, the region found to be most like transformed region C is region C′ and the process is repeated for each frame of the set. The transformation described may be represented with seven components. The components taken together may be referred to as a motion vector. By ordering the motion vectors according to each frame's temporal position within the set, seven discrete functions of the variable “t” (time) can be formed. The seven functions, taken together, are the motion estimate for the region and they detail the complicated motion of the region over the set of frames.
More generally, motion estimation refers to a description (e.g., quantitative) of how motion vectors and/or other motion estimates map a region of a reference frame to one or more other frames. The regions may be distinct spatially. The motion estimate relates a region of the reference frame to regions of other frames, e.g., within a temporally proximate window about the reference frame. For each region in the reference frame, a search for similar regions may be performed on one or more other frames, which may be located within some displacement (e.g., a certain number of frames or a temporal distance) from the reference frame. As used herein, the term ‘motion estimate’ may refer to motion estimates composed of any number of component functions, with any particular component describing a spatial or non-spatial attribute of the motion. As used herein, the term ‘motion estimate’ may refer to components that are sampled over time, e.g., over a set of frames defining a temporal window; the samples or frames may or may not be equally spaced in time. The components may be sampled over another variable, provided that the motion estimate is described over that variable, e.g., sampled over distance or temperature.