It is desirable to be able to estimate the motion or displacement of an image segment from one video frame to a target video frame. Here the term ‘segment’ is used throughout to denote an object, a block, or a portion of an object. Such motion estimation enables substantial inter-frame compression by reducing temporal redundancy in the video data. Motion estimation is also often referred to as motion matching, in that a given segment is ‘matched’ to a particular location in the target video frame.
Motion matching typically involves identifying an object in the scene captured in one digital image, identifying that same object in another image, and noting the change in position from one image to the other. For example, where the video sequence being processed is a soccer match, the process might be used to detect the motion of an object such as the soccer ball. It should be noted that the matching processes described herein are not limited to actual objects in the scene, but may also apply to pre-determined blocks or tessellations of the image, or even to portions of objects. For example, a video sequence of a beach ball having several solid-colored portions of differing colors might be processed with each differently colored portion treated as a separate object.
While it need not be the case, matching is often an attempt to “track” an image segment as it moves within the frame of a video sequence. Digital signal pattern matching of this kind can be used in various applications, such as video compression, medical imaging and object tracking. For example, a digital image processor can determine how a segment moved from one image frame of a video sequence to the next by noting the position of the segment in a first image frame, extracting that segment and matching it against a second image frame, noting the position of the corresponding (matched) segment found in the second image frame, and using the difference between the two positions as an indication of motion. Often, the motion between two frames of an N-dimensional sequence is described as an N-dimensional vector. Thus, where the video sequence is a sequence of two-dimensional images, the motion of a segment S can be expressed by the two-dimensional vector u=(Δx, Δy), where Δx is the relative displacement of the segment in the horizontal direction and Δy is the relative displacement of the segment in the vertical direction. Typically, the displacements are measured in pixels.
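The extract, match, and subtract-positions procedure described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as lists of rows, a rectangular segment, an exhaustive search over a small window, and the sum of absolute differences (SAD) as the match measure. None of these choices come from the text above; the names `make_frame`, `sad`, and `match_block` are hypothetical. Δy is taken as positive downward, following the usual row-index convention for images.

```python
def make_frame(rows, cols, top, left, patch):
    """Build a black frame with `patch` pasted at row `top`, column `left`."""
    frame = [[0] * cols for _ in range(rows)]
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame


def sad(frame_a, frame_b, top, left, height, width, dx, dy):
    """Sum of absolute differences between the segment in frame_a and the
    same-sized region of frame_b displaced by (dx, dy)."""
    total = 0
    for r in range(height):
        for c in range(width):
            a = frame_a[top + r][left + c]
            b = frame_b[top + dy + r][left + dx + c]
            total += abs(a - b)
    return total


def match_block(frame_a, frame_b, top, left, height, width, search=2):
    """Try every displacement within +/- `search` pixels and return
    ((dx, dy), cost) for the lowest-cost candidate."""
    rows, cols = len(frame_b), len(frame_b[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that push the segment outside the target frame.
            if not (0 <= top + dy <= rows - height):
                continue
            if not (0 <= left + dx <= cols - width):
                continue
            cost = sad(frame_a, frame_b, top, left, height, width, dx, dy)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# A 2x2 segment sits at row 1, column 1 in the first frame and at
# row 2, column 3 in the second frame.
patch = [[9, 8], [7, 6]]
frame_a = make_frame(6, 6, 1, 1, patch)
frame_b = make_frame(6, 6, 2, 3, patch)
result = match_block(frame_a, frame_b, top=1, left=1, height=2, width=2)
print(result)  # ((2, 1), 0)
```

Here the recovered vector is u=(2, 1): the segment moved two pixels to the right and one pixel down, and the zero cost indicates an exact match in this synthetic case.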
Motion matching is error-prone and difficult to perform efficiently. One problem that often arises in motion matching routines is the occurrence of false matches. False matches may have a variety of causes, including changes in lighting, sharpness, or even the shape of the local object between frames. Another problem is ambiguous matching, which may occur when multiple displacement vectors result in the same or a similar match between the displaced segment (or object or block) and the underlying pixels, even though only one of those candidates is the desired or ‘physical’ displacement from the viewpoint of a standard observer. Furthermore, some motion matching techniques may work well for image frames with specific characteristics but not for image frames with different characteristics.
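The ambiguity problem can be made concrete with a small sketch. It assumes the same illustrative setup as a simple block search (grayscale frames as lists of rows, a rectangular segment, sum of absolute differences as the cost) and a deliberately periodic texture. The names `candidate_costs` and `near_best` are hypothetical and serve only to show how several displacement vectors can tie for the best match.

```python
def candidate_costs(frame_a, frame_b, top, left, height, width, search):
    """Map each in-bounds displacement (dx, dy) to its SAD cost."""
    rows, cols = len(frame_b), len(frame_b[0])
    costs = {}
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Ignore candidates that fall partly outside the target frame.
            if not (0 <= top + dy <= rows - height):
                continue
            if not (0 <= left + dx <= cols - width):
                continue
            costs[(dx, dy)] = sum(
                abs(frame_a[top + r][left + c]
                    - frame_b[top + dy + r][left + dx + c])
                for r in range(height)
                for c in range(width)
            )
    return costs


def near_best(costs, tolerance=0):
    """Return every displacement whose cost is within `tolerance` of the minimum."""
    best = min(costs.values())
    return sorted(d for d, cost in costs.items() if cost - best <= tolerance)


# Vertical stripes with a period of two pixels: each row is 0, 9, 0, 9, ...
# The scene is static, so the 'physical' displacement is (0, 0).
striped = [[9 if c % 2 else 0 for c in range(8)] for _ in range(6)]
costs = candidate_costs(striped, striped, top=2, left=2, height=2, width=2, search=2)
ties = near_best(costs)
print(len(ties))  # 15
```

In this example every candidate with an even horizontal shift matches perfectly, regardless of the vertical shift, so the cost alone cannot single out the physical displacement (0, 0) among the 15 tied vectors.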