1. Field of the Invention
This invention relates to the field of video processing and more particularly to motion estimation used in video compression.
2. Description of the Related Art
Motion estimation is commonly used by video encoders that compress successive frames of digital video data ("video frames"). When video frames are to be transmitted via a communication medium of limited bandwidth, or are to be stored in a storage medium having limited storage capacity, it is often desirable to compress the digital data that represents each frame, so as to reduce the amount of data that needs to be transmitted or stored.
Motion estimation and motion compensation exploit the temporal correlation that often exists between consecutive video frames. For block-based motion estimation, each input frame is divided into blocks and motion estimation is performed on each block relative to blocks in a reference frame (block matching) to generate a motion vector for each block. These motion vectors are then used to assemble a motion compensated frame. Any difference between the motion compensated frame and the input frame is represented by difference data. Since motion vectors and difference data are typically represented with fewer bits than the pixels that comprise the original image, fewer bits need to be transmitted (or stored) in order to represent the input frame. In some conventional video encoders, the motion vectors (and difference data) are further encoded to generate an encoded bitstream for the video sequence. It is preferred that block matching be accurate, as this will tend to minimize the magnitude of the motion vectors and, especially, the amount of difference data.
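The block matching, motion compensation, and difference data steps described above can be illustrated with a minimal Python sketch. It is not taken from any particular encoder. The sum of absolute differences (SAD) cost, the exhaustive search, the tie-break in favor of shorter vectors, and all function and parameter names (`estimate_motion`, `compensate`, `residual`, `n`, `search`) are assumptions made for illustration. Frames are assumed to be lists of rows of integer pixel values.

```python
def sad(frame, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n input block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(frame[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def estimate_motion(frame, ref, n=4, search=2):
    """Exhaustive block matching: return {(bx, by): (dx, dy)} per n x n block."""
    h, w = len(frame), len(frame[0])
    vectors = {}
    for by in range(0, h - n + 1, n):
        for bx in range(0, w - n + 1, n):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that fall outside the reference frame.
                    if by + dy < 0 or by + dy + n > h:
                        continue
                    if bx + dx < 0 or bx + dx + n > w:
                        continue
                    cost = sad(frame, ref, bx, by, dx, dy, n)
                    # Assumed tie-break: prefer the shorter vector.
                    key = (cost, abs(dx) + abs(dy))
                    if best is None or key < best[0]:
                        best = (key, (dx, dy))
            vectors[(bx, by)] = best[1]
    return vectors


def compensate(ref, vectors, n):
    """Assemble a motion compensated frame from reference blocks."""
    h, w = len(ref), len(ref[0])
    out = [[0] * w for _ in range(h)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(n):
            for x in range(n):
                out[by + y][bx + x] = ref[by + y + dy][bx + x + dx]
    return out


def residual(frame, predicted):
    """Difference data: input frame minus motion compensated frame."""
    return [[f - p for f, p in zip(fr, pr)] for fr, pr in zip(frame, predicted)]
```

In this sketch, a block whose content is an exact displaced copy of reference pixels yields an all-zero residual, which is the case in which the difference data costs the fewest bits.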
A reference frame can be the previous motion compensated frame or a "key" frame, which is an actual frame of video not compressed by motion estimation processing. Many conventional video encoders are designed to transmit a key frame at predetermined intervals, e.g. every 10th frame, or at a scene change.
Often, motion vectors are very similar from block to block. In an ideal video encoding system, during slow camera panning of a static scene, all of the motion vectors (except perhaps those for blocks at the edge of an image) point in the direction of the camera's motion and are of equal magnitude. This allows a video coder to use standard techniques such as run length encoding to further encode the motion vectors.
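As a brief illustration of why uniform motion vectors encode compactly, the following Python sketch run length encodes a sequence of motion vectors. The function name `run_length_encode` and the (vector, count) output layout are assumptions chosen for illustration, not the format of any particular coder.

```python
def run_length_encode(vectors):
    """Collapse consecutive identical motion vectors into (vector, count) pairs."""
    runs = []
    for v in vectors:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [(v, count) for v, count in runs]
```

For a slow horizontal pan in which every block in a row has the same vector, the whole row collapses to a single pair; for example, five copies of `(2, 0)` become one pair with a count of 5.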
Real video encoding systems, on the other hand, generate noise which may be insignificant and unnoticeable to human vision, but which may be detected and treated as real motion by the video coder during motion estimation processing.
An example of such noise is jitter. Jitter typically is random, oscillatory movement of an entire frame along either or both of the x and y axes or about the z axis (rotation). Jitter can cause a static object in an image to appear as if it has moved, when in fact it has not, and can distort actual image movement. Jitter can have significant adverse effects on motion estimation coding. Jitter has a number of causes, the most obvious of which is the physical movement or vibration of a video camera. Jitter also can arise from mechanical faults or imperfections, such as time-base errors induced by the uneven motion of the head drum of a video recorder, and from electrical faults or imperfections, such as supply voltage fluctuations and control system instability in video capture systems.
Motion estimation interprets jitter as legitimate image motion and processes it accordingly. The result is a substantially increased data rate, caused by an increase in the magnitude or number of different motion vectors and a decrease in the run lengths of run length encoded motion vectors.
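The effect described above can be made concrete with a small Python sketch under simplifying assumptions: the scene is static, the jitter is a pure rotation of the whole frame about its center, and an ideal estimator reports each block's displacement rounded to whole pixels. The function names (`rotation_vectors`, `count_runs`) and the frame geometry parameters are hypothetical and chosen only for illustration.

```python
import math


def rotation_vectors(cols, rows, block, angle_rad):
    """Integer motion vectors, in raster order, that a rotation of the frame
    by angle_rad about its center would induce at each block center."""
    cx = cols * block / 2.0
    cy = rows * block / 2.0
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    vectors = []
    for r in range(rows):
        for c in range(cols):
            # Block center relative to the frame center.
            x = c * block + block / 2.0 - cx
            y = r * block + block / 2.0 - cy
            # Position of that point after rotation.
            rx = x * cos_a - y * sin_a
            ry = x * sin_a + y * cos_a
            vectors.append((round(rx - x), round(ry - y)))
    return vectors


def count_runs(vectors):
    """Number of runs a run length encoder would emit for this sequence."""
    runs = 0
    prev = None
    for v in vectors:
        if v != prev:
            runs += 1
            prev = v
    return runs
```

With no rotation, every vector is (0, 0) and the sequence forms a single run. With a rotation of about one degree on an 8 by 6 grid of 16-pixel blocks, blocks far from the center acquire nonzero vectors that differ from those near the center, so the sequence breaks into multiple shorter runs.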
A process and apparatus therefore are needed for characterizing the occurrence of jitter in frames of video, particularly when the video is to be encoded using motion estimation, and for filtering the jitter from the encoded video to prevent it from adversely affecting motion estimation processing.