1. Field of the Invention
The present invention relates to motion estimation and, more particularly, to motion estimation that applies a matching-pixel sub-sampling technique in video compression.
2. Description of the Related Art
Motion Estimation (ME) is a technique commonly used in video compression to reduce temporal redundancy between video frames. Replacing similar, repeated video segments with motion vectors reduces the memory required to encode video files. This allows large video files to be reduced in size, in most cases without drastically degrading video quality.
FIG. 1 is an illustration describing frame motion estimation (frame ME) according to the prior art. As previously stated, the goal of motion estimation is to reduce temporal redundancy between video frames, which include a current frame 120 and a reference frame 110. The current frame 120 is divided into macroblocks (MBs) to be used as video segments for comparison with various candidate MBs in the reference frame 110. The size of the MBs can vary according to the desired tradeoff between computational efficiency and accuracy, but is typically around 16×16 pixels in many algorithms. Once the frame is divided, every macroblock (MB) in the current frame 120 is compared to candidate MBs in the reference frame 110 using a predetermined error measure to determine a best matching candidate MB. A vector denoting the displacement between the matching MBs in the two frames (also called a motion vector) is then determined, and is used to replace the current MB in the current frame 120 in the compression algorithm. Using the illustration in FIG. 1, the baseball macroblock 125 in the current frame 120 can be reasonably matched to the baseball macroblock 115 in the reference frame 110. The motion vector 130 illustrates the displacement of the matching macroblocks from the reference frame 110 to the current frame 120.
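By way of illustration only, the block matching described above can be sketched as an exhaustive search. The function names `sad` and `full_search`, the search range parameter `search`, the use of SAD as the error measure, and the convention that a motion vector is the displacement from the current block position to the matching block position in the reference frame are all assumptions made for this sketch and are not taken from FIG. 1.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, n=16, search=4):
    """Return (motion_vector, error) for the block of `cur` at (cx, cy).

    Every candidate block of `ref` displaced by at most `search` pixels in
    each direction is tested; candidates that fall outside the reference
    frame are skipped. Frames are lists of rows of integer pixel values.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_err = None, None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block lies partly outside the frame
            err = sad(cur, ref, cx, cy, rx, ry, n)
            if best_err is None or err < best_err:
                best_mv, best_err = (mx, my), err
    return best_mv, best_err
```

The cost of such a search grows with both the block size and the search range, since each candidate requires one error computation per compared pixel.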
Because storing motion vectors requires less memory than storing actual macroblock data, memory consumption can be drastically reduced when ME is used in a compression algorithm. During reconstruction, macroblocks in the reference frame indicated by motion vectors are used to predict the current frame. This technique is known as motion compensated prediction or motion compensation. During motion compensation, the matching macroblock in the reference frame that is referenced by the motion vector is copied into the reconstructed frame. Continuing with the example shown in FIG. 1, in motion compensation, baseball macroblock 125, which has been omitted from storage, is replaced in the current frame 120 with baseball macroblock 115 of the reference frame 110 according to motion vector 130.
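As a minimal sketch of the copy step described above, the following assumes one motion vector per block, stored in a hypothetical two-dimensional list `motion_vectors`, with each vector giving the displacement from the block position in the predicted frame to the matching block in the reference frame. The function name `motion_compensate` and this storage layout are illustrative assumptions.

```python
def motion_compensate(ref, motion_vectors, n=16):
    """Predict a frame by copying, for each n x n block, the block of `ref`
    that its motion vector points to.

    `motion_vectors[by][bx]` is the (mx, my) displacement from block
    (bx, by) of the predicted frame to its match in `ref`. Vectors are
    assumed to keep the referenced block inside the frame.
    """
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for by in range(height // n):
        for bx in range(width // n):
            mx, my = motion_vectors[by][bx]
            for dy in range(n):
                for dx in range(n):
                    y, x = by * n + dy, bx * n + dx
                    pred[y][x] = ref[y + my][x + mx]
    return pred
```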
The quality of video compressed using motion estimation can vary according to the algorithm used to find motion vectors. An inaccurate motion vector leads to a dissimilar prediction, which in turn results in poor video quality. An error measure is used to quantify the degree of similarity between macroblocks during block matching for motion estimation. The sum of absolute differences (SAD) and the sum of squared errors (SSE) are among the commonly used error measures applied to block matching.
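For illustration, the two error measures named above can be written as follows for two equally sized blocks given as lists of rows of integer pixel values. The function names `sad` and `sse` and the block representation are assumptions of this sketch.

```python
def sad(block_a, block_b):
    """Sum of absolute differences over corresponding pixels."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def sse(block_a, block_b):
    """Sum of squared errors over corresponding pixels."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )
```

Both measures are zero for identical blocks and grow as the blocks diverge; SSE weights large individual pixel differences more heavily than SAD does.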
While the above frame motion estimation technique is usually applied to progressive video sources, where full frames of video images are continuously shown, the ME operation can also be applied to interlaced video. FIG. 2 illustrates the composition of an interlaced video frame 200 according to the prior art. In interlaced video, an interlaced frame 200 comprises an even field 210 (consisting of the even horizontal lines) and an odd field 220 (consisting of the odd horizontal lines). When performing motion estimation for interlaced video, however, instead of comparing entire macroblocks, one field of a macroblock in a current frame is compared to a field of a macroblock in a reference frame. This is known as field motion estimation (field ME). Both frame ME and field ME can nevertheless be applied to interlaced video, with varying results. When the video is relatively static, with limited motion in the video sequence, frame ME produces better matching results. When the video is more dynamic, with greater motion in the video sequence, field ME produces better matching results. As field ME and interlaced video are well known to those skilled in the art, further discussion is omitted for brevity.
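The field structure described above can be sketched as a simple row split. The sketch assumes that lines are numbered from zero, so that the even field holds lines 0, 2, 4, and so on; the function names `split_fields` and `interleave_fields` are hypothetical.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of rows) into its two fields.

    Rows are assumed to be numbered from 0, so the even field takes rows
    0, 2, 4, ... and the odd field takes rows 1, 3, 5, ...
    """
    even_field = frame[0::2]
    odd_field = frame[1::2]
    return even_field, odd_field


def interleave_fields(even_field, odd_field):
    """Rebuild the interlaced frame from its even and odd fields."""
    frame = []
    for even_row, odd_row in zip(even_field, odd_field):
        frame.append(even_row)
        frame.append(odd_row)
    return frame
```

Under this sketch, field ME would apply an error measure to one field of the current macroblock and one field of a candidate macroblock, each holding half of the rows.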
When determining matching macroblocks, the matching pixels may be sub-sampled to reduce computational complexity. Certain pixels of both the current MB and the candidate MB are selected to compute a characterizing value (usually an SAD value or a mean squared error (MSE) value), and the characterizing values of the candidate MBs are compared. FIG. 3 illustrates an example in which ME is done by matching macroblock pairs of 16×32 pixels using a four-queen sub-sampling pattern 310. In this case, both the current MB pair and the candidate MB pair are of 16×32 pixels. A four-queen pattern 340, which selects 4 pixels out of every 4×4 block, is repeated to form the 16×32 sub-sampling pattern 310. The sub-sampling pattern 310 is applied to both the current and the candidate MB pairs. Only the selected pixels are included in the calculation of error measures.
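A minimal sketch of such sub-sampled matching is given below. The exact pixel positions of the four-queen pattern are not stated in this text, so the tuple `QUEEN_COLS` (one valid four-queens placement, with one sampled pixel per row and per column of each 4×4 block) is an assumption, as are the function names `four_queen_mask` and `subsampled_sad`.

```python
# Assumed placement: in row r of every 4x4 tile, the pixel in column
# QUEEN_COLS[r] is sampled. (1, 3, 0, 2) is one four-queens solution; the
# placement actually drawn in the figure may differ.
QUEEN_COLS = (1, 3, 0, 2)


def four_queen_mask(width=16, height=32):
    """Tile the 4x4 pattern over a width x height block; True = sampled."""
    return [
        [x % 4 == QUEEN_COLS[y % 4] for x in range(width)]
        for y in range(height)
    ]


def subsampled_sad(cur, cand, mask):
    """SAD computed only over the pixels selected by `mask`."""
    total = 0
    for cur_row, cand_row, mask_row in zip(cur, cand, mask):
        for c, r, keep in zip(cur_row, cand_row, mask_row):
            if keep:
                total += abs(c - r)
    return total
```

With this mask, 128 of the 512 pixels of a 16×32 macroblock pair enter the error measure, so each candidate costs roughly one quarter of a full comparison.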
Although the four-queen sub-sampling pattern appears evenly distributed for frame ME, when it is applied to field ME the resulting sub-sampling patterns 320 and 330 become uneven. The lack of a uniform sampling distribution for field ME may therefore yield suboptimal matching results, which reduces the quality of the compressed video.
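The unevenness can be illustrated by counting, per column, the sampled pixels that fall in each field. The sketch below reuses an assumed four-queens placement `QUEEN_COLS`, assumes lines are numbered from zero, and uses a hypothetical function name `field_sample_columns`; the patterns actually shown in the figure may differ in detail.

```python
# Assumed placement of the 4x4 pattern: row r of each tile samples column
# QUEEN_COLS[r]. This is one four-queens solution, chosen for illustration.
QUEEN_COLS = (1, 3, 0, 2)


def field_sample_columns(parity, width=16, height=32):
    """Count sampled pixels per column within one field.

    `parity` is 0 for the even field (rows 0, 2, ...) and 1 for the odd
    field (rows 1, 3, ...), with rows assumed to be numbered from 0.
    """
    counts = [0] * width
    for y in range(parity, height, 2):
        for x in range(width):
            if x % 4 == QUEEN_COLS[y % 4]:
                counts[x] += 1
    return counts
```

Under these assumptions, each field samples only two of every four columns (8 pixels in each of those columns and none in the others), whereas the full frame pattern samples every column equally.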