1. Field of the Invention
This invention relates to the field of motion estimation, and in particular to block-based motion estimation as applied to video image compression.
2. Description of Related Art
Presently, motion estimation is a key component of many video compression techniques. The purpose of motion estimation is to reduce temporal redundancy between frames of a video sequence. A motion estimation algorithm predicts image data for an image frame using one or more previously coded image frames or future frames. A difference image is computed by taking the arithmetic difference between the original pixel data and the corresponding predicted pixel data. A difference image with large variations indicates little or no temporal redundancy between the image frames, whereas a difference image with small variations indicates a high degree of temporal redundancy between the image frames. The difference image is a representation of the image frames with reduced temporal redundancy, which yields better coding efficiency.
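The difference image described above can be illustrated with a minimal sketch. Frames are assumed here to be lists of rows of integer pixel values, and the helper names `difference_image` and `sum_abs_residual` are hypothetical, introduced only for illustration.

```python
def difference_image(original, predicted):
    """Return the per-pixel arithmetic difference original - predicted."""
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, predicted)
    ]


def sum_abs_residual(diff):
    """Sum of absolute residual values: a crude indicator of how much remains to be coded."""
    return sum(abs(v) for row in diff for v in row)


original = [[10, 12], [11, 13]]
good_prediction = [[10, 11], [11, 13]]  # close to the original: high temporal redundancy
poor_prediction = [[0, 40], [90, 2]]    # unrelated to the original: little temporal redundancy

good_residual = difference_image(original, good_prediction)
poor_residual = difference_image(original, poor_prediction)
```

A good prediction leaves a residual with small variations, while a poor prediction leaves a residual with large variations that is more expensive to code.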
One type of motion estimation algorithm is a block-based motion estimation algorithm. Block-based motion estimation algorithms operate on blocks of image data. A block of image data in a current frame is predicted by a block of data from a previous image frame. The motion estimation algorithm outputs a motion vector for the block of image data that specifies the location of the best block match from the previous image frame. In video compression methods, this motion vector information is compressed and transmitted or stored along with the compressed difference data.
International video compression standards such as H.263, MPEG-2, and MPEG-4 allow block-based motion estimation by providing a syntax for specifying motion vectors. These standards do not require specific motion estimation algorithms. Within these compression standards, motion estimation is computed on a base block size of 16×16 pixels denoted as a macroblock. There are allowances to operate on block sizes of 8×8 pixels to estimate motion for smaller image regions.
Motion estimation is one of the most processor-intensive units in a video encoding system. There are a number of existing block-based motion estimation techniques which try to strike a compromise between computational complexity and motion vector efficiency.
Full search motion estimation (FSME) exhaustively compares a block in the current image frame to each pixel position located within a search window of a previously processed frame. The goodness of the block match at each pixel position is determined by measuring its corresponding distortion. A typical distortion measure used in block matching is the sum of absolute differences (SAD) metric:
$$\mathrm{SAD} = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} \left| B^{c}_{nm} - B^{p}_{nm} \right|$$
where $B^{c}$ is the block in the current image frame and $B^{p}$ is a block in the previous image frame. The indices m and n index the pixels within a block of N rows and M columns. A small SAD value corresponds to a good block match and a large SAD value corresponds to a poor block match. Unfortunately, FSME becomes computationally prohibitive as the search window is increased. Another problem with FSME is that its use of the SAD metric requires an excessive number of bits to encode the motion vectors, which results in compression inefficiency.
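The SAD metric and the exhaustive search can be sketched as follows. This is only an illustration under stated assumptions: frames are lists of rows of integer pixel values, the motion vector is expressed as a (row, column) displacement from the current block position to the matching block in the previous frame, and the names `sad`, `full_search`, and `search_range` are hypothetical.

```python
def sad(cur, ref, top, left, dy, dx, n_rows, n_cols):
    """SAD between the current block at (top, left) and the reference block displaced by (dy, dx)."""
    total = 0
    for n in range(n_rows):
        for m in range(n_cols):
            total += abs(cur[top + n][left + m] - ref[top + dy + n][left + dx + m])
    return total


def full_search(cur, ref, top, left, n_rows, n_cols, search_range):
    """Test every displacement within +/- search_range and return (best_mv, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            r, c = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if r < 0 or c < 0 or r + n_rows > height or c + n_cols > width:
                continue
            cost = sad(cur, ref, top, left, dy, dx, n_rows, n_cols)
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (dy, dx), cost
    return best_mv, best_sad


# Toy frames: a 2x2 patch sits at (3, 4) in the previous frame and at (2, 2) in the current frame.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
patch = [[50, 60], [70, 80]]
for n in range(2):
    for m in range(2):
        ref[3 + n][4 + m] = patch[n][m]
        cur[2 + n][2 + m] = patch[n][m]

best_mv, best_sad = full_search(cur, ref, top=2, left=2, n_rows=2, n_cols=2, search_range=2)
```

With a search range of ±R pixels, up to (2R+1)² candidate positions are evaluated per block, each costing N·M absolute differences, which is why the cost grows quickly as the search window is enlarged.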
Presently, there are several low-complexity motion estimation algorithms. All of these algorithms suffer from either offering poorer quality or from not offering enough reduction in computational complexity. There are also a few motion estimation algorithms proposed that offer somewhat improved quality at relatively reduced complexity.
One possible approach is a zonal-based approach. First, a motion vector predictor (PMV) is calculated as a best matching motion vector. Then, a motion vector search following a zonal pattern around the PMV is performed. This is followed by a similar zonal search around a zero motion vector. At every step, a stopping criterion ends the search if a good enough match has been obtained. Unfortunately, this approach does not give consistently good results over a wide range of video sequences.
A motion estimation algorithm called PMVFAST is very similar to the above described zonal approach. However, instead of a zonal search pattern, an iterative diamond search pattern is used. Large or small diamond search patterns can be used depending upon certain criteria. Unfortunately, this approach gives results very similar to those of the zonal approach.
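An iterative diamond search can be sketched generically as shown below. This is not the PMVFAST algorithm itself: the pattern offsets, the stopping rule (stop when no neighbour improves the SAD), and the names `block_sad`, `diamond_search`, `start_mv`, and `max_steps` are all assumptions made for illustration, and the derivation of the starting predictor is left to the caller.

```python
# Illustrative (row, column) offsets around the current best position; assumed shapes.
SMALL_DIAMOND = [(0, -1), (-1, 0), (1, 0), (0, 1)]
LARGE_DIAMOND = [(-2, 0), (-1, -1), (-1, 1), (0, -2), (0, 2), (1, -1), (1, 1), (2, 0)]


def block_sad(cur, ref, top, left, mv, size):
    """SAD for a size x size block; candidates outside the reference frame cost infinity."""
    dy, dx = mv
    r, c = top + dy, left + dx
    if r < 0 or c < 0 or r + size > len(ref) or c + size > len(ref[0]):
        return float("inf")
    return sum(
        abs(cur[top + n][left + m] - ref[r + n][c + m])
        for n in range(size)
        for m in range(size)
    )


def diamond_search(cur, ref, top, left, size, start_mv=(0, 0),
                   pattern=SMALL_DIAMOND, max_steps=16):
    """Repeatedly move to the best neighbour in the pattern until no neighbour improves."""
    best_mv = start_mv
    best_cost = block_sad(cur, ref, top, left, best_mv, size)
    for _ in range(max_steps):
        candidates = [(best_mv[0] + dy, best_mv[1] + dx) for dy, dx in pattern]
        cost, mv = min((block_sad(cur, ref, top, left, cand, size), cand) for cand in candidates)
        if cost >= best_cost:
            break  # the centre is at least as good as every neighbour
        best_mv, best_cost = mv, cost
    return best_mv, best_cost


# Toy frames: a smooth ramp in the previous frame, with the current 2x2 block at (3, 3)
# copied from position (5, 6), so the true displacement is (2, 3).
ref = [[10 * y + x for x in range(10)] for y in range(10)]
cur = [[0] * 10 for _ in range(10)]
for n in range(2):
    for m in range(2):
        cur[3 + n][3 + m] = ref[5 + n][6 + m]

mv, cost = diamond_search(cur, ref, top=3, left=3, size=2)
```

On this smooth toy surface the search reaches the true displacement after visiting only a handful of candidates, whereas an exhaustive search would test every position in the window. On real video the error surface can have local minima, so a search of this kind may stop at a suboptimal vector.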