1. Field of the Invention
The invention relates to a method and related apparatus for motion estimation in a video compression system, and more particularly, to a method and related apparatus for motion estimation using a cost function.
2. Description of the Prior Art
As multimedia technology develops, more and more standards related to video compression have been introduced. For instance, various versions of MPEG are standards for digital video compression, and ITU H.261, H.263, ISO 10918 are other examples.
MPEG defines a standard for digital video compression. A motion picture is composed of a series of pictures, and each picture, called a frame of the motion picture, can be regarded as a two-dimensional array of pixels. The MPEG standard defines four picture types: the I picture, which is encoded without referring to any other picture; the P picture, which is encoded through motion estimation referring to a previous I picture or P picture; the B picture, which is encoded through motion estimation referring to a previous and/or a following I picture or P picture; and the D picture, which is used in fast forward search mode.
Video compression systems complying with the standards mentioned above utilize block-based or macroblock-based motion estimation in order to reduce temporal redundancy. During motion estimation, for a current encoding block in a current picture, the video compression system finds, in a target picture, the best matching block, i.e. the block most similar to the current encoding block. The system can then represent the data of the current encoding block by storing (or transmitting) only the motion vector and the residual, where the motion vector is the displacement from the current encoding block to the best matching block and the residual is the pixel value difference between the two blocks.
According to the prior art, when the video compression system searches a search range for the best matching block, it uses a cost function called the “sum of absolute difference” (SAD), which is computed as follows:
$$\mathrm{SAD}(x, y) = \sum_{i=i_0}^{i_1} \sum_{j=j_0}^{j_1} \left| C_{i,j} - P_{i+x,\,j+y} \right|$$
Here, (x, y) is a candidate motion vector in the search range, (i1−i0)*(j1−j0) is the size of the current encoding block, C_{i,j} is a pixel in the current encoding block, and P_{i+x,j+y} is a pixel in the search range of the target picture.
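The cost function above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sad`, the list-of-lists picture layout indexed as `picture[i][j]`, and the treatment of the summation bounds as inclusive are assumptions made for the example, not details taken from the prior art.

```python
def sad(current, target, i0, i1, j0, j1, x, y):
    """Sum of absolute differences between the current encoding block and
    the candidate block displaced by (x, y) in the target picture.

    Pictures are lists of lists indexed as picture[i][j]. The bounds
    i0..i1 and j0..j1 are treated as inclusive (an assumption).
    """
    total = 0
    for i in range(i0, i1 + 1):
        for j in range(j0, j1 + 1):
            total += abs(current[i][j] - target[i + x][j + y])
    return total


current = [
    [10, 10, 10, 10],
    [10, 50, 60, 10],
    [10, 70, 80, 10],
    [10, 10, 10, 10],
]
# Same content, shifted by one position along the second index.
target = [
    [10, 10, 10, 10],
    [10, 10, 50, 60],
    [10, 10, 70, 80],
    [10, 10, 10, 10],
]

sad_zero = sad(current, target, 1, 2, 1, 2, 0, 0)   # 120: no displacement
sad_shift = sad(current, target, 1, 2, 1, 2, 0, 1)  # 0: exact match
```

In this toy data the candidate motion vector (0, 1) points at an identical block, so its cost is zero, while the zero displacement leaves a large difference.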
The conventional video compression system selects the candidate motion vector (x, y) that minimizes the cost function as the optimal motion vector (x1, y1) of the current encoding block. The aim is to find the best matching block with the smallest residual so that the residual can be compressed well. However, the optimal motion vector (x1, y1) found in this way may not yield better compression. U.S. Pat. No. 5,847,776 therefore discloses another cost function that, while searching for the optimal motion vector, considers not only the sum of absolute difference but also the volume of the motion vector, so that a balance is kept between the optimal motion vector found and its corresponding residual.
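The search described above can be illustrated with an exhaustive scan over a small window. The sketch below is hedged in two ways: the helper names (`sad`, `full_search`), the square ±`search` window, and the picture layout are assumptions, and the `lam * (|x| + |y|)` term is a hypothetical stand-in for a motion vector penalty, not the cost function actually disclosed in U.S. Pat. No. 5,847,776.

```python
def sad(current, target, i0, i1, j0, j1, x, y):
    """Sum of absolute differences for candidate motion vector (x, y)."""
    return sum(
        abs(current[i][j] - target[i + x][j + y])
        for i in range(i0, i1 + 1)
        for j in range(j0, j1 + 1)
    )


def full_search(current, target, i0, i1, j0, j1, search, lam=0):
    """Return (cost, (x, y)) for the candidate that minimizes
    SAD(x, y) + lam * (|x| + |y|) over a +/-search window.

    lam = 0 gives the plain SAD criterion. The lam term is a hypothetical
    motion vector penalty used only for illustration.
    """
    rows, cols = len(target), len(target[0])
    best = None
    for x in range(-search, search + 1):
        for y in range(-search, search + 1):
            # Skip candidates that fall outside the target picture.
            if i0 + x < 0 or i1 + x >= rows or j0 + y < 0 or j1 + y >= cols:
                continue
            cost = sad(current, target, i0, i1, j0, j1, x, y)
            cost += lam * (abs(x) + abs(y))
            if best is None or cost < best[0]:
                best = (cost, (x, y))
    return best


current = [
    [10, 10, 10, 10],
    [10, 50, 60, 10],
    [10, 70, 80, 10],
    [10, 10, 10, 10],
]
target = [
    [10, 10, 10, 10],
    [10, 10, 50, 60],
    [10, 10, 70, 80],
    [10, 10, 10, 10],
]

plain = full_search(current, target, 1, 2, 1, 2, search=1)               # (0, (0, 1))
penalized = full_search(current, target, 1, 2, 1, 2, search=1, lam=200)  # (120, (0, 0))
```

With `lam = 0` the search returns the vector with the smallest residual. With a deliberately large penalty the zero vector wins despite its larger residual, which is the kind of trade-off between motion vector and residual that the paragraph describes.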
However, when compressing the residual, most video compression systems use a discrete cosine transform (DCT) to transform the residual from the spatial domain into the frequency domain. The system then quantizes the frequency-domain residual using a corresponding quantization matrix and a quantization step Qp, which changes according to the bit rate selected by the system. Since the quantized result is a two-dimensional matrix, the system further uses a zig-zag scan or an alternate scan to convert the quantized two-dimensional data into one-dimensional data. Finally, the video compression system applies variable length coding.
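The transform, quantization, and scan stages described above can be sketched as follows. This is a simplified illustration under stated assumptions: an orthonormal DCT-II computed directly, a quantizer that divides by the matrix weight times Qp and rounds, and a zig-zag order generated from anti-diagonals. Real codecs define their own scaling, rounding, and scan tables, and the variable length coding stage is not modeled here.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for i in range(n):
                for j in range(n):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, qmatrix, qp):
    """Divide each coefficient by (matrix weight * qp) and round.
    A simplified quantizer, assumed for illustration."""
    n = len(coeffs)
    return [[int(round(coeffs[u][v] / (qmatrix[u][v] * qp))) for v in range(n)]
            for u in range(n)]


def zigzag(matrix):
    """Scan an N x N matrix along anti-diagonals, alternating direction."""
    n = len(matrix)
    order = sorted(
        ((u, v) for u in range(n) for v in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [matrix[u][v] for u, v in order]


residual = [[2] * 4 for _ in range(4)]   # flat residual block
qmatrix = [[1] * 4 for _ in range(4)]    # uniform weights (assumption)
scanned = zigzag(quantize(dct_2d(residual), qmatrix, qp=2))
index_scan = zigzag([[4 * u + v for v in range(4)] for u in range(4)])
```

For the flat residual, all of the energy lands in the first (DC) coefficient, so the scanned sequence is one nonzero value followed by zeros, the kind of sequence a variable length coder can represent compactly.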
During variable length coding, the smaller the frequency distribution range of the residual in the frequency domain, the shorter the code length of the encoded residual (i.e. the better the residual is compressed). However, neither the prior art nor the method disclosed in U.S. Pat. No. 5,847,776 can find the best matching block whose residual has the smallest frequency distribution range. Even when the best matching block found yields the spatial-domain residual with the smallest sum of absolute difference, that residual will not necessarily have the shortest code length after DCT, quantization, zig-zag scan (or another scan method), and variable length coding, meaning that optimal compression cannot be achieved. This is a main problem in the prior art.
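A small numerical sketch may make this problem concrete. It compares two hypothetical 4×4 residuals, using the count of nonzero quantized DCT coefficients as a rough proxy for code length. The proxy, the orthonormal DCT scaling, the uniform quantization step, and the names used are all assumptions for illustration, and no actual variable length coding is performed.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block."""
    n = len(block)
    # basis[k][m] = a(k) * cos((2m + 1) * k * pi / (2n))
    basis = [[math.sqrt((1 if k == 0 else 2) / n)
              * math.cos((2 * m + 1) * k * math.pi / (2 * n))
              for m in range(n)] for k in range(n)]
    return [[sum(basis[u][i] * basis[v][j] * block[i][j]
                 for i in range(n) for j in range(n))
             for v in range(n)] for u in range(n)]


def nonzero_after_quantization(residual, qp):
    """Count quantized DCT coefficients that are nonzero (a rough proxy
    for code length, assumed for illustration)."""
    return sum(1 for row in dct_2d(residual) for c in row
               if round(c / qp) != 0)


def sad_of(residual):
    """Sum of absolute values of a residual block."""
    return sum(abs(p) for row in residual for p in row)


flat = [[1] * 4 for _ in range(4)]    # small error spread evenly
spike = [[0] * 4 for _ in range(4)]
spike[0][0] = 12                      # one large isolated error

flat_sad, spike_sad = sad_of(flat), sad_of(spike)
flat_nz = nonzero_after_quantization(flat, qp=2)
spike_nz = nonzero_after_quantization(spike, qp=2)
```

Here the isolated-error residual has the smaller sum of absolute difference (12 versus 16), yet its energy spreads across many frequency coefficients, whereas the flat residual collapses into a single coefficient. Under this proxy, the block with the lower SAD is the more expensive one to encode.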