1. Field of the Invention
The present invention relates to a super high-speed motion estimating method for real time moving image coding. More particularly, the invention relates to a motion estimating method that reduces the complexity of calculating a motion vector by lowering the resolution for motion picture coding and determining a plurality of motion vector candidates at the lowered resolution. The motion vector candidates may be determined by using the motion vector correlation of neighboring blocks. After the candidates are determined, search areas are selected having the plurality of motion vector candidates as centers, and a final motion vector is calculated based on the search areas. Also, an apparatus for performing the method is provided.
2. Description of the Related Art
Motion compensation coding for removing a temporal duplication of a moving image is used in order to obtain a high data compression rate. Such coding plays an important role in International Video Coding Standards such as the MPEG-1, 2, and 4 Standards or the H-263 Standard.
The motion compensation coding predicts an image that is the most similar to a received image based on information of a previous frame of the image. Specifically, motion compensation coding uses motion estimation and conversion codes and obtains a subtraction image by subtracting an estimated image from the received image. The subtraction image is processed and encoded so that it represents a compressed version of the moving image.
A general apparatus that employs moving image coding is shown in FIG. 1.
As shown in the figure, the apparatus includes a frame memory 102, motion estimators 104 and 106, a motion compensator 108, a subtracter 110, a discrete cosine transformer 112, a quantizer 114, an inverse quantizer 116, an inverse discrete cosine transformer 118, an adder 120, a frame delay 122, a forward analysis and coding rate controller 124, a variable length encoder 126, and a buffer 128.
A received image is input in units of a frame and is stored in the frame memory 102. Then, the image frame is output to the first motion estimator 104, which calculates a first motion vector based on the image. The first motion vector has units of an integer number of pixels. The second motion estimator 106 calculates a second motion vector using the first motion vector generated by the first motion estimator 104, the image frame received from the frame memory 102, and information of a previous frame of the image received from the frame delay 122. The second motion vector has units of a half-pixel.
The motion compensator 108 inputs the motion vector from the second motion estimator 106 and the information of the previous frame from the frame delay 122, performs a motion compensation operation based on such inputs, and outputs an estimated image frame with respect to a current frame of the image. The subtracter 110 inputs the current image frame from the frame memory 102 and the estimated image frame from the motion compensator 108 and subtracts the estimated image frame from the current image frame to produce a subtracted image frame. As a result, the subtracted image frame is a frame in which the temporal duplication of the moving image is removed.
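The prediction and subtraction described above can be sketched for a single block as follows. This is only an illustrative sketch, not the circuit of FIG. 1: the function names `predict_block` and `residual_block` are hypothetical, frames are assumed to be lists of rows of brightness values, and the motion vector is assumed to be an integer-pixel displacement (i, j), so the half-pixel refinement of the second motion estimator 106 is omitted.

```python
def predict_block(prev, top, left, mv, n):
    """Copy the n x n block of the previous frame displaced by mv = (i, j).

    (top, left) is the top-left corner of the block in the current frame;
    the displaced block is assumed to lie entirely inside `prev`.
    """
    i, j = mv
    return [[prev[top + k + i][left + l + j] for l in range(n)]
            for k in range(n)]


def residual_block(cur, pred, top, left, n):
    """Subtract the estimated block from the current block, pixel by pixel."""
    return [[cur[top + k][left + l] - pred[k][l] for l in range(n)]
            for k in range(n)]
```

When the motion vector points at a block of the previous frame that matches the current block exactly, every entry of the residual is zero, which is the sense in which the temporal duplication is removed.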
The above motion estimating and compensating processes are performed in units of a 16×16 block, and such a block is generally referred to as a macro block. After the subtraction image is generated, it is output to the discrete cosine transformer 112 and subjected to a discrete cosine transformation. Then, the image is output to the quantizer 114 and quantized. As a result, any remaining spatial duplication of the subtraction image is removed. The motion vectors and the quantized subtraction image are encoded by the variable length encoder 126 and are transferred as a bit stream through the buffer 128.
The quantized subtraction image is also interpolated and restored by the inverse quantizer 116 and the inverse discrete cosine transformer 118. The restored image is added to the estimated image generated by the motion compensator 108 via the adder 120, and the resultant signal is stored in the frame delay 122 and delayed by one frame. The image stored in the frame delay 122 corresponds to the previous image frame, that is, the frame that immediately precedes the current image frame output by the frame memory 102. The previous image frame stored in the frame delay 122 is output to the second motion estimator 106 and the motion compensator 108 as described above.
The forward analysis and coding rate controller 124 inputs the current image frame from the frame memory 102 and controls the coding rate of the variable length encoder 126.
Currently, a method for estimating and compensating the motion of a moving image in units of a frame and a method for estimating and compensating the motion of a moving image in units of a field are known to those skilled in the art. Therefore, a description of such methods is omitted in the present specification for the sake of brevity.
One conventional method for estimating motion is called a full-scale block matching analysis (“FSBMA”). In such analysis, a two-dimensional motion vector of each block is estimated by dividing a current frame into blocks having a uniform size. Then, the respective blocks are compared with all the blocks in a search region of a reference frame according to a given matching standard, and the position of an optimal matching block is determined. A mean absolute difference (“MAD”) is a relatively simple calculation and is used as a matching standard for determining the optimal matching block in such a conventional block matching method. The MAD is calculated using Equation 1.

$$\mathrm{MAD}(i,j) = \frac{1}{N^{2}} \sum_{k=1}^{N} \sum_{l=1}^{N} \left| f_{t}(k,l) - f_{t-1}(k+i,\, l+j) \right| \qquad (1)$$
wherein f_t(k, l) is the brightness value of a pixel in a position (k, l) of the current frame, and f_{t-1}(k+i, l+j) is the brightness value of a pixel of the previous frame in a position offset from the position (k, l) by a distance (i, j).
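As an illustration of Equation 1, the MAD of one block can be computed as sketched below. The function name `mad` and its parameters are hypothetical. The sketch assumes that frames are lists of rows of brightness values, that k indexes rows and l indexes columns, that indices are zero-based rather than starting at 1, and that (top, left) is the top-left corner of the N×N block in the current frame.

```python
def mad(cur, prev, top, left, i, j, n):
    """Mean absolute difference of Equation 1 for displacement (i, j).

    Compares the n x n block of `cur` at (top, left) with the block of
    `prev` shifted by i rows and j columns. The shifted block is assumed
    to lie entirely inside `prev`.
    """
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(cur[top + k][left + l]
                         - prev[top + k + i][left + l + j])
    return total / (n * n)
```

A MAD of zero means the displaced block of the previous frame reproduces the current block exactly; larger values indicate a poorer match.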
In such a block matching method, the maximum motion estimation scope is determined by considering the motion of real images when the coding is performed. The FSBMA estimates the motion vector by comparing all the blocks in the motion estimation scope with the current block and has the highest performance in terms of estimation gain. However, an excessive amount of calculation is required to perform the FSBMA. For example, when the maximum movement displacement in a frame is ±p (pixels/frame) with respect to a block having a size of M×N, the size of the search region is (M+2p)×(N+2p) in a reference frame. Since the number of candidate blocks in the region to be compared with the current block is (2p+1)^2, it becomes more difficult to perform real time moving image encoding as p becomes larger.
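The exhaustive comparison described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the names `block_mad` and `full_search` are hypothetical, the block is assumed to be square (n×n), the displacement (i, j) is assumed to count rows and columns, and candidates that would fall outside the reference frame are simply skipped, which is an assumption the text does not address.

```python
def block_mad(cur, ref, top, left, i, j, n):
    """Mean absolute difference between the current block and the
    reference block displaced by (i, j)."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(cur[top + k][left + l]
                         - ref[top + k + i][left + l + j])
    return total / (n * n)


def full_search(cur, ref, top, left, n, p):
    """Test every displacement in [-p, p] x [-p, p] and keep the best one.

    Returns (best_vector, best_cost, candidates_tested).
    """
    height, width = len(ref), len(ref[0])
    best_vec, best_cost, tested = None, float("inf"), 0
    for i in range(-p, p + 1):
        for j in range(-p, p + 1):
            # Skip candidates that do not fit inside the reference frame.
            if (top + i < 0 or left + j < 0
                    or top + i + n > height or left + j + n > width):
                continue
            tested += 1
            cost = block_mad(cur, ref, top, left, i, j, n)
            if cost < best_cost:
                best_vec, best_cost = (i, j), cost
    return best_vec, best_cost, tested
```

When every candidate fits inside the reference frame, the count returned is (2p+1)^2. For p = 16, for instance, that is 1089 candidates, each of which requires 256 absolute differences for a 16×16 block.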
Another conventional method for solving such a problem is provided in “A Fast Hierarchical Motion Vector Estimation Algorithm Using Mean Pyramid”, K. M. Nam, J. S. Kim, R. H. Park, Y. S. Shim, IEEE Trans. of Circuits and Systems for Video Technology, 1995, 5, (4), pp. 344-351 and “Accuracy Improvement And Cost Reduction of 3-step Search Region Matching Algorithm for Video Coding”, IEEE Trans. Circuits and Systems for Video Technology, 1994, 4, (1), pp. 88-90. In the above documents, high-speed hierarchical search methods that use a plurality of candidates and that can replace the FSBMA are described.
Such methods of using a plurality of candidates can solve the problem of a local minimum value which occurs due to a hierarchical search. However, a large amount of calculation is still required in order to achieve a performance comparable to the performance of the FSBMA. Also, since the methods are based on a three-step hierarchical searching method, they are not suitable for estimating a motion in a wide search region.
To solve the above problems, it is an objective of the present invention to provide a motion estimating method which estimates a motion vector at a high speed by reducing the amount of calculation for calculating the motion vector.
It is another objective of the present invention to provide a motion estimating apparatus that performs the motion vector estimating method.
In order to achieve the above and other objectives, a motion estimating method is provided that uses block matching in order to compress a moving image. The method comprises the steps of: (a) inputting an input image frame as a layer 0 frame; (b) reducing a resolution of said layer 0 frame to produce a layer 1 frame; (c) reducing a resolution of said layer 1 frame to produce a layer 2 frame; (d) calculating a first mean absolute difference (“MAD”) with respect to a layer 2 search region of said layer 2 frame; (e) evaluating said layer 2 search region to identify a first initial search center point based on said first MAD and determining a first layer 1 search region based on said first initial search center point; (f) calculating a second MAD with respect to said first layer 1 search region in said layer 1 frame by using said first initial search center point as a center in said first layer 1 search region; (g) evaluating said first layer 1 search region to identify a first layer 0 search center point based on said second MAD and determining a first layer 0 search region based on said first layer 0 search center point; (h) calculating a third MAD with respect to said first layer 0 search region in said layer 0 frame by using said first layer 0 search center point as a center in said first layer 0 search region; and (i) determining a final position in said first layer 0 search region based on said third MAD and determining a final motion vector based on information corresponding to a distance between said final position and an origin.
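Steps (a) through (i) can be sketched for a single block as shown below. This is a simplified illustration under several assumptions that the text above does not state: the resolution is halved at each layer by averaging 2×2 pixel groups, only one search center point is carried from layer to layer, the layer 2 search covers ±`radius2` pixels, and the layer 1 and layer 0 searches cover ±1 pixel around the center point scaled up from the layer below. All function and parameter names are hypothetical.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2x2 pixel group (assumed filter)."""
    return [[(frame[r][c] + frame[r][c + 1]
              + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
             for c in range(0, len(frame[0]) - 1, 2)]
            for r in range(0, len(frame) - 1, 2)]


def block_mad(cur, ref, top, left, i, j, n):
    """Mean absolute difference for the block displaced by (i, j)."""
    total = 0
    for k in range(n):
        for l in range(n):
            total += abs(cur[top + k][left + l]
                         - ref[top + k + i][left + l + j])
    return total / (n * n)


def search(cur, ref, top, left, n, center, radius):
    """Return the displacement with the lowest MAD within `radius` of `center`."""
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = center, float("inf")
    for i in range(center[0] - radius, center[0] + radius + 1):
        for j in range(center[1] - radius, center[1] + radius + 1):
            # Skip candidates that do not fit inside the reference frame.
            if (top + i < 0 or left + j < 0
                    or top + i + n > height or left + j + n > width):
                continue
            cost = block_mad(cur, ref, top, left, i, j, n)
            if cost < best_cost:
                best_vec, best_cost = (i, j), cost
    return best_vec


def hierarchical_search(cur0, ref0, top, left, n=16, radius2=1):
    """Three-layer coarse-to-fine motion search for one n x n block."""
    # Steps (a)-(c): build the layer 1 and layer 2 frames.
    cur1, ref1 = downsample(cur0), downsample(ref0)
    cur2, ref2 = downsample(cur1), downsample(ref1)
    # Steps (d)-(e): search layer 2 around the origin.
    v2 = search(cur2, ref2, top // 4, left // 4, n // 4, (0, 0), radius2)
    # Steps (f)-(g): refine in layer 1 around the scaled layer 2 result.
    v1 = search(cur1, ref1, top // 2, left // 2, n // 2,
                (2 * v2[0], 2 * v2[1]), 1)
    # Steps (h)-(i): refine in layer 0; the result is the final motion vector.
    return search(cur0, ref0, top, left, n, (2 * v1[0], 2 * v1[1]), 1)
```

With these assumed radii, the sketch evaluates (2·radius2+1)^2 + 9 + 9 candidates per block, and the two coarser layers use blocks of 1/16 and 1/4 the pixel count, which is where the reduction in calculation relative to an exhaustive search comes from. A coarse-to-fine search of this kind can settle on a local minimum, which is the motivation for carrying more than one candidate between layers.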
An apparatus for performing the method and a computer readable medium containing a program for performing the method are also provided.