1. Field of the Invention
The present invention relates generally to communication technology and, more particularly, to video compression technology.
2. Background of the Invention
Motion compensated video coding is an efficient video compression technique. Motion compensated video coding exploits the temporal redundancy between successive video frames by motion estimation. Selected among different motion estimation techniques, block-based motion estimation was adopted in the MPEG-4 standard (a multimedia network standard of the Moving Pictures Expert Group), and the ITU/T H.263 video coding standard. Block-based motion estimation is efficient and easily implemented for both hardware and software. In block-based video coding, video frames are divided into blocks. Each block is associated with a vector (i.e., a motion vector) to describe the location of the block in the reference frame that provides the best match under some block distortion measure (BDM). The block in the reference frame that provides the best match is used to predict the current block in motion compensated video coding. By encoding the motion vectors and possibly the prediction residues, the video sequence is compressed with high compression efficiency (e.g., because the entropy of the prediction residue plus that of the motion vector is lower than the entropy of the original video frame).
However, block-based motion estimation is a computationally expensive process. A full search (FS) motion estimation, which searches all the possible candidates within a predetermined search window (e.g., a 15xc3x9715 window with zero motion as the center of the window) in the reference frame for the best estimation, can use up to 70% of the video encoding time. Various fast motion estimation techniques have been developed for various practical applications of video encoding. However, the fast motion techniques that have been developed are sub-optimal, because they are susceptible to being trapped in local optima (e.g., local minima) and do not provide as high of quality (i.e., accuracy of motion estimation) as the FS motion estimation.
Accordingly, the present invention provides fast block motion estimation that is computationally efficient, but avoids the risk of being trapped in local optima and approaches the accuracy of the FS motion estimation. In one embodiment, a process for video compression using fast block motion estimation includes: searching for a first minimum block distortion measure (BDM) point in a first search window using a first sampling grid of a second search window, in which the second search window includes a subset of search locations contained in the first search window; and searching for a second minimum BDM point in the first search window using a second sampling grid of a third search window, if the first minimum BDM point is located in a center of the second search window, in which the third search window includes a subset of the search locations contained in the first search window, and the third search window is centered according to a location of the first minimum BDM point. The process can also include searching for a third minimum BDM point in the first search window using a third sampling grid of a fourth search window, if the first minimum BDM point is not located in the center of the second search window, in which the fourth search window includes a subset of the search locations contained in the first search window, and in which the fourth search window is centered according to the location of the first minimum BDM point. The process can also include searching for a fourth minimum BDM point in the first search window using a fourth sampling grid of a fifth search window, if each previously located minimum BDM point of the first search window is located outside of the second search window, in which the fifth search window includes a subset of search locations contained in the first search window, and the fifth search window is centered according to the location of the third minimum BDM point. The process can also include searching for a fifth minimum BDM point in the first search window using a fifth sampling grid of a sixth search window, in which the sixth search window includes a subset of search locations contained in the first search window, and the sixth search window is centered according to an overall minimum BDM point. The search windows and the search locations can overlap, and overlapped search locations can be skipped for computational efficiency. Also, the sampling grids can each include different sub-sampling patterns of search locations of the first search window. In one embodiment, the first search window includes 17xc3x9717 search locations, the second search window includes 7xc3x977 search locations, the third search window includes 3xc3x973 search locations, the fourth search window includes 5xc3x975 search locations, the fifth search window includes 5xc3x975 search locations, and the sixth search window includes 3xc3x973 search locations.
In one embodiment, the first search window includes more than 17xc3x9717 search locations, and the process further includes: searching for an intermediate BDM point in the first search window using the fourth sampling grid of the fifth search window, if the previously located minimum BDM point is located on the fourth sampling grid of a previous search window, in which a current search window is centered according to the location of the previously located minimum BDM point; and repeating the search for the intermediate BDM point for a predefined number of repetitions, if the previously located minimum BDM point is located on the fourth sampling grid of the previous search window. For example, the first search window size equals a video frame size for global motion estimation, a video frame is interpolated, and the search windows operate on an interpolated frame for sub-pixel motion estimation.
Other aspects and advantages of the present invention will become apparent from the following detailed description and accompanying drawings.