Video data compression reduces storage space or transmission bandwidth by removing redundant data that can be easily restored. Typically, video data consists of a series of frames in which much of the image remains substantially similar from one frame to the next as objects in the video frames move in the field. By tracking the moving objects, the amount of data recorded can be reduced to the changes in the objects from one frame to another, while the rest of the data can be substituted with data recorded for a previous frame.
An existing gradient-descent method to estimate how a block of pixels in a video frame will move in a subsequent frame is commonly referred to as full-pixel diamond search. A full-pixel diamond search performs a motion search using two levels as shown in FIGS. 1A and 1B. Referring to FIG. 1A, measurements of the pixels are collected at the 9 search points 103 within a diamond search pattern in the first level of the motion search. The measurements at the motion search points are compared with each other to determine how closely the block of pixels at each search point in a reference frame matches the block of pixels in the current frame. The reference frame may be a prior frame in the video data. If the closest point is one of the eight outer points within the diamond search pattern, the search pattern is shifted to a new position centered at that point. For example, when the diamond search pattern is shifted to the right by 2 grid lines, the new search pattern has 4 search points in common with the old search pattern and 5 new search points, so only 5 new measurements are performed. No measurement is necessary for the search points common to both search patterns because the data from previous measurements can be reused.
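The reuse of measurements when the pattern shifts can be checked with a short sketch. FIG. 1A is not reproduced here, so the layout in `LARGE_DIAMOND` (a center, four vertices 2 grid lines away, and four diagonal points) is an assumption, and the function names are hypothetical.

```python
# Assumed 9-point diamond layout: center, four vertices at distance 2,
# and four diagonal points. Offsets are (dx, dy) in grid lines.
LARGE_DIAMOND = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
                 (1, 1), (1, -1), (-1, 1), (-1, -1)]

def pattern_at(center):
    """Return the set of search points for a pattern centered at `center`."""
    cx, cy = center
    return {(cx + dx, cy + dy) for dx, dy in LARGE_DIAMOND}

def points_to_measure(old_center, new_center):
    """Search points of the shifted pattern that were not measured before."""
    return pattern_at(new_center) - pattern_at(old_center)

# Shift the pattern to the right by 2 grid lines.
old_pattern = pattern_at((0, 0))
new_pattern = pattern_at((2, 0))
shared = old_pattern & new_pattern
fresh = points_to_measure((0, 0), (2, 0))
print(len(shared), "points reused,", len(fresh), "points newly measured")
```

Under this assumed layout, the shift by 2 grid lines leaves 4 points shared and 5 points to be measured, consistent with the counts given above.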
If the best measurement is at the center point, the next level search is performed as shown in FIG. 1B. At this level, four new motion search points 110 are measured and the data of the point 105 at the center from previous measurements is reused. The motion search point with the smallest measurement is the full-pixel search position.
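The two levels described above may be sketched as follows. This is a minimal illustration and not a definitive implementation: the point layouts, the `cost` callback (standing in for any block-matching measurement), the `max_shifts` bound, and the measurement cache are all assumptions introduced for the example.

```python
# Assumed layouts: 9-point pattern for the first level (FIG. 1A) and a
# center plus four neighbors for the second level (FIG. 1B).
LARGE_DIAMOND = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
                 (1, 1), (1, -1), (-1, 1), (-1, -1)]
SMALL_DIAMOND = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def diamond_search(cost, start=(0, 0), max_shifts=64):
    """Return (best_position, number_of_measurements).

    `cost` maps a candidate position (x, y) to a measurement; smaller
    values mean a closer match.
    """
    cache = {}

    def measure(point):
        # Reuse earlier measurements for points shared between patterns.
        if point not in cache:
            cache[point] = cost(point)
        return cache[point]

    center = start
    for _ in range(max_shifts):
        candidates = [(center[0] + dx, center[1] + dy) for dx, dy in LARGE_DIAMOND]
        best = min(candidates, key=measure)  # ties favor the center (listed first)
        if best == center:
            break          # best measurement is at the center: go to next level
        center = best      # otherwise shift the pattern to the best outer point

    candidates = [(center[0] + dx, center[1] + dy) for dx, dy in SMALL_DIAMOND]
    best = min(candidates, key=measure)
    return best, len(cache)

# Hypothetical cost surface with a single minimum at (5, -3).
position, evaluations = diamond_search(lambda p: abs(p[0] - 5) + abs(p[1] + 3))
print(position, evaluations)
```

The cache keeps every earlier measurement, so a point shared by successive patterns is never measured twice. The hypothetical cost surface has a single minimum; a real measurement surface may have several local minima.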
The diamond search is a fast algorithm because shifting the pattern by 2 grid lines requires as few as 5 new measurements. However, the irregular shape of the diamond search pattern complicates operations on the measurement data stored in a register file. The problems of the diamond search include the difficulty of keeping track of the address offsets of the data, the inefficient use of storage space, and the possibility of missing true minima due to its sparse sampling pattern.
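The storage and addressing cost of the irregular shape can be illustrated with a small sketch. It assumes, purely for illustration, that a diamond-shaped region of a given radius is stored row by row inside its square bounding box; the function name and this layout are hypothetical.

```python
def diamond_row_spans(radius):
    """Return (first_column_offset, width) for each row of a diamond region."""
    spans = []
    for dy in range(-radius, radius + 1):
        half_width = radius - abs(dy)
        spans.append((radius - half_width, 2 * half_width + 1))
    return spans

spans = diamond_row_spans(2)
used_entries = sum(width for _, width in spans)
allocated_entries = (2 * 2 + 1) ** 2  # square bounding box of the diamond

for row, (offset, width) in enumerate(spans):
    print(f"row {row}: starts at column {offset}, {width} entries")
print(used_entries, "of", allocated_entries, "entries hold data")
```

Each row starts at a different column offset and has a different width, and under this assumed layout only 13 of the 25 allocated entries hold data.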
Because of the large amount of computation required, motion search is usually performed by a parallel processor. One such parallel processor uses Single Instruction Multiple Data (SIMD) operations. One measurement used in motion search is the Sum of Absolute Differences (SAD), a measure of the L1-norm of the difference between the block in the current frame and a block in the reference frame. Another measurement is the Sum of Squared Differences (SSD), a measure of the square of the L2-norm (Euclidean norm) of the difference between the block in the current frame and a block in the reference frame. The SAD or SSD measurement on multiple data entries may be performed by a SIMD operation. A typical parallel processor has a large register file in order to reduce the number of data accesses to the memory hierarchy. Data of the diamond search within a search region is usually stored in the register file. The parallel processor may access multiple data entries in the register file in a parallel operation, such as, for example, a SAD operation to compute the sums of absolute differences of data entries in several consecutive columns. Since the rows in the diamond-shaped region in the register file have different numbers of columns, different numbers of columns are accessed to compute the SAD of entries in the rows. Consequently, the parallel processor has to keep track of different address offsets for various rows and columns, which complicates the performance of parallel operations.
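The two measurements can be written out as a scalar sketch. The function names and the sample 2x2 blocks are hypothetical; a SIMD processor would compute the per-entry differences for several entries in one operation rather than in a loop.

```python
def sad(current, reference):
    """Sum of Absolute Differences: L1-norm of the block difference."""
    return sum(abs(c - r)
               for cur_row, ref_row in zip(current, reference)
               for c, r in zip(cur_row, ref_row))

def ssd(current, reference):
    """Sum of Squared Differences: squared L2-norm of the block difference."""
    return sum((c - r) ** 2
               for cur_row, ref_row in zip(current, reference)
               for c, r in zip(cur_row, ref_row))

# Hypothetical 2x2 blocks of pixel values.
current_block = [[10, 12], [14, 16]]
reference_block = [[11, 12], [10, 19]]

sad_value = sad(current_block, reference_block)  # |-1| + |0| + |4| + |-3|
ssd_value = ssd(current_block, reference_block)  # 1 + 0 + 16 + 9
print(sad_value, ssd_value)
```

For these blocks the SAD is 8 and the SSD is 26. The SSD penalizes the larger per-pixel differences more heavily than the SAD does.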