The essential feature of current video compression standards is motion prediction. An example is the H.264 standard (MPEG-4 Part 10), which uses variable block size motion prediction. The main idea of motion prediction is to track changes between successive frames and to store only the changes rather than full frames. A typical motion prediction process can be briefly described as follows:
1. the current frame is partitioned into blocks of pixels;
2. for each block, a search for the best matching block is performed in the reference (previous) frame; the position of the best matching block is represented by a motion vector;
3. a predicted frame is constructed from blocks fetched from the reference frame at the positions indicated by the motion vectors;
4. the predicted frame is subtracted from the current frame, generating a prediction error;
5. the result of motion prediction is the set of motion vectors and the prediction error.
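The steps listed above can be illustrated with a minimal sketch, assuming the motion vectors have already been found (step 2) and that the frame dimensions are multiples of the block size. The function name `motion_compensate`, the dictionary layout of `vectors`, and the row/column meaning of (u, v) are assumptions made for illustration only; they do not come from any standard.

```python
def motion_compensate(current, reference, vectors, m, n):
    """Return (predicted, residual) for frames given as lists of pixel rows.

    vectors maps each block's top-left corner (top, left) to its motion
    vector (u, v); u is taken as a row offset and v as a column offset
    into the reference frame (an assumed convention).
    """
    rows, cols = len(current), len(current[0])
    predicted = [[0] * cols for _ in range(rows)]
    # Step 1: partition the current frame into m x n blocks.
    for top in range(0, rows, m):
        for left in range(0, cols, n):
            # Step 2 (the search) is assumed done; look up its result.
            u, v = vectors[(top, left)]
            # Step 3: fetch the reference block the vector points at.
            for i in range(m):
                for j in range(n):
                    predicted[top + i][left + j] = reference[top + i + u][left + j + v]
    # Step 4: prediction error = current frame minus predicted frame.
    residual = [[current[r][c] - predicted[r][c] for c in range(cols)]
                for r in range(rows)]
    # Step 5: the vectors (input) and the residual are what gets stored.
    return predicted, residual


# Tiny 2 x 4 frame split into two 2 x 2 blocks.
reference = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
current = [[2, 3, 2, 4],
           [6, 7, 6, 7]]
vectors = {(0, 0): (0, 1), (0, 2): (0, -1)}
predicted, residual = motion_compensate(current, reference, vectors, 2, 2)
```

In this toy case only one pixel differs from the prediction, so the residual is almost entirely zero, which is what makes storing vectors plus prediction error cheaper than storing the full frame.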
Determining the motion vectors is the most computationally intensive task. The process of searching for motion vectors is usually referred to as motion estimation. Let the block of the current frame that is to be predicted be called the current block, and let a candidate block for prediction, fetched from the reference frame, be called a reference block. The full search block matching method for motion estimation is then, in most cases, described as examining all possible reference block positions in a predetermined search area of the reference frame and choosing the position that yields the minimum prediction error. In other words, the current block is a template block that is compared with a number of candidate blocks (reference blocks), which are obtained by moving a sliding window over the search area. Every position represents a candidate motion vector; the best matching position, with the minimal prediction error, represents the final motion vector. Several search strategies may be used to reduce the number of search iterations; nevertheless, a full search provides the best prediction result and the minimal error within the search area. Conventional block matching measures are the Sum of Squared Differences (SSD) and the Sum of Absolute Differences (SAD) between the pixels of the current block and the reference block.
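The full search described above can be sketched in a few lines. This is a minimal illustration, not an optimized implementation: the names `sad`, `ssd` and `full_search` are hypothetical, and the vector (u, v) is taken here as the offset of the sliding window from the top-left corner of the search area.

```python
def sad(block, area, u, v):
    """Sum of Absolute Differences between block and the window at (u, v)."""
    return sum(abs(block[i][j] - area[i + u][j + v])
               for i in range(len(block)) for j in range(len(block[0])))


def ssd(block, area, u, v):
    """Sum of Squared Differences between block and the window at (u, v)."""
    return sum((block[i][j] - area[i + u][j + v]) ** 2
               for i in range(len(block)) for j in range(len(block[0])))


def full_search(block, area, measure=sad):
    """Try every window position in the search area; keep the best one."""
    m, n = len(block), len(block[0])
    best_vector, best_error = None, None
    for u in range(len(area) - m + 1):          # vertical window positions
        for v in range(len(area[0]) - n + 1):   # horizontal window positions
            error = measure(block, area, u, v)
            if best_error is None or error < best_error:
                best_vector, best_error = (u, v), error
    return best_vector, best_error


# A 2 x 2 current block searched for in a 4 x 4 search area.
search_area = [[0, 0, 0, 0],
               [0, 9, 8, 0],
               [0, 7, 6, 0],
               [0, 0, 0, 0]]
current_block = [[9, 8],
                 [7, 6]]
vector, error = full_search(current_block, search_area)
```

The two nested loops over window positions, each calling a measure that itself loops over every pixel of the block, are what make the full search expensive.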
For high-resolution images, the computational complexity is higher still. One way to achieve real-time performance is a hardware implementation of the algorithm. For this purpose, the SAD measure is employed because it consumes fewer hardware resources than SSD, which requires multiplications. The SAD measure for full search motion estimation is described by the following equation:
$$S(u, v) = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left| x(i, j) - h(i + u, j + v) \right| + \lambda R(u, v), \qquad (1)$$
where x(i, j) are the pixels of the current block, h(i, j) are the pixels of the search area (in the reference frame), m and n are the block dimensions, u and v are the motion vector components, R(u, v) is a function representing the vector coding efficiency, λ is a Lagrange multiplier, and S(u, v) is the SAD value for vector (u, v). As is evident from equation (1), the base operation consists of a subtraction, taking the absolute value, and an addition. To meet performance objectives, a hardware implementation needs to be parallel and perform a number of operations at a time. The higher the performance, the larger the search area that may be used. Parallel processing consumes a lot of resources; therefore, the main issue in a hardware implementation of a motion estimation algorithm is the design of a low-cost application-specific parallel processor.
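The effect of the λR(u, v) term in equation (1) can be shown with a small sketch. The text does not specify R(u, v), so `rate_proxy` below is a purely hypothetical stand-in (|u| + |v|) that makes longer vectors more expensive. The names `cost` and `best_vector` are also assumptions, as is the parameter `pad`: the search area is assumed to extend `pad` pixels beyond the co-located block on every side, so that signed vectors (u, v) in [-pad, pad] can be used.

```python
def rate_proxy(u, v):
    """Hypothetical stand-in for R(u, v): longer vectors cost more to code."""
    return abs(u) + abs(v)


def cost(x, h, u, v, lam, pad):
    """S(u, v) from equation (1): SAD plus lam times the rate term.

    h is the search area; the co-located block starts at h[pad][pad]
    (an assumed layout that allows signed vector components).
    """
    m, n = len(x), len(x[0])
    distortion = sum(abs(x[i][j] - h[i + u + pad][j + v + pad])
                     for i in range(m) for j in range(n))
    return distortion + lam * rate_proxy(u, v)


def best_vector(x, h, lam, pad):
    """Full search over all signed vectors within +/- pad."""
    candidates = [(u, v) for u in range(-pad, pad + 1)
                  for v in range(-pad, pad + 1)]
    return min(candidates, key=lambda uv: cost(x, h, uv[0], uv[1], lam, pad))


# 2 x 2 current block, 4 x 4 search area (pad = 1).
x = [[5, 5],
     [5, 5]]
h = [[5, 5, 0, 0],
     [5, 5, 5, 0],
     [0, 5, 4, 0],
     [0, 0, 0, 0]]
pure_sad_choice = best_vector(x, h, lam=0, pad=1)
rate_aware_choice = best_vector(x, h, lam=1, pad=1)
```

With λ = 0 the search picks the vector with the exact match; with λ = 1 it prefers the zero vector, whose SAD is slightly worse but whose assumed coding cost is lower. Every candidate still requires m × n subtract, absolute-value and add operations, which is the workload a parallel hardware implementation has to absorb.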