Image signals have a high correlation between neighboring screens (frames). To increase the compression efficiency of image signals, the redundant information existing on the time axis must be reduced. In particular, because motion estimation between neighboring screens of image data requires a large amount of computation, many studies have been made on algorithms and hardware structures for it.
A conventional method that uses a memory is shown in FIG. 1. This method implements a motion estimation algorithm in VLSI. However, because the method requires additional memory, it has the problems of a large chip area and high power consumption.
In other words, the conventional motion estimation apparatus consists of a block 101 for receiving previous images of a search region from an external memory, a block 102 for receiving images of a current reference block from an external memory, a plurality of processing elements (PE) each for obtaining a sum of absolute differences (SAD) between the two input values, a comparator 103 for obtaining the motion vector corresponding to the minimum of the output values from the plurality of processing elements, and an address generator 104 for generating addresses for the next stage. The plurality of processing elements operate in parallel, and each processing element evaluates a motion vector at a different point, that is, in a different search region.
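The cooperation of the processing elements and the comparator can be illustrated in software. The following is a minimal sketch, not the circuit of FIG. 1: it assumes each candidate position in the search region plays the role of one processing element, and that the comparator keeps the first candidate in raster order when several candidates share the minimum. The names `sad` and `full_search` are hypothetical and do not come from the description.

```python
from typing import List, Tuple

Block = List[List[int]]


def sad(ref: Block, search: Block, top: int, left: int) -> int:
    """Sum of absolute differences between the reference block and the
    candidate block whose top-left corner is (top, left) in the search region."""
    return sum(
        abs(ref[y][x] - search[top + y][left + x])
        for y in range(len(ref))
        for x in range(len(ref[0]))
    )


def full_search(ref: Block, search: Block) -> Tuple[Tuple[int, int], int]:
    """Return ((top, left), sad_value) of the best-matching candidate.

    Assumption: every candidate position is evaluated, one per 'processing
    element'; in hardware these evaluations would run in parallel.
    """
    n_rows = len(search) - len(ref) + 1
    n_cols = len(search[0]) - len(ref[0]) + 1
    candidates = [
        ((top, left), sad(ref, search, top, left))
        for top in range(n_rows)
        for left in range(n_cols)
    ]
    # Comparator stage: min() keeps the first minimum in raster order.
    return min(candidates, key=lambda c: c[1])


if __name__ == "__main__":
    # 4x4 search region with distinct values; the 2x2 reference block is
    # copied from position (1, 2), so that position has a SAD of zero.
    region = [[y * 4 + x for x in range(4)] for y in range(4)]
    reference = [row[2:4] for row in region[1:3]]
    print(full_search(reference, region))
```

The position returned is expressed in search-region coordinates; converting it to a displacement relative to the reference block position depends on how the search region is aligned, which the sketch leaves out.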
That is, in the prior art, current image data and previous image data are stored in respective buffers (memories) for the motion search, and these buffers serve as the inputs of the processing elements (PE). However, because three memories must be used, there are problems in that a large amount of computation and a large amount of hardware are required.
The motion estimation module adopts a two-step hierarchical search algorithm. In the first step of the two-step hierarchical search, the module performs the motion search on pixel data that has undergone a ¼ sampling operation. The module performs motion estimation by receiving data of a reference block and data of a search region from the external memory.
In the present invention, reference block data within the current image, for which a motion vector is to be obtained, and the corresponding search region data within the reproduced previous image are stored in a reference block data memory and a search region data memory, respectively. A motion search in units of two pixels is performed using the reference block data and the search region data stored in these memories, so that a motion vector in units of two pixels is obtained. At this time, the reference block data and the search region data are used after 2:1 sampling in each of the horizontal direction and the vertical direction, and the search range is −7 to +7. The motion search structure consists of a memory for storing a reference block (8×8) of the current image and a memory (24×8) for storing a search region of the reproduced previous image. The structure further includes a processing element (PE) array block for obtaining the SAD (sum of absolute differences) of each candidate block within the search region, and a block for obtaining the motion vector having the smallest SAD among the candidate SADs. When the two-step search algorithm of the motion estimation of the present invention is implemented in hardware, a large data bandwidth for the reference memory and a large memory are required. To handle the down-sampling and the bandwidth of the reference memory, the structure downloads a slice from the external memory in advance, before the pipeline operates. In actual pipeline operation, the structure is implemented with ⅓ of the bandwidth. Also, because the structure has independent memories, it can operate even at a low frequency without degrading performance.
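The first-step search on 2:1 sampled data can be sketched in software as follows. This is an illustration under stated assumptions, not the hardware structure described above: the sampling is assumed to be plain decimation without filtering (2:1 in each direction keeps one pixel in four), the −7 to +7 range is assumed to be counted in decimated pixels, the search area is assumed to be centred on the reference block position, and a square search area is used for simplicity rather than the 24×8 memory organization. The names `decimate_2to1`, `sad_at` and `coarse_search` are hypothetical.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def decimate_2to1(img: Image) -> Image:
    """Keep every second pixel horizontally and vertically (assumed: no filtering)."""
    return [row[::2] for row in img[::2]]


def sad_at(ref: Image, area: Image, top: int, left: int) -> int:
    """Sum of absolute differences for the candidate block at (top, left)."""
    return sum(
        abs(ref[y][x] - area[top + y][left + x])
        for y in range(len(ref))
        for x in range(len(ref[0]))
    )


def coarse_search(ref_full: Image, area_full: Image,
                  search_range: int = 7) -> Tuple[int, int]:
    """Search on decimated data and return (dy, dx) in full-resolution pixels.

    Because the search runs on 2:1 sampled data, the returned vector is
    always a multiple of two pixels.
    """
    ref = decimate_2to1(ref_full)
    area = decimate_2to1(area_full)
    # Zero displacement is assumed to be the centre of the decimated area.
    cy = (len(area) - len(ref)) // 2
    cx = (len(area[0]) - len(ref[0])) // 2
    best: Optional[Tuple[int, int, int]] = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = cy + dy, cx + dx
            # Skip candidates that would fall outside the stored area.
            if (top < 0 or left < 0
                    or top + len(ref) > len(area)
                    or left + len(ref[0]) > len(area[0])):
                continue
            cost = sad_at(ref, area, top, left)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    assert best is not None  # the zero displacement is always a valid candidate
    return (2 * best[1], 2 * best[2])


if __name__ == "__main__":
    # 48x48 area with distinct values; the 16x16 reference block is copied
    # from a position displaced by (+4, -6) full-resolution pixels from centre.
    area_full = [[y * 48 + x for x in range(48)] for y in range(48)]
    ref_full = [row[10:26] for row in area_full[20:36]]
    print(coarse_search(ref_full, area_full))
```

With these assumptions a 16×16 block becomes the 8×8 reference block and a 48-pixel-wide area becomes 24 pixels wide, which matches the memory dimensions quoted above in width only; the row-by-row organization of the 24×8 memory is not modelled.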