Methods for encoding moving pictures or video, such as the MPEG1, MPEG2, H.261, and H.263 standards, have been developed for efficient transmission and storage. A detailed description of one such encoding method is found in MPEG2 Test Model 5, ISO/IEC JTC1/SC29/WG11/N0400, April 1993, the disclosure of which is hereby expressly incorporated herein by reference. In the described encoding method, an input video sequence is organized into a sequence layer, groups of pictures (GOP), pictures, slices, macroblocks, and finally a block layer. Each picture is coded according to its determined picture coding type. The picture coding types used include intra-coded picture (I-picture), predictive-coded picture (P-picture), and bi-directionally predictive-coded picture (B-picture).
Motion estimation/compensation, transform coding, and statistical coding are utilized to efficiently compress the input video sequence. For example, in MPEG2 Test Model 5, each picture from the input video sequence is partitioned into rows of smaller, non-overlapping macroblocks of picture elements (pixels). Macroblocks in each row may be grouped into one or more slices. The compression is performed on each macroblock on a row-by-row basis, from the leftmost macroblock to the rightmost macroblock within each row, and from the top row to the bottom row.
In the motion estimation/compensation method, motion vectors are detected for each macroblock in a picture. The coding mode for a macroblock (e.g. intra-coded, forward-predicted, backward-predicted, or interpolated) is decided based on the detected motion vectors and the determined picture coding type. The utilized motion vectors are differentially coded with variable length codes before being output.
A typical motion vector detection process comprises determining, for each macroblock to be coded, a search window consisting of pixels from a reference picture, and matching pixel values of the macroblock to blocks of pixel values obtained from the search window. This process is known to be computationally intensive. In particular, the size of the search window has a direct impact on the computation load.
Many methods of matching the pixel blocks are available, such as an exhaustive search method which compares every definable block within the search window, a logarithmic search method, a hierarchical search method, and various other possible derivations. Depending on application requirements, a search method may be selected based on its performance in terms of accuracy and computational complexity.
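By way of illustration only, the exhaustive search method may be sketched as follows. The sketch assumes the sum of absolute differences as the matching criterion, pictures held as lists of pixel rows, and a motion vector defined as the displacement from the current block position to the best-matching block in the reference picture. The function names, parameters, and test pattern are hypothetical and are not taken from any of the cited documents.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def exhaustive_search(cur, ref, bx, by, n, search_range, offset=(0, 0)):
    """Compare the current block with every candidate block whose displacement
    lies within +/- search_range of `offset`; return (best_vector, best_sad)."""
    height, width = len(ref), len(ref[0])
    best_vec, best_cost = None, None
    for dy in range(offset[1] - search_range, offset[1] + search_range + 1):
        for dx in range(offset[0] - search_range, offset[0] + search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall partly outside the reference picture.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_vec, best_cost = (dx, dy), cost
    return best_vec, best_cost


# Synthetic 16 x 16 pictures: the current picture is the reference pattern
# displaced by 2 pixels horizontally and 1 pixel vertically.
ref = [[(31 * x + 17 * y) % 256 for x in range(16)] for y in range(16)]
cur = [[(31 * (x + 2) + 17 * (y + 1)) % 256 for x in range(16)] for y in range(16)]

vec, cost = exhaustive_search(cur, ref, bx=4, by=4, n=4, search_range=3)
print(vec, cost)
```

The number of candidate blocks evaluated grows with the square of the search range, which is one way to see why the size of the search window bears directly on the computation load.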
To cater for sequences with large object movements between pictures, methods exist to increase the search range without enlarging the search window. These methods provide more accurate motion vectors for picture sequences with large movements without a large increase in computation load. One such method is the telescopic search method, in which the motion vectors of macroblocks from a previously coded or matched picture are used to generate a new search window for each current macroblock. The telescopic search method comprises the steps of obtaining a motion vector from a co-sited macroblock from a closest coded picture; optionally scaling the obtained motion vector according to the picture distances between the reference picture, the closest coded picture, and the current picture; and defining the search window based on the centre position of the current macroblock plus an offset defined by the scaled motion vector.
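The three steps of the telescopic search method may be sketched as follows. This is a minimal sketch under stated assumptions: the scaling is taken to be linear in the picture distance, the search window is taken to be a rectangle given by half-width and half-height about its centre, and clipping at picture boundaries is omitted. All names and parameters are hypothetical.

```python
def scale_vector(mv, dist_prev, dist_cur):
    """Linearly scale a vector measured over dist_prev picture intervals
    (closest coded picture to reference picture) to dist_cur picture
    intervals (current picture to reference picture). Linear scaling is an
    assumption of this sketch."""
    return (round(mv[0] * dist_cur / dist_prev),
            round(mv[1] * dist_cur / dist_prev))


def telescopic_window(mb_x, mb_y, mb_size, co_sited_mv,
                      dist_prev, dist_cur, half_w, half_h):
    """Return (left, top, right, bottom) of the search window for the
    macroblock whose top-left corner is (mb_x, mb_y)."""
    # Centre position of the current macroblock.
    cx = mb_x + mb_size // 2
    cy = mb_y + mb_size // 2
    # Offset defined by the scaled vector of the co-sited macroblock.
    sx, sy = scale_vector(co_sited_mv, dist_prev, dist_cur)
    wx, wy = cx + sx, cy + sy
    return (wx - half_w, wy - half_h, wx + half_w, wy + half_h)


# The co-sited macroblock moved (4, -2) over one picture interval; the
# current picture is two intervals away from the reference picture.
window = telescopic_window(mb_x=32, mb_y=16, mb_size=16, co_sited_mv=(4, -2),
                           dist_prev=1, dist_cur=2, half_w=24, half_h=16)
print(window)
```

Because the offset is derived separately for each macroblock, adjacent macroblocks may receive search windows at quite different locations.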
Alternative methods of determining search windows are disclosed in U.S. Pat. Nos. 5,473,379 and 5,657,087, for example. The methods disclosed therein comprise the steps of calculating a global motion vector based on the motion vectors of a previous picture, and offsetting the search windows of all macroblocks by the calculated global motion vector. The global motion vector may be determined by the mean or the median function, or by the most common motion vector of the previous picture; it can be further normalized according to the picture distances. The calculated global motion vector may then represent a global translational motion of objects from one picture to the other.
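The three ways of deriving a global motion vector mentioned above may be sketched as follows. The sketch assumes that the mean and the median are taken separately on the horizontal and vertical components, which is an illustrative choice and not necessarily what the cited patents disclose; the function name and sample vectors are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def global_motion_vector(vectors, method="median"):
    """Derive one global vector from the motion vectors of a previous
    picture. `vectors` is a list of (x, y) tuples."""
    xs = [v[0] for v in vectors]
    ys = [v[1] for v in vectors]
    if method == "mean":
        return (mean(xs), mean(ys))      # component-wise mean
    if method == "median":
        return (median(xs), median(ys))  # component-wise median
    if method == "mode":
        # Most common whole vector; ties resolve to the first one seen.
        return Counter(vectors).most_common(1)[0][0]
    raise ValueError("unknown method: " + method)


# Four macroblocks follow a slow pan; one macroblock is an outlier.
previous_vectors = [(4, 0), (4, 0), (5, 1), (4, 0), (-12, 7)]

gmv_mean = global_motion_vector(previous_vectors, "mean")
gmv_median = global_motion_vector(previous_vectors, "median")
gmv_mode = global_motion_vector(previous_vectors, "mode")
print(gmv_mean, gmv_median, gmv_mode)
```

In this example the single outlier pulls the mean away from the dominant motion, whereas the median and the most common vector are unaffected by it.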
Apparatus implementing the motion estimator may include a search engine for performing the comparison of the current macroblock pixels with candidate pixel blocks from the search window. An example implementation of a suitable search engine is a systolic array processor which calculates and compares the sum of absolute differences of the current macroblock with all candidate blocks. A search window cache is coupled with the search engine to sustain the large input data bandwidth requirement of the search engine. The search window cache is updated via DMA with the new search window for each macroblock from a slower but larger frame memory where the reference picture is stored. A programmable or fixed-function controller with the necessary RAM or ROM is used to determine the search windows, control the DMA update of the cache with the search window, and monitor the search engine for resulting motion vectors.
To minimize the bandwidth between the cache and frame memory, the search window cache is designed to maximize the overlapping area of one search window and the next. As a row of macroblocks is processed from the left to right, only the rightmost part of the search window for each macroblock is loaded into the search window cache.
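The saving obtained from overlapping search windows may be illustrated with a simple count of pixel columns. The sketch below assumes a search window extending a fixed number of pixels to either side of the macroblock, ignores picture boundaries, and models only the horizontal direction; the names and the numerical values are hypothetical.

```python
def window_columns(mb_x, mb_size, search_range, offset_x=0):
    """Pixel columns covered by the search window of the macroblock whose
    left edge is at mb_x, optionally displaced by offset_x."""
    start = mb_x + offset_x - search_range
    return range(start, start + mb_size + 2 * search_range)


def columns_to_load(prev_cols, next_cols):
    """Columns of the next window that are not already held in the cache."""
    return [c for c in next_cols if c not in prev_cols]


# Two horizontally adjacent 16-pixel macroblocks with a +/-16 search range.
prev_cols = window_columns(mb_x=16, mb_size=16, search_range=16)
next_cols = window_columns(mb_x=32, mb_size=16, search_range=16)
same_offset = columns_to_load(prev_cols, next_cols)

# The same pair when the second window is displaced a further 8 pixels,
# as may happen when each macroblock has its own window offset.
shifted_cols = window_columns(mb_x=32, mb_size=16, search_range=16, offset_x=8)
different_offset = columns_to_load(prev_cols, shifted_cols)

print(len(next_cols), len(same_offset), len(different_offset))
```

With identical offsets only one macroblock width of new columns is needed out of a window 48 columns wide; the count rises as soon as the offsets of neighbouring macroblocks differ.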
In some instances the amount of picture motion from one frame to another can be very large, particularly when the reference picture is two or more frames separated in sequence from the object picture. However, increasing the size of the search window to better match the large amount of motion would escalate the implementation complexity and power consumption.
The aforementioned telescopic search method expands the possible search range by redefining the search window location for each macroblock. This method faces problems in picture regions with uncorrelated motion, wherein the search window has to be enlarged to account for the incorrectly defined search window location. In terms of implementation, the expanded possible search range increases the search window cache size requirement and also the bandwidth requirement between the search window cache and the frame memory. This is so because the search window cache has to store data for all possible locations of the search window for the next macroblock.
Methods utilizing the global motion vector, such as those disclosed in U.S. Pat. Nos. 5,473,379 and 5,657,087, may be used to minimize the search window cache size as well as the bandwidth requirement from the frame memory while expanding the actual search range. These methods fix the offset of the search window for all macroblocks in a picture. However, given that only a single global motion vector is used to offset all of the macroblock search windows, the search range expansion works well only with pictures containing uniform translational motion. Pictures with zooming, rotational motion, and shearing effects, for example, are not well dealt with using this technique.
Finally, all of the detected and utilized motion vectors are differentially coded with variable length codes (VLC) to reduce the coding bit rate. Expanding the search range may produce larger motion vectors which require bigger VLC tables to be selected at the picture level to code the motion vectors. In turn, the bit rate for motion vector coding is increased.
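The differential coding step may be sketched as follows. The sketch assumes that each motion vector is predicted from the vector of the preceding macroblock, with the predictor starting at (0, 0); the variable length code tables themselves are not modelled, and the largest difference component is used only as a rough indication of the code range required. All names and sample values are hypothetical.

```python
def differential_code(vectors):
    """Replace each vector by its difference from the previous vector,
    with the predictor initialised to (0, 0)."""
    pred = (0, 0)
    diffs = []
    for mv in vectors:
        diffs.append((mv[0] - pred[0], mv[1] - pred[1]))
        pred = mv
    return diffs


def differential_decode(diffs):
    """Inverse of differential_code."""
    pred = (0, 0)
    vectors = []
    for d in diffs:
        pred = (pred[0] + d[0], pred[1] + d[1])
        vectors.append(pred)
    return vectors


def max_magnitude(values):
    """Largest absolute component among a list of (x, y) tuples."""
    return max(max(abs(x), abs(y)) for x, y in values)


# Three similar vectors followed by one much larger vector, such as an
# expanded search range might produce.
row_vectors = [(4, 0), (5, 0), (5, 1), (20, -3)]
row_diffs = differential_code(row_vectors)
print(row_diffs, max_magnitude(row_diffs))
```

Similar neighbouring vectors yield small differences, while an isolated large vector yields a large difference that the selected code tables must still be able to represent.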