Numerous approaches have been proposed to achieve more accurate motion compensation by providing different predictions for different regions in a macroblock. Examples include the techniques used in the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”) and the hierarchical quadtree (QT) approach.
In these approaches, a macroblock is split into smaller blocks, and a search is performed to find the best match for each block. As the number of blocks in a macroblock increases, the signaling overhead increases while the distortion between the original macroblock and the matching macroblock decreases. There is therefore a point of minimum rate-distortion cost, and the best block mode is typically selected with a Lagrangian rate-distortion criterion.
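The trade-off described above can be illustrated with a minimal sketch of Lagrangian mode selection, assuming the common cost form J = D + λR (distortion plus the multiplier times the rate in bits). The function name `best_mode`, the mode labels, and all numeric values are hypothetical and chosen only for illustration; they are not taken from any standard or encoder.

```python
def best_mode(candidates, lam):
    """Return the (mode_name, cost) pair that minimizes J = D + lam * R.

    candidates: iterable of (mode_name, distortion, rate_bits) tuples.
    lam: Lagrangian multiplier trading rate against distortion.
    """
    best_name, best_cost = None, float("inf")
    for name, distortion, rate_bits in candidates:
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Hypothetical measurements: finer partitions lower the distortion
# but cost more bits for motion vectors and mode signaling.
candidates = [
    ("16x16", 1200.0, 20),
    ("8x8", 700.0, 60),
    ("4x4", 500.0, 200),
]

mode, cost = best_mode(candidates, lam=5.0)
```

With these assumed numbers, a multiplier of zero favors the finest partition because only distortion counts, while a large multiplier favors the coarsest partition because overhead dominates the cost.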
To improve the matching capability beyond that of the square or rectangular block shapes used in the quadtree approach, a geometry based approach (GEO) has been proposed. In the geometry based approach, a block is split into two smaller regions, called wedges, by a line described by slope and translation parameters. The best parameters and the matching wedges are searched jointly. Although the geometry based approach captures object boundaries better than the quadtree approach, it is still limited to a straight line segmentation or partition.
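A wedge partition of this kind can be sketched as follows. The sketch assumes the dividing line is written as x·cos(a) + y·sin(a) = offset, with coordinates measured from the block center, so that the angle plays the role of the slope parameter and the offset plays the role of the translation parameter. This parameterization and the names `wedge_mask` and `compose_prediction` are assumptions made for illustration, not the parameterization of any particular proposal.

```python
import math


def wedge_mask(size, angle_deg, offset):
    """Label each pixel of a size x size block 0 or 1 by line side.

    The line is x*cos(a) + y*sin(a) = offset, where (x, y) is the pixel
    position relative to the block center and a is angle_deg in radians.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    half = (size - 1) / 2.0
    mask = []
    for row in range(size):
        y = row - half
        mask.append(
            [1 if (col - half) * c + y * s >= offset else 0 for col in range(size)]
        )
    return mask


def compose_prediction(mask, pred0, pred1):
    """Build one block prediction by taking pred0 in wedge 0 and pred1 in wedge 1."""
    return [
        [p1 if m else p0 for m, p0, p1 in zip(mask_row, row0, row1)]
        for mask_row, row0, row1 in zip(mask, pred0, pred1)
    ]


# A vertical split through the center of a 4x4 block.
mask = wedge_mask(4, angle_deg=0, offset=0)
pred0 = [[10] * 4 for _ in range(4)]   # hypothetical predictor for wedge 0
pred1 = [[200] * 4 for _ in range(4)]  # hypothetical predictor for wedge 1
prediction = compose_prediction(mask, pred0, pred1)
```

In an encoder, the angle and offset would be searched together with a separate motion vector for each wedge; the sketch only shows how a given line assigns pixels to the two wedges.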
An object based motion segmentation method has been proposed to solve the occlusion problem. In accordance with the object based motion segmentation method, motion vectors from neighboring blocks are copied after block segmentation in order to capture different motions in a block. To avoid transmitting segmentation information, previously encoded frames at time (t−1) and time (t−2) are used to estimate segmentation for the current frame at time (t).
Motion-compensated predictive coding (MCPC) is the technique that has been found to be the most successful for exploiting inter-frame correlations. In a motion-compensated predictive coding scheme, the difference between the original input frame and the prediction from decoded frames is coded. This difference frame is usually known as the prediction error frame.
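The prediction error can be sketched for a single block as follows. The helper names (`motion_compensate`, `residual`, `energy`), the tiny 4x4 reference frame, and the pixel values are hypothetical and serve only to show that a well-matched motion vector yields a residual with less energy to code.

```python
def motion_compensate(ref, top, left, mv, size):
    """Copy the size x size block displaced by mv = (dy, dx) from the reference frame."""
    dy, dx = mv
    return [
        [ref[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def residual(cur_block, pred_block):
    """Pixel-wise difference between the current block and its prediction."""
    return [
        [a - b for a, b in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(cur_block, pred_block)
    ]


def energy(block):
    """Sum of squared values, a simple measure of prediction error energy."""
    return sum(v * v for row in block for v in row)


# Hypothetical decoded reference frame.
ref = [
    [10, 10, 10, 10],
    [10, 50, 90, 10],
    [10, 50, 90, 10],
    [10, 10, 10, 10],
]
# Current 2x2 block at (top=1, left=1); its content sits one pixel to the right in ref.
cur = [[90, 10], [90, 10]]

good = residual(cur, motion_compensate(ref, 1, 1, (0, 1), 2))  # matching vector
zero = residual(cur, motion_compensate(ref, 1, 1, (0, 0), 2))  # zero vector
```

Here the matching vector leaves an all-zero residual, whereas the zero vector leaves a residual with substantial energy that would have to be transformed and coded.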
The purpose of employing predictions is to reduce the energy of the prediction error frames so that the prediction error frames have lower entropy after transformation and can therefore be coded with a lower bit rate. One of the major design challenges in video compression is how to enhance the quality of prediction or, in other words, to have predictors that are as close to the current signal as possible.
In current block based motion compensation or disparity compensation, fixed size rectangular blocks limit the capability to find better predictors for the original block content, which can have an arbitrary shape. Block based search approaches find a match for the dominant part of a block, so occluded objects are not well predicted. In terms of prediction accuracy, an optimal method would segment the original block into different objects and search for a match for each segment. However, this requires the encoder to transmit segment information to the decoder, and this extra overhead outweighs the benefit of the enhanced predictor.