Sub-pel motion compensation is widely used in current video encoders and decoders. For example, in the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”), motion compensation with up to quarter-pel precision is used. Such a scheme is referred to herein as the “first prior art approach”. Turning to FIG. 1, the upsampling of a frame by a factor of 4 (for ¼-pel vectors) in accordance with the MPEG-4 AVC Standard is indicated generally by the reference numeral 100. The upsampling involves first applying a 6-tap Wiener filter for half-pel generation and then applying a bilinear filter for quarter-pel generation.
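The two-stage upsampling described above can be illustrated with a minimal one-dimensional sketch. It assumes the luma half-pel taps (1, -5, 20, 20, -5, 1)/32 of the MPEG-4 AVC Standard, 8-bit samples, and edge replication at the row boundaries. The function names are hypothetical, and the two-dimensional case (which filters intermediate values for the central half-pel position) is deliberately omitted.

```python
def clip255(value):
    """Clamp a filtered value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(row, i):
    """Half-pel sample between row[i] and row[i + 1] via the 6-tap filter."""
    taps = (1, -5, 20, 20, -5, 1)
    last = len(row) - 1
    acc = 0
    for k, tap in enumerate(taps):
        j = min(max(i - 2 + k, 0), last)  # replicate edge samples (assumption)
        acc += tap * row[j]
    return clip255((acc + 16) >> 5)  # divide by 32 with rounding


def quarter_pel(a, b):
    """Quarter-pel sample as the rounded bilinear average of two neighbors."""
    return (a + b + 1) >> 1


def upsample_row_x4(row):
    """Upsample one row by 4: full, quarter, half, quarter, full, ..."""
    out = []
    for i in range(len(row) - 1):
        full, nxt = row[i], row[i + 1]
        half = half_pel(row, i)
        out += [full, quarter_pel(full, half), half, quarter_pel(half, nxt)]
    out.append(row[-1])
    return out
```

For a row of n full-pel samples the sketch returns 4(n - 1) + 1 samples, which makes the growth in stored data at quarter-pel precision easy to see.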
A second prior art approach proposed by the Video Coding Experts Group (VCEG) involves using ⅛-pel compensation to further improve coding efficiency for sequences with aliasing artifacts. In addition to fixed interpolation filters, adaptive interpolation schemes have been considered in order to better handle aliasing, quantization and motion estimation errors, camera noise, and so forth. An adaptive interpolation scheme estimates the interpolation filter coefficients on the fly for each sub-pel position to increase the coding efficiency. Given the complexity of these interpolation schemes, it does not make sense to interpolate all the reference frames and store the interpolated frames with sub-pel precision at the decoder, since only a few sub-pel positions have to be interpolated. Storing every interpolated frame would likely result in high memory consumption and high computational complexity at the decoder. One way of performing motion compensation on the fly at the decoder is the approach used in the Key Technology Area (KTA) software improvements over the MPEG-4 AVC Standard.
Template matching prediction (TMP) is a technique used to gain coding efficiency for both inter and intra prediction by avoiding transmission of motion/displacement information (motion vectors, reference indices, and displacement vectors). Template matching prediction is based on the assumption that video pictures contain many repetitive patterns. Hence, template matching searches for similar patterns in the decoded video pictures by matching the neighboring pixels. The final prediction is, in general, the average of several best matches. Template matching can be used in both inter and intra prediction. However, the disadvantage of template matching prediction is that the same search has to be performed at both the encoder and the decoder. Thus, template matching prediction can significantly increase the decoder complexity.
Template Matching Prediction in Inter Prediction
Template matching prediction in inter prediction is one way to predict target pixels without sending motion vectors. Given a target block of a frame, a target pixel in the block is determined by finding an optimum pixel from a set of reference samples, where the adjacent pixels of the optimum pixel have the highest correlation with those of the target pixel. The adjacent pixels of the target pixels are called the template. In the prior art, the template is usually taken from the reconstructed surrounding pixels of the target pixels. Turning to FIG. 2, an example of a template matching prediction scheme for inter prediction is indicated generally by the reference numeral 200. The template matching prediction scheme 200 involves a reconstructed reference frame 210 having a search region 211, a prediction 212 within the search region 211, and a neighborhood 213 with respect to the prediction 212. The template matching prediction scheme 200 also involves a current frame 250 having a target block 251, a template 252 with respect to the target block 251, and a reconstructed region 253. In the case of inter prediction, the template matching process can be seen as a motion vector search at the decoder side. Here, template matching is performed very similarly to traditional motion estimation techniques. Namely, motion vectors are evaluated by calculating a cost function for correspondingly displaced template-shaped regions in the reference frames. The best motion vector for the template is then used to predict the target area. Only those areas of the image where a reconstruction or at least a prediction signal already exists are accessed for the search. Thus, the decoder is able to execute the template matching process and predict the target area without additional side information.
Template matching can predict pixels in a target block without transmission of motion vectors. It is expected that the prediction performance of template matching prediction is comparable to that of the traditional block matching scheme if the correlation between a target block and its template is high. In the prior art, the template is taken from the reconstructed spatial neighboring pixels of the target pixels. The neighboring pixels sometimes have low correlations with the target pixels. Thus, the performance of template matching prediction can be lower than that of the traditional block matching scheme.
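The decoder-side search described above can be sketched as follows. This is an illustrative, full-pel-only sketch rather than the method of any particular codec: it assumes an inverse-L template of thickness t above and to the left of the target block, a sum of absolute differences (SAD) as the cost function, and frames stored as lists of rows. All function and parameter names are hypothetical.

```python
def template_coords(y, x, bsize, t):
    """Inverse-L template: t rows above and t columns left of block (y, x)."""
    top = [(r, c) for r in range(y - t, y) for c in range(x - t, x + bsize)]
    left = [(r, c) for r in range(y, y + bsize) for c in range(x - t, x)]
    return top + left


def template_sad(cur, ref, coords, dy, dx):
    """SAD between the current template and the displaced region in ref."""
    return sum(abs(cur[r][c] - ref[r + dy][c + dx]) for r, c in coords)


def tmp_inter_predict(cur, ref, y, x, bsize=4, t=2, search=4):
    """Return the best displacement (dy, dx) and the predicted block.

    Only reconstructed pixels of cur (the template) are read, so a decoder
    can repeat the same search without receiving a motion vector.
    Assumes y >= t and x >= t so the template lies inside the frame.
    """
    height, width = len(ref), len(ref[0])
    coords = template_coords(y, x, bsize, t)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements whose template or block leaves the frame.
            if y - t + dy < 0 or x - t + dx < 0:
                continue
            if y + bsize + dy > height or x + bsize + dx > width:
                continue
            cost = template_sad(cur, ref, coords, dy, dx)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    _, dy, dx = best
    pred = [[ref[y + dy + r][x + dx + c] for c in range(bsize)]
            for r in range(bsize)]
    return (dy, dx), pred
```

Because the target block itself is never read, an encoder and a decoder running this search on the same reconstructed data arrive at the same displacement.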
Template Matching Prediction in Intra Prediction
In intra prediction, template matching is one of the available non-local prediction approaches, since the prediction can be generated from pixels far away from the target block. In intra template matching, the template definition is similar to that in inter template matching. However, one difference is that the search range is limited to the decoded part of the current picture. Turning to FIG. 3, an example of a template matching prediction scheme for intra prediction is indicated generally by the reference numeral 300. The template matching prediction scheme 300 involves a decoded part 310 of a picture 377. The decoded part 310 of the picture 377 has a search region 311, a candidate prediction 312 within the search region 311, and a neighborhood 313 with respect to the candidate prediction 312. The template matching prediction scheme 300 also involves an un-decoded part 320 of the picture 377. The un-decoded part 320 of the picture 377 has a target block 321 and a template 322 with respect to the target block 321. For simplicity, the following description is based on intra template matching. However, one of ordinary skill in this and related arts will appreciate that the description can be readily extended to the inter template matching counterpart.
A problem associated with template matching prediction at the decoder is that, because template matching requires a search at the decoder and that search is performed without any constraints, there is a need to perform sub-pel interpolation for all of the reference frames and to store the interpolated frames with sub-pel precision at the decoder, despite the fact that only a few sub-pel positions have to be interpolated. This may add considerable complexity, including, for example, memory and computational complexity at the decoder.
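The restriction of the search to the decoded part of the picture, together with the averaging of several best matches, can be sketched as follows. The sketch assumes that blocks of size bsize are decoded in raster order, that the template is an inverse-L of thickness t, and that SAD is the matching cost. The helper names and the choice of k are hypothetical.

```python
def l_template(y, x, bsize, t):
    """Inverse-L template: t rows above and t columns left of block (y, x)."""
    top = [(r, c) for r in range(y - t, y) for c in range(x - t, x + bsize)]
    left = [(r, c) for r in range(y, y + bsize) for c in range(x - t, x)]
    return top + left


def is_decoded(cy, cx, y, x, bsize):
    """True if block (cy, cx) lies fully in the part decoded before (y, x).

    Assumes raster-order decoding of bsize-by-bsize blocks: everything above
    row y is decoded, plus the blocks to the left in the current block row.
    """
    return cy + bsize <= y or (cy <= y and cx + bsize <= x)


def tmp_intra_predict(pic, y, x, bsize=4, t=1, k=2):
    """Average the k decoded candidates whose templates best match the target.

    Assumes y >= t, x >= t, and that at least one decoded candidate exists.
    The target block's own pixels are never read.
    """
    width = len(pic[0])
    target = [pic[r][c] for r, c in l_template(y, x, bsize, t)]
    candidates = []
    for cy in range(t, y + 1):
        for cx in range(t, width - bsize + 1):
            if not is_decoded(cy, cx, y, x, bsize):
                continue
            cand = [pic[r][c] for r, c in l_template(cy, cx, bsize, t)]
            cost = sum(abs(a - b) for a, b in zip(target, cand))
            candidates.append((cost, cy, cx))
    best = sorted(candidates)[:k]
    n = len(best)
    pred = [[(sum(pic[cy + r][cx + c] for _, cy, cx in best) + n // 2) // n
             for c in range(bsize)] for r in range(bsize)]
    return best, pred
```

Swapping the decoded-region test for a bounds check on a reference frame would turn the same loop into an inter search, which is why the two cases are usually described together.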
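The memory cost of storing fully interpolated reference frames can be estimated with a short calculation. The sketch below counts luma samples only and assumes one byte per sample with no padding. The frame size and function name are illustrative assumptions, not values taken from this description.

```python
def interpolated_frame_bytes(width, height, pel_denominator, bytes_per_sample=1):
    """Bytes needed to store one luma frame upsampled to 1/pel_denominator-pel.

    Upsampling by pel_denominator in each dimension multiplies the sample
    count by pel_denominator squared.
    """
    return (width * pel_denominator) * (height * pel_denominator) * bytes_per_sample


# Illustrative 1920x1080 luma frame (assumed size).
full_pel = interpolated_frame_bytes(1920, 1080, 1)     # 2,073,600 bytes
quarter_pel = interpolated_frame_bytes(1920, 1080, 4)  # 16 times full_pel
eighth_pel = interpolated_frame_bytes(1920, 1080, 8)   # 64 times full_pel
```

Under these assumptions, each stored reference frame grows by a factor of 16 at quarter-pel precision and 64 at ⅛-pel precision, and the total scales with the number of reference frames kept at the decoder.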