Various prediction methods have been proposed to improve intra prediction. Recently, some non-local approaches to intra prediction have been introduced, such as displaced intra prediction (DIP) and template matching prediction (TMP), which achieve acceptable coding efficiency. Non-local picture data prediction techniques are those that model or predict current data as a function of decoded data already available at the decoder from other regions of the picture being encoded. A similarity between the displaced intra prediction and template matching prediction approaches is that they both search the previously encoded intra regions of the current picture being coded (i.e., they use the current picture as a reference) and find the best prediction according to some coding cost, by performing, for example, region matching and/or auto-regressive template matching. A difference between the two approaches is that displaced intra prediction is a forward prediction approach where a displacement vector is explicitly coded in the bitstream, while template matching prediction is a backward prediction approach where a template is used to infer the displacement vector. One problem with such approaches is that, because they directly measure the intensity difference as the similarity criterion, they cannot handle the mismatch caused by non-uniform illumination within the picture. The illumination variation can be caused by non-uniform lighting, object geometry change, or even material characteristic variation, and is often encountered in natural video sequences. Indeed, two structurally similar picture patches may have significantly different brightness properties due to illumination variation. Non-local prediction approaches cannot always model changes in picture features such as contrast and brightness using non-local information.
In the presence of non-uniform illumination effects, non-local data forms an incomplete information set which on its own is insufficient to efficiently represent the signal to be predicted. In this case, even if exactly the same structural signal pattern is found in the already decoded picture data, the mismatch between the prediction and the original signal will generate a large residual that may require a significant number of bits to code.
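The effect described above can be illustrated with a small sketch. It assumes a sum of absolute differences (SAD) as the intensity-based similarity measure and models the illumination change as a constant brightness offset; the helper names and the mean-removal step are hypothetical and serve only to show that the structure of the two patches is identical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D patches."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def remove_mean(patch):
    """Subtract the patch mean (hypothetical step, used only for illustration)."""
    count = sum(len(row) for row in patch)
    mean = sum(sum(row) for row in patch) / count
    return [[x - mean for x in row] for row in patch]


# Two patches with the same structure; the second is uniformly 50 levels brighter.
original = [[10, 20], [30, 40]]
brighter = [[x + 50 for x in row] for row in original]

# Direct intensity comparison reports a large mismatch.
raw_cost = sad(original, brighter)

# After removing each patch's mean, the structural match becomes visible.
compensated_cost = sad(remove_mean(original), remove_mean(brighter))

print(raw_cost, compensated_cost)
```

Under these assumptions the raw cost is dominated entirely by the brightness offset, which is the residual that would otherwise have to be coded.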
MPEG-4 AVC Standard Intra Prediction
The International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”) is the first video coding standard which employs spatial directional prediction for intra coding. The MPEG-4 AVC Standard provides a more flexible prediction framework, and thus higher coding efficiency than previous standards in which intra prediction was done in the transform domain. In the MPEG-4 AVC Standard, spatial intra prediction is formed using surrounding available samples, which are previously reconstructed samples available at the decoder within the same slice. For luma samples, intra prediction can be performed on a 4×4 block basis (denoted as Intra_4×4), an 8×8 block basis (denoted as Intra_8×8), and a 16×16 macroblock basis (denoted as Intra_16×16). Turning to FIG. 1A, MPEG-4 AVC Standard directional intra prediction with respect to a 4×4 block basis (Intra_4×4) is indicated generally by the reference numeral 100. Prediction directions are generally indicated by the reference numeral 110, image blocks are generally indicated by the reference numeral 120, and a current block is indicated by the reference numeral 130. In addition to luma prediction, a separate chroma prediction is conducted. There are a total of nine prediction modes for Intra_4×4 and Intra_8×8, four modes for Intra_16×16, and four modes for the chroma component. The encoder typically selects the prediction mode that minimizes the cost for coding the current block. A further intra coding mode, I_PCM, allows the encoder to simply bypass the prediction and transform coding processes. I_PCM allows the encoder to precisely represent the values of the samples and place an absolute limit on the number of bits that may be included in a coded macroblock without constraining the decoded image quality.
Turning to FIG. 2, labeling of prediction samples for the Intra_4×4 mode of the MPEG-4 AVC Standard is indicated generally by the reference numeral 200. FIG. 2 shows the samples (in capital letters A-M) above and to the left of the current block which have been previously coded and reconstructed and are therefore available at the encoder and decoder to form the prediction.
Turning to FIGS. 3B-J, Intra_4×4 luma prediction modes of the MPEG-4 AVC Standard are indicated generally by the reference numeral 300. The samples a, b, c, ..., p of the prediction block are calculated based on the samples A-M using the Intra_4×4 luma prediction modes 300. The arrows in FIGS. 3B-J indicate the direction of prediction for each of the Intra_4×4 modes 300. The Intra_4×4 luma prediction modes 300 include modes 0-8, with mode 0 (FIG. 3B, indicated by reference numeral 310) corresponding to a vertical prediction mode, mode 1 (FIG. 3C, indicated by reference numeral 311) corresponding to a horizontal prediction mode, mode 2 (FIG. 3D, indicated by reference numeral 312) corresponding to a DC mode, mode 3 (FIG. 3E, indicated by reference numeral 313) corresponding to a diagonal down-left mode, mode 4 (FIG. 3F, indicated by reference numeral 314) corresponding to a diagonal down-right mode, mode 5 (FIG. 3G, indicated by reference numeral 315) corresponding to a vertical-right mode, mode 6 (FIG. 3H, indicated by reference numeral 316) corresponding to a horizontal-down mode, mode 7 (FIG. 3I, indicated by reference numeral 317) corresponding to a vertical-left mode, and mode 8 (FIG. 3J, indicated by reference numeral 318) corresponding to a horizontal-up mode. FIG. 3A shows the general prediction directions 330 corresponding to each of the Intra_4×4 modes 300.
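As a minimal sketch of the three simplest 4×4 luma modes (vertical, horizontal, and DC) and of cost-based mode selection, the following assumes the commonly described sample layout in which A-D lie directly above the block and I-L lie directly to its left, with all of them available. The function names and the use of SAD as the coding cost are illustrative assumptions; the directional modes 3-8 are not covered.

```python
def predict_4x4(mode, above, left):
    """Form a 4x4 prediction from above = [A, B, C, D] and left = [I, J, K, L]."""
    if mode == 0:  # vertical: each column repeats the sample above it
        return [list(above) for _ in range(4)]
    if mode == 1:  # horizontal: each row repeats the sample to its left
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:  # DC: rounded mean of the eight neighbouring samples
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0-2 are sketched here")


def sad(a, b):
    """Sum of absolute differences, used here as a stand-in coding cost."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_mode(block, above, left):
    """Pick the sketched mode with the lowest cost for the given block."""
    costs = {m: sad(block, predict_4x4(m, above, left)) for m in (0, 1, 2)}
    return min(costs, key=costs.get), costs


above = [10, 20, 30, 40]
left = [50, 50, 50, 50]
block = [[10, 20, 30, 40] for _ in range(4)]  # vertical stripes

mode, costs = best_mode(block, above, left)
print(mode, costs)
```

A real encoder would typically weigh distortion against the bits needed to signal the mode and residual; plain SAD is used here only to keep the sketch short.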
In modes 3-8, the predicted samples are formed from a weighted average of the prediction samples A-M. Intra_8×8 uses basically the same concepts as 4×4 prediction, but with a prediction block size of 8×8 and with low-pass filtering of the predictors to improve prediction performance. Turning to FIGS. 4A-D, four Intra_16×16 modes corresponding to the MPEG-4 AVC Standard are indicated generally by the reference numeral 400. The four Intra_16×16 modes 400 include modes 0-3, with mode 0 (FIG. 4A, indicated by reference numeral 411) corresponding to a vertical prediction mode, mode 1 (FIG. 4B, indicated by reference numeral 412) corresponding to a horizontal prediction mode, mode 2 (FIG. 4C, indicated by reference numeral 413) corresponding to a DC prediction mode, and mode 3 (FIG. 4D, indicated by reference numeral 414) corresponding to a plane prediction mode. Each 8×8 chroma component of an intra coded macroblock is predicted from previously encoded chroma samples above and/or to the left, and both chroma components use the same prediction mode. The four chroma prediction modes are very similar to the Intra_16×16 modes, except that the numbering of the modes is different. The modes are DC (mode 0), horizontal (mode 1), vertical (mode 2) and plane (mode 3).
Even though intra prediction in the MPEG-4 AVC Standard improves video coding efficiency, it is still not optimal in exploiting the geometric redundancy that exists along edges, contours, and oriented textures, and it is not efficient in coding texture.
Displaced Intra Prediction (DIP)
During the development of the ITU-T H.26L Standard, displaced intra prediction was proposed. The proposal re-uses the concept of variable block size inter-prediction as specified in the MPEG-4 AVC Standard for intra prediction. Turning to FIG. 1B, an example of displaced intra prediction is indicated generally by the reference numeral 150. The displaced intra prediction 150 involves an intra coded region 152, a current block 154, and a candidate block 156. In general, previously encoded intra regions (e.g., intra coded region 152) of a slice can be referenced by displacement vectors (e.g., displacement vector 156) for prediction of the current intra block (e.g., current block 154). The displaced intra prediction 150 is implemented on a macroblock basis. The displacement vectors are encoded differentially using a prediction by the median of the neighboring blocks, in analogy to the inter motion vectors in the MPEG-4 AVC Standard.
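The search and the differential coding of the displacement vector can be sketched as follows. This is a simplified illustration, not the proposal's actual procedure: the causal search region is restricted to blocks lying entirely above the current block, SAD stands in for the coding cost, the picture array stands in for reconstructed samples, and all function names and neighbouring vectors are hypothetical.

```python
def block_at(pic, y, x, n):
    """Extract the n x n block whose top-left corner is at (y, x)."""
    return [row[x:x + n] for row in pic[y:y + n]]


def sad(a, b):
    """Sum of absolute differences, used here as a stand-in coding cost."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def dip_search(pic, cur_y, cur_x, n):
    """Return (cost, (dy, dx)) of the best candidate fully above the current block."""
    target = block_at(pic, cur_y, cur_x, n)
    best = None
    for y in range(0, cur_y - n + 1):
        for x in range(0, len(pic[0]) - n + 1):
            cost = sad(target, block_at(pic, y, x, n))
            if best is None or cost < best[0]:
                best = (cost, (y - cur_y, x - cur_x))
    return best


def median_predictor(neighbours):
    """Component-wise median of three neighbouring displacement vectors."""
    ys = sorted(v[0] for v in neighbours)
    xs = sorted(v[1] for v in neighbours)
    return (ys[1], xs[1])


pic = [
    [1, 2, 7, 8],
    [3, 4, 9, 6],
    [0, 0, 7, 8],
    [0, 0, 9, 6],
]
cost, vector = dip_search(pic, 2, 2, 2)

# Only the difference to the predicted vector would be written to the bitstream.
predicted = median_predictor([(-2, -2), (-1, 0), (-4, -1)])
residual = (vector[0] - predicted[0], vector[1] - predicted[1])
print(cost, vector, predicted, residual)
```

Because the vector is found by comparing against the current block itself, only the encoder can run this search, which is why the vector has to be transmitted explicitly.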
Even though displaced intra prediction effectively improves coding efficiency when textures or patterns are repeated in intra coded pictures, displaced intra prediction is limited by the fact that structurally similar regions may have different illumination properties within the same picture.
Template Matching Prediction (TMP)
Template matching prediction is a concept from texture synthesis that deals with the generation of a continuous texture resembling a given sample.
Intra prediction using template matching in the context of the MPEG-4 AVC Standard has been proposed. In the proposal, the scheme is integrated as an additional mode for Intra_4×4 or Intra_8×8 prediction in the MPEG-4 AVC Standard. With template matching prediction, self-similarities of image regions are exploited for prediction. Previously encoded intra regions of a slice can be reused for prediction. The TMP algorithm recursively determines the value of the current pixels under prediction by selecting at least one patch (of one or more pixels) of decoded data. Patches are selected according to a matching rule, where the pixels neighboring a patch are compared to the pixels neighboring the current block, and the patches having the most similar neighboring pixels are selected. Turning to FIG. 1C, an example of template matching intra prediction is indicated generally by the reference numeral 170. The template matching intra prediction 170 involves a candidate neighborhood 172, a candidate patch 174, a template 176, and a target 178. Since the search region and the neighborhood (e.g., candidate neighborhood 172) of the current pixels (e.g., target 178) are known at both the encoder and the decoder side, no additional side information has to be transmitted, and identical prediction is achieved on both sides. Here, template matching on a 2×2 luma sample grid is applied to enable a joint prediction for luma and chroma samples in 4:2:0 video sequences.
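The matching rule can be sketched as below, under several simplifying assumptions: the template is an L-shaped neighbourhood (the row above, including the corner, plus the column to the left), SAD is the matching criterion, candidates are restricted to patches lying entirely above the current block, and a single best patch is copied. The function names are hypothetical. The target samples are set to None to show that the search never reads them, so a decoder could repeat it.

```python
def template(pic, y, x, n):
    """L-shaped neighbourhood of the n x n block at (y, x): top row with corner, then left column."""
    top = pic[y - 1][x - 1:x + n]
    left = [pic[y + r][x - 1] for r in range(n)]
    return top + left


def tmp_predict(pic, cur_y, cur_x, n):
    """Predict the block at (cur_y, cur_x) from the patch whose template matches best."""
    target_tpl = template(pic, cur_y, cur_x, n)
    best = None
    for y in range(1, cur_y - n + 1):            # patches fully above the target
        for x in range(1, len(pic[0]) - n + 1):
            cand_tpl = template(pic, y, x, n)
            cost = sum(abs(a - b) for a, b in zip(target_tpl, cand_tpl))
            if best is None or cost < best[0]:
                best = (cost, y, x)
    cost, y, x = best
    return cost, [row[x:x + n] for row in pic[y:y + n]]


N = None  # samples of the target block, not yet decoded
pic = [
    [9, 1, 1, 0, 0, 0],
    [9, 7, 8, 0, 0, 0],
    [9, 6, 5, 9, 1, 1],
    [0, 0, 0, 9, N, N],
    [0, 0, 0, 9, N, N],
]
cost, prediction = tmp_predict(pic, 3, 4, 2)
print(cost, prediction)
```

No displacement vector appears in the output of the search that would need to be signalled; the position of the chosen patch is inferred from decoded data alone.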
Displaced intra prediction and template matching prediction both search the previously encoded regions in a current picture. Those encoded regions may not have the same illumination characteristics as the block to be coded, which can degrade the coding performance.
Weighted Prediction for Inter-Prediction
Weighted prediction was proposed to handle temporal illumination variation or fade in/out effects for motion compensation. However, weighted prediction has not been proposed for intra prediction to handle illumination variation inside a picture.
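As a rough illustration of the idea, weighted prediction scales a motion-compensated reference sample by a weight and adds an offset. The sketch below uses an integer form with a rounding shift in the style of explicit weighted prediction for a single reference; the function name, parameter names, and 8-bit clipping range are assumptions made for this example.

```python
def weighted_sample(ref, weight, offset, log_wd, max_val=255):
    """Scale a reference sample by weight / 2**log_wd, add an offset, and clip."""
    if log_wd >= 1:
        scaled = (ref * weight + (1 << (log_wd - 1))) >> log_wd  # rounded shift
    else:
        scaled = ref * weight
    return max(0, min(max_val, scaled + offset))


# weight 48 with log_wd 5 corresponds to a gain of 1.5
brightened = weighted_sample(100, 48, -10, 5)

# a gain of 2.0 on a bright sample saturates at the clipping limit
saturated = weighted_sample(200, 64, 0, 5)

# weight 32 with log_wd 5 and zero offset leaves the sample unchanged
unchanged = weighted_sample(123, 32, 0, 5)

print(brightened, saturated, unchanged)
```

The weight and offset model a global brightness change between pictures, such as a fade, which is why the tool is associated with temporal rather than spatial illumination variation.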