1. Field of the Invention
This invention pertains generally to video coding, and more particularly to exploitation of spatial correlation during intra-prediction for block-based image/video compression.
2. Description of Related Art
With the proliferation of visual media, the use of video and image compression has never been more important. It will be appreciated that numerous approaches are used separately or in combination toward increasing video and image compression without loss of quality or introduction of artifacts.
Compression based on inter-picture prediction (shortened simply to “inter-prediction”) attempts to eliminate temporal (time-wise) redundancy between frames of a video. In these predictions motion vectors are used for selecting predictors from prior or future image frames. By contrast, compression based on intra-picture prediction (shortened simply to “intra-prediction”) attempts to eliminate spatial (area-wise) redundancy within the same frame of an image, or video comprising a sequence of images. In video compression standards like H.264 both inter- and intra-predictions are utilized toward reducing or eliminating both forms of redundancy.
Compression standards such as H.264/AVC have included the use of spatial prediction blocks/pixel values, wherein spatial correlation is exploited with characteristics of the pixel neighborhood used to select a predictor. As with any prediction scheme, the use of prediction blocks reduces data rate because only the differences (often referred to as residuals) between the real pixel values and the predicted values need to be transmitted. H.264 is receiving increasing attention in view of its ability to encode video with approximately 2-3 times fewer bits than comparable MPEG-2 encoders, and H.264 is about twice as efficient as MPEG-4 encoders. Partly in view of this efficiency, H.264 has recently been welcomed into the MPEG-4 standard as Part 10 (Advanced Video Coding). It will be appreciated that the ability to continue lowering the bit rate while maintaining a given image quality is a much sought-after goal in the field of image/video compression, wherein the development of improved forms of spatial prediction is very desirable and in line with objects of the present invention.
One form of spatial prediction has utilized sample prediction blocks created in response to “attempting” to extrapolate the reconstructed pixels surrounding the target block to be coded. The sample is then subtracted from the current block to yield a residual, which is coded into the output, such as using transformation, quantization and entropy encoding. Aside from other shortcomings, the extrapolation method cannot readily predict complex textures.
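By way of illustration and not limitation, the extrapolation approach described above can be sketched as follows. The sketch assumes a purely vertical extrapolation of the reconstructed row above the block, and the names `vertical_prediction`, `residual` and `energy` are hypothetical, chosen only for this example. It suggests why a simple striped pattern is predicted exactly while a more complex texture, here a checkerboard, leaves a large residual.

```python
def vertical_prediction(row_above, block_size):
    """Extrapolate the reconstructed row above the block straight down."""
    return [list(row_above[:block_size]) for _ in range(block_size)]


def residual(block, prediction):
    """Element-wise difference that would go on to transform and quantization."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


def energy(res):
    """Sum of squared residual values, a rough proxy for coding cost."""
    return sum(v * v for row in res for v in row)


# Reconstructed pixels directly above the 4x4 target block (assumed values).
row_above = [10, 200, 10, 200]

# Vertical stripes continue the row above, so extrapolation is exact.
stripes = [[10, 200, 10, 200] for _ in range(4)]

# A checkerboard alternates from row to row, which straight extrapolation misses.
checker = [[10, 200, 10, 200] if r % 2 == 0 else [200, 10, 200, 10]
           for r in range(4)]

pred = vertical_prediction(row_above, 4)
stripes_energy = energy(residual(stripes, pred))
checker_energy = energy(residual(checker, pred))
```

In this toy case the striped block leaves no residual energy, whereas every pixel in the odd rows of the checkerboard is mispredicted.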
Template matching techniques arose to overcome the shortcomings of the sample extrapolation approach and other problematic techniques. Originally put forth for texture synthesis purposes, template matching found its way into block-based image compression because of its predictive ability. In template matching, the value of each pixel/block in the output frame is determined by comparing its spatial neighborhood with all previously decoded neighborhoods in the frame or, more preferably, in a search region thereof. Put another way, an area surrounding the current block which has already been coded, such as both above and to the left side of the current block, as the scan is traditionally represented, is used as a template to which other templates in the neighborhood are to be matched. The presumption of the approach is that if the pixels bordering a first block (a first template) match the pixels bordering a second block (a second template), then there is a good chance that their associated blocks, the first block and the second block, will be reasonably similar.
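By way of illustration and not limitation, the template comparison described above might be sketched as follows. Several details are assumptions made only for this example and are not taken from the description: blocks are coded in raster order on a block-aligned grid, the template is an inverse-L region `t` pixels thick, the matching cost is a sum of absolute differences (SAD), and the function names are hypothetical.

```python
def template(frame, y, x, bs, t):
    """Inverse-L template: t rows above (including the corner) and t columns left."""
    above = [frame[r][c] for r in range(y - t, y) for c in range(x - t, x + bs)]
    left = [frame[r][c] for r in range(y, y + bs) for c in range(x - t, x)]
    return above + left


def sad(a, b):
    """Sum of absolute differences between two equally long pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b))


def is_decoded(cy, cx, y, x, bs, width):
    """Assumed raster-order causality test for a candidate block at (cy, cx)."""
    if cx + bs > width:
        return False
    # Fully above the current block row, or not below it and fully to its left.
    return cy + bs <= y or (cy <= y and cx + bs <= x)


def match_templates(frame, y, x, bs=4, t=1, search=8):
    """Return (cost, cy, cx) for every admissible candidate, best match first."""
    width = len(frame[0])
    target = template(frame, y, x, bs, t)
    results = []
    for cy in range(max(t, y - search), y + 1):
        for cx in range(max(t, x - search), min(width - bs, x + search) + 1):
            if is_decoded(cy, cx, y, x, bs, width):
                cost = sad(target, template(frame, cy, cx, bs, t))
                results.append((cost, cy, cx))
    return sorted(results)


def block(frame, y, x, bs):
    """Extract the bs-by-bs block whose top-left corner is (y, x)."""
    return [row[x:x + bs] for row in frame[y:y + bs]]


# Synthetic 12x16 frame with a texture that repeats every 4 pixels.
frame = [[(c % 4) * 10 + (r % 4) * 3 for c in range(16)] for r in range(12)]

best = match_templates(frame, 8, 8)[0]
predictor = block(frame, best[1], best[2], 4)
```

On this periodic test frame the best template match has zero cost and its associated block coincides with the target block, which is the favorable case the presumption relies upon.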
It should be noted that the encoder and decoder in a template matching system are configured to perform much of the same template matching process, wherein less data needs to be sent to the decoder about the area of prediction. Since the area of the template has already been coded (both for encoder and decoder) it is known to both the encoder and decoder, thus they have a common base upon which to select the prediction. As a consequence only the residual needs to be computed in the encoder and applied within the decoder.
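By way of illustration and not limitation, the shared process can be sketched on a one-dimensional signal. The sketch assumes lossless residuals (no transform or quantization), a fallback predictor of 128 when no candidate exists, and an absolute-difference template cost; the names `predict`, `encode` and `decode` are hypothetical. The point shown is that both sides call the same `predict` on the same already-decoded samples, so only residuals cross from encoder to decoder.

```python
def predict(decoded, bs, t):
    """Choose a predictor for the next block from already-decoded samples only."""
    n = len(decoded)
    target = decoded[n - t:] if n >= t else []
    best_cost, best_pos = None, None
    # A candidate block starts at pos; its template is the t samples before it.
    for pos in range(t, n - bs + 1):
        cost = sum(abs(a - b) for a, b in zip(target, decoded[pos - t:pos]))
        if best_cost is None or cost < best_cost:
            best_cost, best_pos = cost, pos
    if best_pos is None:
        return [128] * bs  # assumed fallback when nothing can be searched yet
    return decoded[best_pos:best_pos + bs]


def encode(signal, bs=4, t=2):
    """Produce one residual list per block; no predictor position is sent."""
    decoded, residuals = [], []
    for start in range(0, len(signal), bs):
        pred = predict(decoded, bs, t)
        res = [s - p for s, p in zip(signal[start:start + bs], pred)]
        residuals.append(res)
        # Mirror the decoder so both sides search identical data.
        decoded.extend(p + r for p, r in zip(pred, res))
    return residuals


def decode(residuals, bs=4, t=2):
    """Rebuild the signal by repeating the encoder's search and adding residuals."""
    decoded = []
    for res in residuals:
        pred = predict(decoded, bs, t)
        decoded.extend(p + r for p, r in zip(pred, res))
    return decoded


signal = [1, 5, 9, 5] * 4
residuals = encode(signal)
reconstructed = decode(residuals)
```

In this example the first two blocks fall back to the flat predictor, while the later blocks find an exact match in the decoded history and yield all-zero residuals.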
In one intra-prediction form of template matching, a single candidate template is found which most closely matches the template of the current block, wherein the associated block is selected for the prediction. The decoder similarly finds this same close-match template, uses the associated block and applies the residual data obtained from the encoder/bit stream. However, this single candidate approach has the drawback that although the pixels surrounding a block may fully match, the block itself may be substantially different from the target block currently being predicted.
In another form of intra-prediction template matching, a decoder is configured with a form of averaging of candidate matches in predicting the current block. Averaging is generally given in a form:
$$\sigma_{\mathrm{avg}}^{2} = E\left[X_{\mathrm{avg}}^{2}\right] - \left(E\left[X_{\mathrm{avg}}\right]\right)^{2} = \frac{1}{N}\,\sigma^{2}$$
Without going into details, the above shows that averaging N candidate blocks provides a better predictor than any one of the blocks selected at random. It will be noted that the averaging is intended to reduce the variance of the prediction error (σ²) by a factor of N.
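By way of illustration and not limitation, the 1/N relationship can be obtained under a simplifying assumption that is not stated in the description above, namely that the N candidate prediction errors $X_1, \dots, X_N$ are mutually uncorrelated and share a common variance $\sigma^{2}$. With $X_{\mathrm{avg}}$ denoting their average:

$$X_{\mathrm{avg}} = \frac{1}{N}\sum_{i=1}^{N} X_i$$

$$\sigma_{\mathrm{avg}}^{2} = \operatorname{Var}\left(\frac{1}{N}\sum_{i=1}^{N} X_i\right) = \frac{1}{N^{2}}\sum_{i=1}^{N}\operatorname{Var}(X_i) = \frac{1}{N^{2}}\cdot N\sigma^{2} = \frac{1}{N}\,\sigma^{2}$$

Under this assumption the standard deviation of the error falls by a factor of $\sqrt{N}$. Where the candidate errors are positively correlated, as neighboring image regions often are, the cross terms no longer vanish and the reduction is smaller.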
However, it should be recognized that an averaging step is required across the various candidate matches, and thus overhead is increased. In addition, it should be noted that averaging is a form of compromise directed at reducing maximum error. This is beneficial in comparison with the single candidate approach above, because although one candidate may have a huge disparity with the target block (e.g., a maximum error), averaging it with the other matches reduces that maximum error. Unfortunately, the technique does not achieve predictions with minimum error, for example in cases where at least one of the candidates provides a very close match with the actual current block being coded. Therefore, template matching as conventionally embodied has a number of shortcomings.
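By way of illustration and not limitation, the compromise described above can be seen in a small numerical example. The pixel values are invented for this sketch, the blocks are flattened to four pixels, SAD is assumed as the error measure, and the names are hypothetical. The target block is used here only to measure the errors; a decoder would not have access to it.

```python
def sad(a, b):
    """Sum of absolute differences between two flattened blocks."""
    return sum(abs(p - q) for p, q in zip(a, b))


def average(blocks):
    """Per-pixel integer mean of the candidate blocks, rounded to nearest."""
    n = len(blocks)
    return [(sum(px) + n // 2) // n for px in zip(*blocks)]


# Block actually being coded (assumed values).
target = [100, 102, 98, 101]

# Blocks associated with three matching templates (assumed values).
candidates = [
    [100, 102, 98, 100],   # a very close match
    [90, 110, 90, 110],    # a moderate mismatch
    [120, 80, 120, 80],    # a large mismatch
]

errors = [sad(target, c) for c in candidates]
averaged = average(candidates)
avg_error = sad(target, averaged)
```

Here the averaged predictor lands well below the worst candidate's error yet remains far above the error of the closest candidate.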
Accordingly, a need exists for a system and method of video intra-prediction which has low overhead and is easy to implement in both the encoder and decoder. These needs and others are met within the present invention, which overcomes the deficiencies of previously developed intra-prediction systems and methods.