Video coding standards employ block-based transforms (for example, the ubiquitous discrete cosine transform, or DCT) and motion compensation to achieve compression efficiency. Coarse quantization of the transform coefficients and the use of different reference locations or different reference pictures by neighboring blocks in motion-compensated prediction can give rise to visually disturbing artifacts such as distortion around edges, textures or block discontinuities. In the state-of-the-art International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”), an adaptive de-blocking filter is introduced to combat the artifacts arising along block boundaries.
More general de-artifacting approaches have been proposed to combat artifacts not only on block discontinuities but also around image singularities (e.g., edges and/or textures), wherever these may appear. In a first prior art approach, in order to maximize performance, the threshold for de-artifacting filters must consider local encoding conditions imposed by the video coding procedure. For instance, within a single frame, the MPEG-4 AVC Standard offers various prediction modes (intra, inter, skip, and so forth) each of which is subject to distinct quantization noise statistics and corresponding filtering demands. Thus, in the first prior art approach, the threshold is adapted based on the coding modes and quantization parameter (QP). However, the threshold in the first prior art approach does not take the video content itself into account.
Deblocking Filter in the MPEG-4 AVC Standard
Within the state-of-the-art MPEG-4 AVC Standard, an in-loop deblocking filter has been adopted. The filter acts to attenuate artifacts arising along block boundaries. Such artifacts are caused by coarse quantization of the transform (DCT) coefficients as well as by motion-compensated prediction. By adaptively applying low-pass filters to the block edges, the deblocking filter can improve both subjective and objective video quality. The filter operates by analyzing the samples around a block edge and adapting the filtering strength so as to attenuate small intensity differences attributable to blocky artifacts while preserving the generally larger intensity differences pertaining to the actual image content. Several block coding modes and conditions also serve to indicate the strength with which the filters are applied. These include inter/intra prediction decisions, the presence of coded residuals, and motion differences between adjacent blocks. Besides being adaptive at the block level, the deblocking filter is also adaptive at the slice level and the sample level. At the slice level, filtering strength can be adjusted to the individual characteristics of the video sequence. At the sample level, filtering can be turned off for each individual sample depending on the sample values and quantizer-based thresholds.
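The block-level and sample-level adaptivity described above can be illustrated with a minimal sketch. It is a simplified illustration only, not the normative filter: the function names are hypothetical, the thresholds alpha and beta are passed in directly rather than derived from the quantization parameter through the standard's tables, and the smoothing step is a toy low-pass adjustment standing in for the standard's actual filter taps.

```python
def boundary_strength(p_intra, q_intra, on_macroblock_edge,
                      has_coded_residual, motion_differs):
    """Block-level decision: a larger value means stronger filtering.

    Simplified from the coding conditions named in the text (intra/inter
    decision, coded residuals, motion differences between the two blocks).
    """
    if p_intra or q_intra:
        return 4 if on_macroblock_edge else 3
    if has_coded_residual:
        return 2
    if motion_differs:
        return 1
    return 0  # nothing suggests a blocking artifact: leave the edge alone


def filter_edge_samples(p1, p0, q0, q1, bs, alpha, beta):
    """Sample-level decision for one line of samples across a block edge.

    p1, p0 lie on one side of the edge and q0, q1 on the other. alpha and
    beta are assumed inputs here (hypothetical values in the usage below).
    """
    if bs == 0:
        return p0, q0
    # A small step across the edge with flat neighbours is treated as a
    # blocky artifact; a large step is assumed to be real image content.
    is_artifact = (abs(p0 - q0) < alpha
                   and abs(p1 - p0) < beta
                   and abs(q1 - q0) < beta)
    if not is_artifact:
        return p0, q0
    # Toy low-pass step (NOT the standard's filter): pull the two edge
    # samples toward each other by a quarter of the step.
    delta = int((q0 - p0) / 4)
    return p0 + delta, q0 - delta


bs = boundary_strength(False, False, False, True, False)
print(bs, filter_edge_samples(100, 100, 108, 108, bs, alpha=20, beta=6))
print(bs, filter_edge_samples(50, 50, 200, 200, bs, alpha=20, beta=6))
```

In the first call the step of 8 across the edge falls below alpha and is smoothed; in the second the step of 150 exceeds alpha, so the samples are left untouched as a genuine edge.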
The blocky artifacts removed by the MPEG-4 AVC Standard deblocking filter are not the only artifacts that are present in compressed video. Coarse quantization is also responsible for other artifacts such as ringing, edge distortion, or texture corruption. The deblocking filter cannot reduce artifacts caused by quantization errors that appear inside a block. Moreover, the low-pass filtering techniques employed in deblocking assume a smooth image model and are not suited for processing image singularities such as edges or textures.
Sparsity-Based De-Artifacting
Inspired by sparsity-based de-noising techniques, a nonlinear in-loop filter has been proposed for compression de-artifacting, as noted above with respect to the first prior art approach. The first prior art approach uses a set of de-noised estimates provided by an over-complete set of transforms. The implementation of the first prior art approach generates the over-complete set of transforms by using all possible translations Hi of a given two-dimensional (2D) orthonormal transform H, such as wavelets or the DCT. Thus, given an image I, a series of different transformed versions Yi of the image I is created by applying the various transforms Hi. Each transformed version Yi is then subjected to a de-noising procedure, typically involving a thresholding operation, producing the series Y′i. The transformed and thresholded coefficients Y′i are then inverse transformed back into the spatial domain, giving rise to the de-noised estimates I′i. In over-complete settings, it is expected that some of the de-noised estimates will provide better performance than others and that the final filtered version I′ will benefit from a combination, via averaging, of such de-noised estimates. The de-noising filter of the first prior art approach uses a weighted average of the de-noised estimates I′i, where the weights are optimized to emphasize the best de-noised estimates.
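The pipeline described above (translated transforms Hi, thresholding, inverse transform, weighted averaging) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the filter of the first prior art approach: the function names are hypothetical, H is taken to be a 4x4 block DCT, translations wrap around cyclically at the picture border, the thresholding is a hard threshold, and each estimate is weighted by the inverse of its number of surviving coefficients as one plausible way of favoring sparser estimates.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (rows are basis vectors)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]


def denoise_overcomplete(image, threshold, block=4):
    """Average hard-thresholded estimates over all block*block translations."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    d = dct_matrix(block)
    d_t = [list(r) for r in zip(*d)]
    acc = [[0.0] * width for _ in range(height)]
    weight_sum = [[0.0] * width for _ in range(height)]
    for dy in range(block):              # vertical translation of the block grid
        for dx in range(block):          # horizontal translation of the block grid
            for by in range(0, height, block):
                for bx in range(0, width, block):
                    ys = [(by + dy + r) % height for r in range(block)]
                    xs = [(bx + dx + c) % width for c in range(block)]
                    patch = [[image[y][x] for x in xs] for y in ys]
                    # Forward transform: Yi = D * patch * D^T
                    coeffs = matmul(matmul(d, patch), d_t)
                    # Hard threshold: Y'i keeps only the significant coefficients
                    kept = [[c if abs(c) > threshold else 0.0 for c in row]
                            for row in coeffs]
                    nonzero = sum(1 for row in kept for c in row if c != 0.0)
                    # Assumed weighting: sparser estimates count for more
                    weight = 1.0 / max(1, nonzero)
                    # Inverse transform: I'i = D^T * Y'i * D
                    estimate = matmul(matmul(d_t, kept), d)
                    for r, y in enumerate(ys):
                        for c, x in enumerate(xs):
                            acc[y][x] += weight * estimate[r][c]
                            weight_sum[y][x] += weight
    return [[acc[y][x] / weight_sum[y][x] for x in range(width)]
            for y in range(height)]


flat = [[10.0] * 8 for _ in range(8)]
filtered = denoise_overcomplete(flat, threshold=1.0)
```

With a threshold of zero every estimate reproduces the input exactly, so the output equals the input; raising the threshold discards more coefficients and smooths more aggressively, which is why the choice of threshold matters so much for this class of filter.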
For de-artifacting, the choice of filtering parameters, such as the threshold, is of great importance. The applied threshold plays a crucial part in controlling the de-noising capacity of the filter as well as in computing the averaging weights used to emphasize the better de-noised estimates. Inadequate threshold selection may result in over-smoothed reconstructed pictures or may allow artifacts to persist. In the first prior art approach, thresholds selected per pixel class, based on quantization parameter (QP) and coding mode information, are encoded and transmitted as side information to the decoder. The threshold does not adapt based on the video content.
Video content varies both spatially and temporally. The level of noise or artifacts in a video sequence under the same quantization parameter (QP) or coding mode can be very different, which calls for different filtering parameters.