All video compression artifacts result from quantization, which is the only lossy coding part in a hybrid video coding framework. However, those artifacts can take various forms such as, for example, a blocky artifact, a ringing artifact, an edge distortion, and/or texture corruption. In general, the decoded sequence may include all types of visual artifacts, but with different severities. Among the different types of visual artifacts, blocky artifacts are common in block-based video coding. These artifacts can originate from both the block-based transform stage in residue coding and the motion compensation stage. Adaptive deblocking filters have been studied in the past, and some well-known deblocking filtering methods have been proposed and adopted in various standards such as, for example, the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation (hereinafter the “MPEG-4 AVC Standard”). When designed well, a deblocking filter can improve both objective and subjective video quality.
In state of the art video encoders and/or decoders such as, for example, those corresponding to the MPEG-4 AVC Standard, an adaptive in-loop deblocking filter is designed to reduce blocky artifacts, wherein the strength of filtering is controlled by the values of several syntax elements. The basic idea is that if a relatively large absolute difference between samples near a block edge is measured, that difference is likely a blocking artifact and should thus be reduced. However, if the magnitude of that difference is so large that it cannot be explained by the coarseness of the quantization used in the encoding, the edge is more likely to reflect the actual behavior of the source picture and should not be smoothed over. In this way, the blockiness of the content is reduced, while the sharpness of the content is basically unchanged.
The deblocking filter is adaptive on several levels. On the slice level, the global filtering strength can be adjusted to the individual characteristics of the video sequence. On the block-edge level, filtering strength is made dependent on the inter/intra prediction decision, motion differences, and the presence of coded residuals in the two neighboring blocks. On macroblock boundaries, special strong filtering is applied to remove “tiling artifacts”. On the sample level, sample values and quantizer-dependent thresholds can turn off filtering for each individual sample.
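The sample-level decision described above can be sketched as follows. This is a minimal illustration loosely modeled on a normal-strength edge filter, not a conforming implementation of the MPEG-4 AVC Standard. The function name filter_edge and the parameters alpha, beta, and tc are assumptions introduced here; in the standard the corresponding thresholds are derived from the quantization parameter and the edge strength through tables that are not reproduced in this sketch.

```python
def clip(value, low, high):
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def filter_edge(p1, p0, q0, q1, alpha, beta, tc):
    """Filter the two samples nearest a block edge: ... p1 p0 | q0 q1 ...

    alpha, beta and tc are illustrative quantizer-dependent thresholds
    (assumed to be supplied by the caller, not looked up from the standard).
    """
    # A step of alpha or more is too large to be quantization noise,
    # so it is treated as a real edge in the source picture.
    step_is_small = abs(p0 - q0) < alpha
    # Both sides must be locally flat, otherwise the area is textured.
    sides_are_flat = abs(p1 - p0) < beta and abs(q1 - q0) < beta
    if not (step_is_small and sides_are_flat):
        return p0, q0  # filtering is turned off for this sample pair

    # Pull p0 and q0 toward each other by a correction limited to +/- tc.
    delta = clip(((q0 - p0) * 4 + (p1 - q1) + 4) >> 3, -tc, tc)
    return clip(p0 + delta, 0, 255), clip(q0 - delta, 0, 255)


if __name__ == "__main__":
    # Small step across the edge: smoothed.
    print(filter_edge(100, 100, 108, 108, alpha=20, beta=6, tc=4))  # (103, 105)
    # Large step across the edge: left untouched as a true edge.
    print(filter_edge(50, 50, 200, 200, alpha=20, beta=6, tc=4))    # (50, 200)
```

With coarser quantization, larger thresholds would be supplied, so larger steps would be accepted as artifacts and filtered.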
Deblocking filtering in accordance with the MPEG-4 AVC Standard is well designed to reduce the blocky artifact, but it does not try to correct other artifacts caused by quantization noise. For example, deblocking filtering in accordance with the MPEG-4 AVC Standard leaves edges and textures untouched. Thus, it cannot improve distorted edges or texture. One reason for this lack of capability is that the MPEG-4 AVC Standard deblocking filter applies a smooth image model and the designed filters typically include a bank of low-pass filters. However, images include many singularities, texture, and so forth and, thus, they are not handled correctly by the MPEG-4 AVC Standard deblocking filter.
In order to overcome the limitations of the MPEG-4 AVC Standard deblocking filter, an approach has been recently proposed involving a de-noising type nonlinear in-loop filter. In this proposed approach, a nonlinear de-noising filter adapts to non-stationary image statistics by exploiting a sparse image model that uses an overcomplete set of linear transforms and hard-thresholding. The nonlinear de-noising filter automatically becomes high-pass, low-pass, band-pass, and so forth, depending on the region on which the filter is operating. The nonlinear de-noising filter can address all types of quantization noise. This particular de-noising approach basically includes three steps: a transform; thresholding of the transform coefficients; and an inverse transform. The several de-noised estimates obtained by de-noising with an overcomplete set of transforms (typically produced by applying de-noising with shifted versions of the same transform) are then combined using weighted averaging at every pixel.
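The three steps and the per-pixel weighted averaging can be sketched on a one-dimensional signal as follows. This is a minimal sketch under stated assumptions, not the proposed filter itself: the choice of a 4-point orthonormal DCT, the sliding window used to generate the shifted transforms, the weight 1/(number of retained coefficients), and the names dct, idct, and sparse_denoise are all introduced here for illustration.

```python
import math


def dct(block):
    """Orthonormal DCT-II of a short list of samples."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i, x in enumerate(block)))
    return out


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c
                       * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                       for k, c in enumerate(coeffs)))
    return out


def sparse_denoise(signal, threshold, size=4):
    """Denoise with every shift of a size-point transform and average."""
    if len(signal) < size:
        raise ValueError("signal must be at least as long as the transform")
    acc = [0.0] * len(signal)
    weight_sum = [0.0] * len(signal)
    # Each window start is one shifted version of the same transform.
    for start in range(len(signal) - size + 1):
        coeffs = dct(signal[start:start + size])                    # step 1
        kept = [c if abs(c) > threshold else 0.0 for c in coeffs]   # step 2
        estimate = idct(kept)                                       # step 3
        # Assumed weighting: sparser estimates are trusted more.
        weight = 1.0 / max(1, sum(1 for c in kept if c != 0.0))
        for offset, value in enumerate(estimate):
            acc[start + offset] += weight * value
            weight_sum[start + offset] += weight
    # Weighted average of all estimates covering each sample.
    return [a / w for a, w in zip(acc, weight_sum)]


if __name__ == "__main__":
    noisy_flat = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9]
    print(sparse_denoise(noisy_flat, threshold=1.0))
    step_edge = [0.0] * 4 + [100.0] * 4
    print(sparse_denoise(step_edge, threshold=1.0))
```

In the flat region only the DC coefficient survives the threshold, so the filter behaves as a low-pass filter; across the step edge the large coefficients are all retained, so the edge passes through unchanged.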
Sparsity-based de-noising tools could reduce quantization noise over video frames that include locally uniform regions (smooth, high frequency, texture, and so forth) separated by singularities. However, the de-noising tool was designed for the removal of additive, independent and identically distributed (i.i.d.) noise, while quantization noise has significantly different properties, which can present significant issues in terms of proper distortion reduction and visual de-artifacting. This implies that these techniques may confuse true edges with false blocky edges. A possible solution is spatio-frequential threshold adaptation, which may be able to correct the decision, but it is not trivial to implement. A possible consequence of inadequate threshold selection is that sparse de-noising might result in over-smoothed reconstructed pictures, or blocky artifacts may still be present despite the filtering procedure. In particular, for smooth picture regions, the signal as well as the blocky artifact added to the signal would probably have a sparse representation at the filtering stage if the same transform is used for compression and de-noising, so a thresholding operation would probably still keep the artifact. At present, it has been observed that sparsity-based de-noising techniques, even though they achieve a higher distortion reduction in terms of objective measures (e.g., mean squared error (MSE)) than other techniques, may produce significant visual artifacts that need to be addressed.
It has been observed that the use of a single de-noising filter is not very efficient or effective in removing coding artifacts. The reason for this is that a general purpose de-noising filter is usually based on a distortion model which does not exactly match the actual scenario to which it is applied. Such a model does not consider the local structure of blocky artifacts. A special purpose de-artifacting filter, on the other hand, is designed to relieve a certain type of artifact. Accordingly, a special purpose de-artifacting filter is not sufficient to correct the rest of the quantization noise. For example, the in-loop deblocking filter used in the MPEG-4 AVC Standard is a special purpose filter which is not designed to remove the noise/artifacts at pixels away from the block boundaries or within textures, or to correct distorted edges.
Turning to FIG. 1, a video encoder capable of performing video encoding in accordance with the MPEG-4 AVC Standard is indicated generally by the reference numeral 100.
The video encoder 100 includes a frame ordering buffer 110 having an output in signal communication with a non-inverting input of a combiner 185. An output of the combiner 185 is connected in signal communication with a first input of a transformer and quantizer 125. An output of the transformer and quantizer 125 is connected in signal communication with a first input of an entropy coder 145 and a first input of an inverse transformer and inverse quantizer 150. An output of the entropy coder 145 is connected in signal communication with a first non-inverting input of a combiner 190. An output of the combiner 190 is connected in signal communication with a first input of an output buffer 135.
A first output of an encoder controller 105 is connected in signal communication with a second input of the frame ordering buffer 110, a second input of the inverse transformer and inverse quantizer 150, an input of a picture-type decision module 115, an input of a macroblock-type (MB-type) decision module 120, a second input of an intra prediction module 160, a second input of a deblocking filter 165, a first input of a motion compensator 170, a first input of a motion estimator 175, and a second input of a reference picture buffer 180.
A second output of the encoder controller 105 is connected in signal communication with a first input of a Supplemental Enhancement Information (SEI) inserter 130, a second input of the transformer and quantizer 125, a second input of the entropy coder 145, a second input of the output buffer 135, and an input of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 140.
A first output of the picture-type decision module 115 is connected in signal communication with a third input of the frame ordering buffer 110. A second output of the picture-type decision module 115 is connected in signal communication with a second input of the macroblock-type decision module 120.
An output of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 140 is connected in signal communication with a third non-inverting input of the combiner 190.
An output of the inverse transformer and inverse quantizer 150 is connected in signal communication with a first non-inverting input of a combiner 119. An output of the combiner 119 is connected in signal communication with a first input of the intra prediction module 160 and a first input of the deblocking filter 165. An output of the deblocking filter 165 is connected in signal communication with a first input of the reference picture buffer 180. An output of the reference picture buffer 180 is connected in signal communication with a second input of the motion estimator 175. A first output of the motion estimator 175 is connected in signal communication with a second input of the motion compensator 170. A second output of the motion estimator 175 is connected in signal communication with a third input of the entropy coder 145.
An output of the motion compensator 170 is connected in signal communication with a first input of a switch 197. An output of the intra prediction module 160 is connected in signal communication with a second input of the switch 197. An output of the macroblock-type decision module 120 is connected in signal communication with a third input of the switch 197. The third input of the switch 197 is a control input that determines whether the “data” input of the switch is provided by the motion compensator 170 or by the intra prediction module 160. The output of the switch 197 is connected in signal communication with a second non-inverting input of the combiner 119 and with an inverting input of the combiner 185.
Inputs of the frame ordering buffer 110 and the encoder controller 105 are available as inputs of the encoder 100, for receiving an input picture 101. Moreover, an input of the Supplemental Enhancement Information (SEI) inserter 130 is available as an input of the encoder 100, for receiving metadata. An output of the output buffer 135 is available as an output of the encoder 100, for outputting a bitstream.
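The signal flow through the prediction loop of the encoder 100 can be illustrated with a toy sketch. It rests on several simplifying assumptions that are not part of FIG. 1: pictures are short lists of integers, the transform is omitted so that only scalar quantization remains, prediction is a zero-motion copy of the reference picture, the deblocking filter is an identity placeholder, and entropy coding is skipped. The function names are hypothetical; the comments map each step to the corresponding reference numeral.

```python
def quantize(residual, step):
    """Stand-in for the transformer and quantizer 125 (transform omitted)."""
    return [int(round(r / step)) for r in residual]


def dequantize(levels, step):
    """Stand-in for the inverse transformer and inverse quantizer 150."""
    return [level * step for level in levels]


def deblock(samples):
    """Placeholder for the deblocking filter 165 (identity in this sketch)."""
    return list(samples)


def encode_sequence(frames, step):
    """Run the toy prediction loop; return the coded levels and last reference."""
    reference = [0] * len(frames[0])   # reference picture buffer 180, initially empty
    coded = []
    for frame in frames:
        prediction = reference         # motion compensator 170, zero motion assumed
        # Combiner 185: input picture minus prediction.
        residual = [x - p for x, p in zip(frame, prediction)]
        levels = quantize(residual, step)
        coded.append(levels)           # entropy coder 145 omitted
        # Combiner 119: prediction plus reconstructed residual.
        recon = [p + r for p, r in zip(prediction, dequantize(levels, step))]
        # The in-loop filter output becomes the next reference picture.
        reference = deblock(recon)
    return coded, reference


if __name__ == "__main__":
    frames = [[10, 20, 30, 40], [12, 22, 29, 41]]
    coded, last_reference = encode_sequence(frames, step=4)
    print(coded)
    print(last_reference)
```

Because the encoder predicts from its own reconstructed pictures rather than from the original pictures, the quantization error stays bounded by half the step size per picture instead of accumulating over the sequence.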
Turning to FIG. 2, a video decoder capable of performing video decoding in accordance with the MPEG-4 AVC Standard is indicated generally by the reference numeral 200.
The video decoder 200 includes an input buffer 210 having an output connected in signal communication with a first input of an entropy decoder 245. A first output of the entropy decoder 245 is connected in signal communication with a first input of an inverse transformer and inverse quantizer 250. An output of the inverse transformer and inverse quantizer 250 is connected in signal communication with a second non-inverting input of a combiner 225. An output of the combiner 225 is connected in signal communication with a second input of a deblocking filter 265 and a first input of an intra prediction module 260. A second output of the deblocking filter 265 is connected in signal communication with a first input of a reference picture buffer 280. An output of the reference picture buffer 280 is connected in signal communication with a second input of a motion compensator 270.
A second output of the entropy decoder 245 is connected in signal communication with a third input of the motion compensator 270 and a first input of the deblocking filter 265. A third output of the entropy decoder 245 is connected in signal communication with an input of a decoder controller 205. A first output of the decoder controller 205 is connected in signal communication with a second input of the entropy decoder 245. A second output of the decoder controller 205 is connected in signal communication with a second input of the inverse transformer and inverse quantizer 250. A third output of the decoder controller 205 is connected in signal communication with a third input of the deblocking filter 265. A fourth output of the decoder controller 205 is connected in signal communication with a second input of the intra prediction module 260, with a first input of the motion compensator 270, and with a second input of the reference picture buffer 280.
An output of the motion compensator 270 is connected in signal communication with a first input of a switch 297. An output of the intra prediction module 260 is connected in signal communication with a second input of the switch 297. An output of the switch 297 is connected in signal communication with a first non-inverting input of the combiner 225.
An input of the input buffer 210 is available as an input of the decoder 200, for receiving an input bitstream. A first output of the deblocking filter 265 is available as an output of the decoder 200, for outputting an output picture.
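The reconstruction path of the decoder 200 can be illustrated in the same toy setting. As before, this is only a sketch under assumptions that are not part of FIG. 2: the entropy decoder is skipped so the input is already a list of quantized levels per picture, the inverse transform is omitted, prediction is a zero-motion copy of the reference picture, and the deblocking filter is an identity placeholder. The function names are hypothetical.

```python
def deblock(samples):
    """Placeholder for the deblocking filter 265 (identity in this sketch)."""
    return list(samples)


def decode_sequence(coded, step, width):
    """Reconstruct pictures from per-picture lists of quantized levels."""
    reference = [0] * width            # reference picture buffer 280, initially empty
    pictures = []
    for levels in coded:
        # Inverse transformer and inverse quantizer 250 (transform omitted).
        residual = [level * step for level in levels]
        prediction = reference         # motion compensator 270 selected by switch 297
        # Combiner 225: prediction plus residual.
        recon = [p + r for p, r in zip(prediction, residual)]
        output = deblock(recon)
        pictures.append(output)        # first output of 265: the output picture
        reference = output             # second output of 265: feeds buffer 280
    return pictures


if __name__ == "__main__":
    coded = [[2, 5, 8, 10], [1, 0, -1, 0]]
    print(decode_sequence(coded, step=4, width=4))
    # [[8, 20, 32, 40], [12, 20, 28, 40]]
```

The filtered picture is both displayed and stored as the reference for later pictures, which is what makes the filter an in-loop filter: any change to it must be mirrored in the encoder to keep the two reference buffers identical.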