When video is streamed over the Internet and played back through a Web browser or media player, the video is delivered in digital form. Digital video is also used when video is delivered through many broadcast services, satellite services and cable television services. Real-time videoconferencing often uses digital video, and digital video is used during video capture with most smartphones, Web cameras and other video capture devices.
Digital video can consume a very large number of bits. The number of bits used per second of represented video content is known as the bit rate. Engineers use compression (also called source coding or source encoding) to reduce the bit rate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bit rate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form. A “codec” is an encoder/decoder system.
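To give a sense of scale, the bit rate of uncompressed video can be estimated from the picture dimensions, sample bit depth, chroma sampling format and frame rate. The following is a minimal sketch; the function name `raw_bit_rate`, its parameters, and the 1920x1080, 30 frames per second, 8-bit, 4:2:0 example are assumptions chosen for illustration and are not taken from any particular standard or product.

```python
# Illustrative only: names and example numbers are assumptions.

def raw_bit_rate(width, height, frames_per_second,
                 bits_per_sample=8, chroma_format="4:2:0"):
    """Bits per second for uncompressed video with one luma and two chroma planes."""
    luma_samples = width * height
    # Each chroma plane is subsampled relative to luma, depending on the format.
    chroma_samples_per_plane = {
        "4:2:0": luma_samples // 4,
        "4:2:2": luma_samples // 2,
        "4:4:4": luma_samples,
    }[chroma_format]
    samples_per_picture = luma_samples + 2 * chroma_samples_per_plane
    return samples_per_picture * bits_per_sample * frames_per_second


uncompressed = raw_bit_rate(1920, 1080, 30)  # 746,496,000 bits per second
target = 5_000_000                           # hypothetical delivery bit rate
ratio = uncompressed / target                # compression ratio needed
print(uncompressed, ratio)
```

Under these assumptions, the uncompressed stream is about 746 megabits per second, so delivering it at a hypothetical 5 megabits per second would require a compression ratio of roughly 150:1.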
In general, video compression techniques include “intra-picture” compression and “inter-picture” compression. Whereas intra-picture compression compresses a given picture using information within that picture, inter-picture compression compresses a given picture with reference to a preceding and/or following picture or pictures. Inter-picture compression techniques often use motion estimation and motion compensation to reduce bit rate by exploiting temporal redundancy in a video sequence. In one common technique, an encoder using motion estimation attempts to match a current block of sample values in a current picture with a candidate block of the same size in a search area in another picture, the reference picture. A reference picture is, in general, a picture that contains sample values that may be used for prediction in the encoding and decoding process of other pictures. For a current block, when the video encoder finds an exact or “close enough” match in the search area in the reference picture, the video encoder parameterizes the change in position between the current and candidate blocks as motion data such as a motion vector (“MV”). In general, motion compensation is a process of reconstructing pictures from reference picture(s) using motion data.
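The block-matching form of motion estimation described above can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD) between the current block and each candidate block. This is only one possible approach, shown under simplifying assumptions: integer-sample motion vectors, a single reference picture, and pictures stored as lists of rows. The names `sad` and `motion_search` and the synthetic test pictures are hypothetical.

```python
# Illustrative full-search motion estimation; not any encoder's actual method.

def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current block and a candidate block."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(size) for i in range(size))


def motion_search(cur, ref, bx, by, size, search_range):
    """Return ((dx, dy), cost) for the best candidate block in the search area."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Synthetic 12x12 reference picture, and a current picture whose content is
# the reference content moved 2 samples right and 1 sample down.
ref = [[x * x + 3 * y * y + x * y for x in range(12)] for y in range(12)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(12)]
       for y in range(12)]

mv, cost = motion_search(cur, ref, bx=4, by=4, size=4, search_range=3)
print(mv, cost)
```

Here the motion vector points from the current block's position back to where its content sits in the reference picture, so content that moved right and down yields a vector with negative components. Practical encoders typically use faster search patterns and sub-sample precision rather than an exhaustive integer search.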
When encoding a block of a picture, an encoder often computes the sample-by-sample differences (also called residual values or error values) between the sample values of the block and its prediction (e.g., motion-compensated prediction or intra-picture prediction). The residual values may then be encoded. For the residual values, encoding efficiency depends on the complexity of the residual values and how much loss or distortion is introduced by quantization operations as part of the compression process. In general, a good motion-compensated prediction closely approximates a block, such that the residual values include few significant values, and the residual values can be efficiently encoded. On the other hand, a poor motion-compensated prediction often yields residual values that include many significant values, which are more difficult to encode efficiently.
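The effect of prediction quality on residual values can be illustrated with a small sketch. It assumes a 4x4 block, a simple truncating quantizer with a fixed step size, and no transform stage; the helper names (`residual`, `quantize`, `count_significant`) and the sample values are hypothetical and chosen only for illustration.

```python
# Illustrative only: real codecs usually transform residuals before quantizing.

def residual(block, prediction):
    """Sample-by-sample differences between a block and its prediction."""
    return [[b - p for b, p in zip(block_row, pred_row)]
            for block_row, pred_row in zip(block, prediction)]


def quantize(values, step):
    """Truncating quantizer: divide by the step size and round toward zero."""
    def q(v):
        magnitude = abs(v) // step
        return magnitude if v >= 0 else -magnitude
    return [[q(v) for v in row] for row in values]


def count_significant(levels):
    """Number of nonzero quantized values that would need to be encoded."""
    return sum(1 for row in levels for v in row if v != 0)


block = [[100, 102, 104, 106],
         [101, 103, 105, 107],
         [102, 104, 106, 108],
         [103, 105, 107, 109]]

good_prediction = [[v - 1 for v in row] for row in block]  # off by 1 everywhere
poor_prediction = [[90] * 4 for _ in range(4)]             # flat, far from block

good_count = count_significant(quantize(residual(block, good_prediction), 4))
poor_count = count_significant(quantize(residual(block, poor_prediction), 4))
print(good_count, poor_count)
```

With the close prediction, every residual value quantizes to zero, while the poor prediction leaves a nonzero value at all 16 positions, which would cost more bits to encode.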
Quantization and other “lossy” processing during compression of a picture can result in visible lines at boundaries between blocks or sub-blocks of the picture when it is reconstructed. Such “blocking artifacts” might occur, for example, if adjacent blocks in a smoothly changing region of a picture (such as a sky area) are quantized to different average levels. Blocking artifacts can be especially troublesome in pictures that are used as reference pictures for motion compensation processes during encoding and decoding, since they tend to hurt the quality of motion-compensated prediction. To reduce blocking artifacts in a reference picture, an encoder and decoder can use “deblock” filtering to smooth discontinuities at horizontal boundaries and vertical boundaries between blocks and/or sub-blocks in the reference picture. The filtering is “in-loop” in that it occurs inside a motion-compensation loop: the encoder and decoder perform it on reference pictures used later in encoding/decoding. Deblock filtering typically improves quality by providing better motion-compensated prediction and a lower bit rate for prediction residuals, thereby increasing coding efficiency. For this reason, in-loop deblock filtering is usually enabled during encoding, in which case a decoder also performs in-loop deblock filtering.
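As a rough sketch of smoothing a discontinuity at a vertical block boundary, the following applies a short low-pass filter to the two sample values nearest the boundary in each row, and skips rows where the step is large enough to be a real edge in the content. The filter taps, the threshold test, and the name `deblock_vertical_boundary` are assumptions made for illustration; they do not reproduce the deblock filter of any particular standard or format.

```python
# Illustrative boundary smoothing; taps and threshold are hypothetical.

def deblock_vertical_boundary(picture, boundary_x, threshold):
    """Return a copy of picture with the boundary at column boundary_x smoothed."""
    filtered = [row[:] for row in picture]
    for y, row in enumerate(picture):
        p1, p0 = row[boundary_x - 2], row[boundary_x - 1]  # left of boundary
        q0, q1 = row[boundary_x], row[boundary_x + 1]      # right of boundary
        # Filter small steps (likely artifacts); leave large steps (likely edges).
        if 0 < abs(p0 - q0) < threshold:
            filtered[y][boundary_x - 1] = (p1 + 2 * p0 + q0 + 2) >> 2
            filtered[y][boundary_x] = (p0 + 2 * q0 + q1 + 2) >> 2
    return filtered


# Two adjacent 4x4 blocks reconstructed at different average levels.
smooth = [[100] * 4 + [108] * 4 for _ in range(4)]
# Two adjacent blocks separated by a strong edge in the content.
edge = [[100] * 4 + [200] * 4 for _ in range(4)]

smooth_out = deblock_vertical_boundary(smooth, boundary_x=4, threshold=16)
edge_out = deblock_vertical_boundary(edge, boundary_x=4, threshold=16)
print(smooth_out[0])
print(edge_out[0])
```

In this example the step of 8 across the boundary is reduced to a step of 4 spread over neighboring samples, while the strong edge is left untouched. A horizontal boundary would be handled the same way with rows and columns exchanged.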
Over the last 25 years, various video codec standards have been adopted, including the ITU-T H.261, H.262 (MPEG-2 or ISO/IEC 13818-2), H.263, H.264 (MPEG-4 AVC or ISO/IEC 14496-10), and H.265 (ISO/IEC 23008-2) standards, the MPEG-1 (ISO/IEC 11172-2) and MPEG-4 Visual (ISO/IEC 14496-2) standards, and the SMPTE 421M standard. A video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters in the bitstream when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations a video decoder should perform to achieve conforming results in decoding. Aside from codec standards, various proprietary codec formats such as VP6, VP8, and VP9 define options for the syntax of an encoded video bitstream and corresponding decoding operations.
Various video codec standards and formats incorporate in-loop deblock filtering. The details of the filtering vary depending on the standard or format, and can be quite complex. Even within a standard or format, the rules of applying deblock filtering across a vertical or horizontal block boundary can vary depending on factors such as content/smoothness, values of motion vectors for blocks/sub-blocks on different sides of the block boundary, block/sub-block size, and coded/not coded status (e.g., whether transform coefficient information is signaled in the bitstream).
Previous approaches to in-loop deblock filtering use boundary-based filtering at horizontal boundaries and vertical boundaries between blocks/sub-blocks. Sample values across certain horizontal boundaries and across certain vertical boundaries between blocks/sub-blocks are selectively filtered, depending on various factors. While such deblock filtering provides good performance in most cases, it can leave noticeable distortion in some scenarios. For example, consider sample values at corner positions of four blocks that meet at a block-boundary intersection, where one of the blocks is coded and the other three blocks are not coded. In this configuration, there can be a large visual difference between sample values at corner positions of two diagonally adjacent blocks, one coded and one not coded. Previous approaches to in-loop deblock filtering that use boundary-based filtering do not adequately compensate for distortion at such corner positions.
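The corner scenario can be made concrete with a small sketch. It assumes, purely for illustration, a simplified rule in which a boundary segment between two blocks is filtered only when at least one of the two blocks is coded; actual rules vary by standard or format and depend on more factors. The block labels (`TL`, `TR`, `BL`, `BR`) and the helper names are hypothetical.

```python
# Illustrative only: the "at least one side coded" rule is an assumption.

# The four boundary segments that meet at the intersection, each named with
# the pair of blocks it separates.
SEGMENTS = {
    "vertical_upper": ("TL", "TR"),
    "vertical_lower": ("BL", "BR"),
    "horizontal_left": ("TL", "BL"),
    "horizontal_right": ("TR", "BR"),
}


def filtered_segments(coded):
    """Map each boundary segment to whether it is filtered under the assumed rule."""
    return {name: coded[a] or coded[b] for name, (a, b) in SEGMENTS.items()}


def blocks_touched(coded):
    """Blocks whose corner samples lie next to at least one filtered segment."""
    touched = set()
    for name, is_filtered in filtered_segments(coded).items():
        if is_filtered:
            touched.update(SEGMENTS[name])
    return touched


# One coded block (top-left) and three blocks that are not coded.
coded = {"TL": True, "TR": False, "BL": False, "BR": False}
touched = blocks_touched(coded)
print(sorted(touched))
```

Under this assumed rule, the corner sample values of the top-left, top-right and bottom-left blocks can be adjusted, but the bottom-right block shares no filtered segment with any neighbor. Because the coded top-left block and the bottom-right block meet only at a point, no boundary segment lies between them, and a difference between their corner sample values is never filtered directly.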