Engineers use compression (also called source coding or source encoding) to reduce the bit rate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bit rate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form. A “codec” is an encoder/decoder system.
Over the last two decades, various video codec standards have been adopted, including the ITU-T H.261, H.262 (MPEG-2 or ISO/IEC 13818-2), H.263 and H.264 (MPEG-4 AVC or ISO/IEC 14496-10) standards, the MPEG-1 (ISO/IEC 11172-2) and MPEG-4 Visual (ISO/IEC 14496-2) standards, and the SMPTE 421M (VC-1) standard. More recently, the HEVC standard (ITU-T H.265 or ISO/IEC 23008-2) has been approved. Extensions to the HEVC standard (e.g., for scalable video coding/decoding, for coding/decoding of video with higher fidelity in terms of sample bit depth or chroma sampling rate, or for multi-view coding/decoding) are currently under development. A video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters in the bitstream when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations a decoder should perform to achieve conforming results in decoding. Aside from codec standards, various proprietary codec formats define other options for the syntax of an encoded video bitstream and corresponding decoding operations.
In general, video compression techniques include “intra-picture” compression and “inter-picture” compression. Intra-picture compression techniques compress individual pictures, and inter-picture compression techniques compress pictures with reference to one or more preceding and/or following pictures (often called reference or anchor pictures).
Inter-picture compression techniques often use motion estimation and motion compensation to reduce bit rate by exploiting temporal redundancy in a video sequence. Motion estimation is a process for estimating motion between pictures. In one common technique, an encoder using motion estimation attempts to match a current block of sample values in a current picture with a candidate block of the same size in a search area in another picture, the reference picture. When the encoder finds an exact or “close enough” match in the search area in the reference picture, the encoder parameterizes the change in position between the current and candidate blocks as motion data (such as a motion vector (“MV”)). An MV is conventionally a two-dimensional value, having a horizontal MV component that indicates left or right spatial displacement and a vertical MV component that indicates up or down spatial displacement. In general, motion compensation is a process of reconstructing pictures from reference picture(s) using motion data.
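The block-matching technique described above can be illustrated with a minimal sketch. It assumes an exhaustive (full) search over a square search area and uses the sum of absolute differences (SAD) as the matching criterion. Neither choice is specified in the description above; practical encoders often use faster search patterns and other cost measures. The function names `sad` and `full_search`, and the representation of a picture as a list of rows of sample values, are hypothetical and chosen only for illustration.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between the size x size block at (cx, cy)
    in the current picture and the block at (rx, ry) in the reference picture."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size, search_range):
    """Return (mv, cost) for the best-matching candidate block.

    mv is (horizontal, vertical) in integer sample positions, measured from
    the co-located position (cx, cy) in the reference picture.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = cx + mvx, cy + mvy
            # Skip candidate blocks that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost
```

A cost of zero corresponds to an exact match. A “close enough” match would correspond to a cost below some encoder-chosen threshold, which this sketch does not model.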
An MV can indicate a spatial displacement in terms of an integer number of sample grid positions starting from a co-located position in a reference picture for a current block. For example, for a current block at position (32, 16) in a current picture, the MV (−3, 1) indicates position (29, 17) in the reference picture. Or, an MV can indicate a spatial displacement in terms of a fractional number of sample grid positions from a co-located position in a reference picture for a current block. For example, for a current block at position (32, 16) in a current picture, the MV (−3.5, 1.25) indicates position (28.5, 17.25) in the reference picture. To determine sample values at fractional offsets in the reference picture, the encoder typically interpolates between sample values at integer-sample positions. Such interpolation can be computationally intensive. During motion compensation, a decoder also performs the interpolation as needed to compute sample values at fractional offsets in reference pictures.
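As a rough illustration of the two examples above, the following sketch adds an MV to a block position and then estimates a sample value at a fractional position. Bilinear interpolation is used here only as a simplifying assumption. Codec standards such as H.264 and HEVC specify longer interpolation filters, which is part of why interpolation can be computationally intensive. The names `displaced_position` and `sample_bilinear` are hypothetical.

```python
import math


def displaced_position(block_pos, mv):
    """Position in the reference picture indicated by an MV, starting from
    the co-located position of the current block. Both are (x, y) pairs."""
    return (block_pos[0] + mv[0], block_pos[1] + mv[1])


def sample_bilinear(ref, x, y):
    """Estimate the sample value at a possibly fractional position (x, y)
    from the four surrounding integer-sample positions.

    ref is a list of rows of sample values. Bilinear weighting is an
    assumption made for illustration, not the filter of any standard.
    """
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    # Clamp the right/bottom neighbors at the picture boundary.
    x1 = min(x0 + 1, len(ref[0]) - 1)
    y1 = min(y0 + 1, len(ref) - 1)
    top = (1 - fx) * ref[y0][x0] + fx * ref[y0][x1]
    bottom = (1 - fx) * ref[y1][x0] + fx * ref[y1][x1]
    return (1 - fy) * top + fy * bottom
```

When both MV components are integers, the fractional weights are zero and the function returns the sample value at the integer position directly, so no interpolation cost is incurred in principle.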
Different video codec standards and formats have used MVs with different MV precisions. For integer-sample MV precision, an MV component indicates an integer number of sample grid positions for spatial displacement. For a fractional-sample MV precision such as ½-sample MV precision or ¼-sample MV precision, an MV component can indicate an integer number of sample grid positions or fractional number of sample grid positions for spatial displacement. For example, if the MV precision is ¼-sample MV precision, an MV component can indicate a spatial displacement of 0 samples, 0.25 samples, 0.5 samples, 0.75 samples, 1.0 samples, 1.25 samples, and so on. Some video codec standards and formats support switching of MV precision during encoding. Encoder-side decisions about which MV precision to use are not made effectively, however, in certain encoding scenarios.
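The relationship between MV precision and the displacements an MV component can represent can be sketched as follows. The sketch assumes that an MV component is stored as an integer counted in units of the MV precision (for example, units of ¼ sample for ¼-sample MV precision). That is a common representation, but it is an assumption here rather than something stated above. The names `PRECISION_DENOMINATOR`, `mv_component_to_samples` and `split_component` are hypothetical.

```python
from fractions import Fraction

# Number of MV units per sample for each precision (illustrative labels).
PRECISION_DENOMINATOR = {"integer": 1, "half": 2, "quarter": 4}


def mv_component_to_samples(stored_value, precision):
    """Convert a stored integer MV component into a displacement in samples."""
    return Fraction(stored_value, PRECISION_DENOMINATOR[precision])


def split_component(stored_value, precision):
    """Split a stored MV component into an integer-sample offset and a
    fractional phase (in MV units), using floor division so that the
    phase is never negative."""
    denom = PRECISION_DENOMINATOR[precision]
    return stored_value // denom, stored_value % denom
```

Under this assumption, stored values 0, 1, 2, 3, 4 and 5 at ¼-sample MV precision correspond to displacements of 0, 0.25, 0.5, 0.75, 1.0 and 1.25 samples. A nonzero fractional phase indicates that interpolation is needed for that component.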