Video data is generally processed and transferred in the form of bit streams. Typical video compression encoders gain much of their compression efficiency by forming a reference picture prediction of a picture or macroblock to be encoded, and encoding the difference between the current picture and the prediction. The more closely the prediction is correlated with the current picture, the fewer bits are needed to compress that picture, which increases the efficiency of the process. Thus, it is desirable to form the best possible reference picture prediction.
Interblock (“inter”) and intrablock (“intra”) coding are commonly used in video compression standards. Generally, an encoder makes an inter/intra coding decision for each macroblock based on coding efficiency and subjective quality considerations. Some partitions (e.g., 16×8, 8×16 or 8×8 sub-blocks) of a 16×16 macroblock, for example, might be more efficiently coded using intra coding while other partitions of the same macroblock might be more efficiently coded using inter coding.
Because the decision is made per macroblock, however, each individual macroblock is coded either as intra, i.e., using only spatial correlation, or as inter, i.e., using temporal correlation from previously coded frames. Inter coding is typically used for macroblocks that are well predicted from previous frames, and intra coding is generally used for macroblocks that are not well predicted from previous frames, or for macroblocks with low spatial activity.
The JVT video compression standard, which is also known as H.264 and MPEG AVC, uses tree-structured hierarchical macroblock partitions. Inter-coded 16×16 pixel macroblocks may be broken into macroblock partitions, of sizes 16×8, 8×16, or 8×8. 8×8 macroblock partitions are also known as sub-macroblocks. Sub-macroblocks may also be broken into sub-macroblock partitions, of sizes 8×4, 4×8, and 4×4. An encoder may select how to divide the macroblock into partitions and sub-macroblock partitions based on the characteristics of a particular macroblock in order to maximize compression efficiency and subjective quality.
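The tree-structured partitioning described above can be sketched as follows. This is a minimal illustration, not encoder code: the function names `split` and `partition_macroblock` are hypothetical, and the unsplit 16×16 and 8×8 shapes are included on the assumption that leaving a block whole is always an option.

```python
# Partition shapes (width, height) at each level of the tree.
# The unsplit shapes (16, 16) and (8, 8) are assumed to be permitted.
MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
SUB_MACROBLOCK_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]


def split(width, height, part_w, part_h):
    """Return (x, y, w, h) rectangles tiling a width x height block."""
    return [(x, y, part_w, part_h)
            for y in range(0, height, part_h)
            for x in range(0, width, part_w)]


def partition_macroblock(mb_shape, sub_shapes=None):
    """Tile a 16x16 macroblock into partitions.

    mb_shape   -- one of MACROBLOCK_PARTITIONS
    sub_shapes -- used only when mb_shape is (8, 8): four entries from
                  SUB_MACROBLOCK_PARTITIONS, one per sub-macroblock
    """
    if mb_shape not in MACROBLOCK_PARTITIONS:
        raise ValueError("unsupported macroblock partition: %r" % (mb_shape,))
    parts = split(16, 16, *mb_shape)
    if mb_shape != (8, 8):
        return parts

    sub_shapes = sub_shapes or [(8, 8)] * 4
    if len(sub_shapes) != 4:
        raise ValueError("expected one shape per sub-macroblock (4 total)")

    blocks = []
    for (x0, y0, _, _), shape in zip(parts, sub_shapes):
        if shape not in SUB_MACROBLOCK_PARTITIONS:
            raise ValueError("unsupported sub-macroblock partition: %r" % (shape,))
        # Offset each sub-macroblock partition by its sub-macroblock origin.
        blocks.extend((x0 + x, y0 + y, w, h) for x, y, w, h in split(8, 8, *shape))
    return blocks
```

For example, `partition_macroblock((8, 8), [(8, 8), (8, 4), (4, 8), (4, 4)])` splits each of the four sub-macroblocks differently, which is the kind of per-macroblock choice an encoder would make based on content.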
Multiple reference pictures may be used for Inter prediction, with a reference picture index coded to indicate which of the multiple reference pictures is used. In P pictures (or P slices), only single directional prediction is used, and the allowable reference pictures are managed in list 0. In B pictures (or B slices), two lists of reference pictures are managed, list 0 and list 1. In B pictures (or B slices), single directional prediction using either list 0 or list 1 is allowed, or bi-prediction using both list 0 and list 1 is allowed. When bi-prediction is used, the list 0 and the list 1 predictors are averaged together to form a final predictor.
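The averaging step for bi-prediction can be illustrated with a short sketch. The function name `bi_predict` is hypothetical, predictors are modeled as flat lists of integer samples, and rounding the average to the nearest integer (the `+ 1` before the shift) is an assumption of this sketch.

```python
def bi_predict(pred_l0, pred_l1):
    """Average the list 0 and list 1 predictors sample by sample.

    Both inputs are equal-length sequences of integer sample values.
    The +1 rounds the average to nearest rather than truncating
    (an assumption made for this sketch).
    """
    if len(pred_l0) != len(pred_l1):
        raise ValueError("predictors must have the same number of samples")
    return [(a + b + 1) >> 1 for a, b in zip(pred_l0, pred_l1)]
```

With single directional prediction, only one of the two predictors would be used directly and no averaging takes place.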
Each macroblock partition may have independent reference picture indices, prediction type (e.g., list 0, list 1, bi-prediction), and an independent motion vector. Each sub-macroblock partition may have independent motion vectors, but all sub-macroblock partitions in the same sub-macroblock use the same reference picture index and prediction type.
It was proposed that intra prediction could be used for some of the partitions of an inter-coded macroblock. Because of complexity concerns, this flexibility was ultimately disallowed, and intra-coding mode is not allowed for individual macroblock partitions under the current standards. Some of the increased complexity in supporting both inter and intra coded partitions inside the same macroblock is due to the intra spatial directional prediction used in the JVT standard. Disallowing mixed inter/intra coding inside the same macroblock can hurt coding efficiency and especially subjective quality. For some blocks in an image, intra coding is more efficient than inter coding.
The Main and Extended profiles of the JVT standard provide a tool for weighted prediction. When weighted prediction is in use, a weighting factor and an offset are applied to inter predictions. For single directional prediction, the weighted predictor is formed as:

SampleP = Clip1(((SampleP0·W0 + 2^(LWD−1)) >> LWD) + O0);

and for bi-directional prediction, the weighted predictor is formed as:

SampleP = Clip1(((SampleP0·W0 + SampleP1·W1 + 2^LWD) >> (LWD+1)) + ((O0 + O1 + 1) >> 1));

where W0 and O0 are the list 0 reference picture weighting factor and offset, respectively, W1 and O1 are the list 1 reference picture weighting factor and offset, respectively, and LWD is the log weight denominator rounding factor. SampleP0 and SampleP1 are the list 0 and list 1 initial predictors, and SampleP is the weighted predictor. Weighting factors and offsets are optionally coded in the slice header and are associated with particular reference picture indices.
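The two weighted prediction formulas can be written out as a minimal per-sample sketch. The function names are hypothetical, 8-bit samples are assumed for Clip1, and the single directional form is restricted to LWD ≥ 1 so that the rounding term 2^(LWD−1) is an integer; the case LWD = 0 is outside what this sketch covers.

```python
def clip1(value, bit_depth=8):
    """Clamp a value to the valid sample range (8-bit assumed by default)."""
    return max(0, min((1 << bit_depth) - 1, value))


def weighted_uni(sample_p0, w0, o0, lwd):
    """Single directional weighted predictor for one sample."""
    if lwd < 1:
        raise ValueError("this sketch assumes lwd >= 1")
    rounding = 1 << (lwd - 1)                      # 2^(LWD-1)
    return clip1(((sample_p0 * w0 + rounding) >> lwd) + o0)


def weighted_bi(sample_p0, sample_p1, w0, w1, o0, o1, lwd):
    """Bi-directional weighted predictor for one sample."""
    if lwd < 0:
        raise ValueError("lwd must be non-negative")
    rounding = 1 << lwd                            # 2^LWD
    weighted = (sample_p0 * w0 + sample_p1 * w1 + rounding) >> (lwd + 1)
    offset = (o0 + o1 + 1) >> 1                    # rounded mean of the offsets
    return clip1(weighted + offset)
```

With LWD = 5, a weight of 32 corresponds to a factor of 1.0, so `weighted_uni(100, 32, 0, 5)` leaves the sample at 100, while `weighted_uni(100, 48, 10, 5)` scales it by 1.5 and adds 10 to give 160. When both weights equal 2^LWD and both offsets are zero, the bi-directional form reduces to a rounded average of the two predictors.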
The relevant syntax elements in the JVT standard are:
luma_log_weight_denom, chroma_log_weight_denom, luma_weight_l0, chroma_weight_l0, luma_offset_l0, chroma_offset_l0, luma_weight_l1, chroma_weight_l1, luma_offset_l1, and chroma_offset_l1.
In addition, more than one reference picture index can be associated with a particular reference picture store by using reference picture reordering, which allows more than one weighting factor to be used while predicting from the same reference picture store.
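As a rough illustration of this point, the sketch below models a reordered list 0 in which two reference picture indices point at the same picture store but carry different weighting factors and offsets. All names (`RefEntry`, `LIST0`, `STORES`, `predict_sample`) and all numeric values are hypothetical, chosen only to show the mapping; 8-bit samples and LWD = 5 are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefEntry:
    picture_store: int  # which stored reference picture this index refers to
    weight: int         # weighting factor tied to this reference picture index
    offset: int         # offset tied to this reference picture index


LWD = 5  # assumed log weight denominator; a weight of 32 means a factor of 1.0

# Hypothetical list 0 after reordering: indices 0 and 1 share picture store 0.
LIST0 = [
    RefEntry(picture_store=0, weight=32, offset=0),
    RefEntry(picture_store=0, weight=24, offset=8),
    RefEntry(picture_store=1, weight=32, offset=0),
]

# Toy picture stores holding a few 8-bit samples each (made-up values).
STORES = {0: [100, 120], 1: [90, 80]}


def predict_sample(ref_idx, position):
    """Weighted single directional prediction using the entry at ref_idx."""
    entry = LIST0[ref_idx]
    sample = STORES[entry.picture_store][position]
    value = ((sample * entry.weight + (1 << (LWD - 1))) >> LWD) + entry.offset
    return max(0, min(255, value))
```

Here `predict_sample(0, 0)` and `predict_sample(1, 0)` read the same stored sample but apply different weights, so an encoder could choose between two weightings of one reference picture simply by coding a different reference picture index.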
The Joint Video Team (“JVT”) video compression standard explicitly supports 16×16 pixel macroblocks being divided into smaller sized macroblock partitions for inter coding, but does not support inter coding of some partitions of a macroblock and intra coding of other partitions of the same macroblock.