The International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 standard (hereinafter the “MPEG4/H.264 standard” or simply the “H.264 standard”) is the first international video coding standard to include a Weighted Prediction (WP) tool. The scalable video coding (SVC) standard, which is currently being developed as an amendment of the H.264 standard (and is thus also interchangeably referred to herein as the “H.264 standard”), also adopts weighted prediction. However, the H.264 standard does not specify which weights (base layer or enhancement layer) should be used when (scaled) base layer motion vectors are used as predictors.
Moreover, during inter-layer prediction, the enhancement layer can use the motion vectors (via motion_prediction_flag_lx[ ]) and residual data (via residual_prediction_flag) from the base layer as predictors. However, when different weighting parameters are used in the enhancement layer and base layer for a given macroblock, residual prediction is not practical due to the different weights used for the different layers of the same macroblock.
Weighted Prediction is supported in the Main, Extended, and High profiles of the H.264 standard. The use of WP is indicated in the picture parameter set for P and SP slices using the weighted_pred_flag field, and for B slices using the weighted_bipred_idc field. There are two WP modes, an explicit mode and an implicit mode. The explicit mode is supported in P, SP, and B slices. The implicit mode is supported only in B slices.
weighted_pred_flag equal to 0 specifies that weighted prediction shall not be applied to P and SP slices. weighted_pred_flag equal to 1 specifies that weighted prediction shall be applied to P and SP slices.
weighted_bipred_idc equal to 0 specifies that the default weighted prediction shall be applied to B slices. weighted_bipred_idc equal to 1 specifies that explicit weighted prediction shall be applied to B slices. weighted_bipred_idc equal to 2 specifies that implicit weighted prediction shall be applied to B slices. The value of weighted_bipred_idc is in the range of 0 to 2, inclusive.
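The flag semantics above can be summarized as a small decision function. The following is a minimal sketch in Python, not decoder code: the names `WPMode` and `select_wp_mode` are hypothetical, slice types are passed as plain strings for illustration, and "default" here simply means that no explicit or implicit weighting is applied.

```python
from enum import Enum


class WPMode(Enum):
    """Hypothetical labels for the weighted prediction mode of a slice."""
    DEFAULT = "default"    # no explicit or implicit weighting applied
    EXPLICIT = "explicit"  # WP parameters may be coded in the slice header
    IMPLICIT = "implicit"  # WP parameters derived from picture distances


def select_wp_mode(slice_type: str, weighted_pred_flag: int,
                   weighted_bipred_idc: int) -> WPMode:
    """Map the two fields described above to a WP mode for one slice."""
    if slice_type in ("P", "SP"):
        # weighted_pred_flag governs P and SP slices only.
        return WPMode.EXPLICIT if weighted_pred_flag == 1 else WPMode.DEFAULT
    if slice_type == "B":
        # weighted_bipred_idc governs B slices; valid range is 0 to 2.
        modes = {0: WPMode.DEFAULT, 1: WPMode.EXPLICIT, 2: WPMode.IMPLICIT}
        if weighted_bipred_idc not in modes:
            raise ValueError("weighted_bipred_idc must be in the range 0 to 2")
        return modes[weighted_bipred_idc]
    raise ValueError("weighted prediction applies only to P, SP, and B slices")
```

Note that the implicit mode is reachable only through the B-slice branch, which mirrors the statement that implicit mode is supported only in B slices.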
A single weighting factor and offset are associated with each reference index for each color component in each slice. In explicit mode, these WP parameters may be coded in the slice header. In implicit mode, these WP parameters are derived based only on the relative distances between the current picture and its reference pictures.
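To make the two modes concrete, the following Python sketch shows how a weight and offset might be applied to a single predicted sample in explicit mode, and how a pair of bi-prediction weights might be derived from picture order counts in implicit mode. It is a simplified illustration modeled on the general form of the H.264 derivation, not a normative implementation: the function names are hypothetical, 8-bit samples are assumed by default, long-term reference pictures are not handled, and the standard itself should be consulted for the exact process.

```python
def clip3(lo: int, hi: int, x: int) -> int:
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def trunc_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero (unlike Python's //)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def explicit_weighted_sample(pred: int, weight: int, offset: int,
                             log_wd: int, bit_depth: int = 8) -> int:
    """Apply one weight/offset pair to a uni-directionally predicted sample.

    log_wd is the log2 of the weight denominator (an assumed parameter name).
    """
    if log_wd >= 1:
        value = ((pred * weight + (1 << (log_wd - 1))) >> log_wd) + offset
    else:
        value = pred * weight + offset
    return clip3(0, (1 << bit_depth) - 1, value)


def implicit_bipred_weights(poc_cur: int, poc_ref0: int, poc_ref1: int):
    """Derive (w0, w1) from picture order count distances only.

    Weights are on a scale where w0 + w1 == 64; (32, 32) is the fallback.
    """
    tb = clip3(-128, 127, poc_cur - poc_ref0)
    td = clip3(-128, 127, poc_ref1 - poc_ref0)
    if td == 0:
        return 32, 32
    tx = trunc_div(16384 + abs(td) // 2, td)
    dist_scale = clip3(-1024, 1023, (tb * tx + 32) >> 6)
    w1 = dist_scale >> 2
    if w1 < -64 or w1 > 128:
        return 32, 32
    return 64 - w1, w1
```

In this sketch, a current picture that is temporally closer to its list 0 reference receives a larger list 0 weight, which is the behavior the implicit mode is intended to provide without transmitting any weights.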
For each macroblock or macroblock partition, the weighting parameters applied are based on a reference picture index (or indices in the case of bi-prediction) of the current macroblock or macroblock partition. The reference picture indices are either coded in the bitstream or may be derived, e.g., for skipped or direct mode macroblocks. The use of the reference picture index to signal which weighting parameters to apply is bitrate efficient, as compared to requiring a weighting parameter index in the bitstream, since the reference picture index is already available based on the other required bitstream fields.
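The lookup described above can be pictured as a slice-level table keyed by reference picture list and reference index. The following Python sketch is illustrative only: `WPParams`, `WPTable`, `params_for_partition`, and the example values are all hypothetical, and only a single color component is modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class WPParams:
    """One weight/offset pair for one color component (hypothetical type)."""
    weight: int
    offset: int


# Hypothetical slice-level table: (reference list, reference index) -> params.
WPTable = Dict[Tuple[int, int], WPParams]


def params_for_partition(table: WPTable,
                         ref_idx_l0: Optional[int],
                         ref_idx_l1: Optional[int]) -> List[WPParams]:
    """Select WP parameters using only the partition's reference indices.

    Pass None for a list that the partition does not use. Bi-predicted
    partitions supply both indices and receive two parameter sets.
    """
    selected = []
    for list_id, ref_idx in ((0, ref_idx_l0), (1, ref_idx_l1)):
        if ref_idx is not None:
            selected.append(table[(list_id, ref_idx)])
    return selected


# Made-up example values for illustration only.
example_table: WPTable = {
    (0, 0): WPParams(weight=32, offset=0),
    (0, 1): WPParams(weight=24, offset=4),
    (1, 0): WPParams(weight=40, offset=-2),
}
```

Because the reference indices are already present (or derivable) for every inter-coded partition, no additional index needs to be carried in the bitstream to make this selection.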
Many different methods of scalability, including SNR scalability, spatial scalability, temporal scalability, and fine grain scalability, have been widely studied and standardized in the scalability profiles of the MPEG-2 and H.264 standards, or are currently being developed as an amendment of the H.264 standard.
For spatial, temporal and SNR scalability, a large degree of inter-layer prediction is incorporated. Intra and inter macroblocks can be predicted using the corresponding signals of previous layers. Moreover, the motion description of each layer can be used for a prediction of the motion description for following enhancement layers. These techniques fall into three categories: inter-layer intra texture prediction, inter-layer motion prediction and inter-layer residue prediction (via residual_prediction_flag).
In JSVM2.0, an enhancement layer macroblock can exploit inter-layer motion prediction using scaled base layer motion data, using either “BASE_LAYER_MODE” or “QPEL_REFINEMENT_MODE”, as in the case of dyadic (two-layer) spatial scalability. In addition, in macroblock (or sub-macroblock) prediction mode, the motion vector predictor can be chosen from either a base layer motion vector or an enhancement layer motion vector from a spatial neighbor, via motion_prediction_flag_lx[ ]. motion_prediction_flag_lx[ ] equal to 1 specifies that the (scaled) base layer motion vectors are used as motion vector predictors. motion_prediction_flag_lx[ ] equal to 0 specifies that enhancement layer motion vectors from spatial neighbors are used as motion vector predictors. If the enhancement layer and its previous layer have different pred_weight_table( ) values, for the case where motion_prediction_flag_lx[ ] is equal to 1, the H.264 standard does not specify which set of weights is to be used for the enhancement layer.
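The predictor choice and the resulting gap can be sketched as follows. This Python fragment is a hedged illustration rather than JSVM code: the function names are hypothetical, motion vectors are modeled as simple (x, y) integer tuples, a scale factor of 2 is assumed for the spatial scalability case, and each pred_weight_table( ) is reduced to a plain dictionary.

```python
from typing import Dict, Tuple

MotionVector = Tuple[int, int]


def select_mv_predictor(motion_prediction_flag: int,
                        base_layer_mv: MotionVector,
                        neighbor_mv: MotionVector,
                        scale: int = 2) -> MotionVector:
    """Choose the motion vector predictor for an enhancement layer block.

    flag == 1: use the (scaled) base layer motion vector.
    flag == 0: use the enhancement layer vector from a spatial neighbor.
    The scale factor of 2 is an assumption for illustration.
    """
    if motion_prediction_flag == 1:
        return (base_layer_mv[0] * scale, base_layer_mv[1] * scale)
    return neighbor_mv


def weight_choice_is_unspecified(motion_prediction_flag: int,
                                 base_weight_table: Dict[int, Tuple[int, int]],
                                 enh_weight_table: Dict[int, Tuple[int, int]]) -> bool:
    """Flag the situation described above: base layer motion is reused while
    the two layers carry different weight tables, so it is unclear which
    set of weights applies."""
    return motion_prediction_flag == 1 and base_weight_table != enh_weight_table
```

The second function does not resolve the ambiguity; it only identifies the condition under which a choice between base layer and enhancement layer weights would have to be made.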
In JSVM2.0, an enhancement layer macroblock can exploit inter-layer residue prediction using the (upsampled) base layer residue via the residual_prediction_flag. residual_prediction_flag equal to 1 specifies that the residual signal is predicted from the (upsampled) reconstructed residual signal of the base macroblock or sub-macroblock. residual_prediction_flag equal to 0 specifies that the residual signal is not predicted. However, the H.264 standard does not consider the fact that the two layers could have different sets of pred_weight_table( ) values. As a result, the use of residual prediction may result in lower coding efficiency when the enhancement layer uses a different set of weights from the base layer (for a given macroblock) because, in such a case, the residual_prediction_flag will seldom be set to 1. Transmitting the residual_prediction_flag in this case would only waste bits.
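A small numeric illustration may help show why mismatched weights undermine residual prediction. The Python sketch below is a deliberately simplified model, not the JSVM process: both layers are assumed to share the same resolution (so no upsampling is modeled), the same reference samples, and floating-point weights, and the function names and sample values are hypothetical.

```python
from typing import List


def residual(orig: List[float], ref: List[float],
             weight: float, offset: float = 0.0) -> List[float]:
    """Residual left after weighted prediction: orig - (weight * ref + offset)."""
    return [o - (weight * r + offset) for o, r in zip(orig, ref)]


def residual_prediction_cost(orig: List[float], ref: List[float],
                             w_base: float, w_enh: float) -> float:
    """Sum of absolute differences between the enhancement layer residual
    and the base layer residual used as its predictor."""
    base_res = residual(orig, ref, w_base)
    enh_res = residual(orig, ref, w_enh)
    return sum(abs(e - b) for e, b in zip(enh_res, base_res))


# Made-up samples: the current picture is the reference faded to half brightness.
reference = [100.0, 120.0, 140.0, 160.0]
original = [50.0, 60.0, 70.0, 80.0]

same_weights_cost = residual_prediction_cost(original, reference, 0.5, 0.5)
diff_weights_cost = residual_prediction_cost(original, reference, 1.0, 0.5)
```

In this toy case, `same_weights_cost` is zero because the base layer residual predicts the enhancement layer residual exactly, whereas `diff_weights_cost` is large because the base layer residual carries the effect of a weight the enhancement layer does not use. Under such conditions an encoder would be expected to leave residual prediction off, which is why signaling the flag would tend to cost bits without benefit.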