Three-dimensional (3D) television has been a technology trend in recent years that aims to bring viewers a sensational viewing experience. Multi-view video is a technique to capture and render 3D video. Multi-view video is typically created by capturing a scene using multiple cameras simultaneously, where the cameras are positioned so that each camera captures the scene from one viewpoint. Multi-view video, with a large number of video sequences associated with the views, represents a massive amount of data. Accordingly, multi-view video requires a large storage space to store and/or a high bandwidth to transmit. Therefore, multi-view video coding techniques have been developed in the field to reduce the required storage space and transmission bandwidth. A straightforward approach may simply apply conventional video coding techniques to each single-view video sequence independently and disregard any correlation among different views. Such a straightforward approach would result in poor coding performance.
In order to improve multi-view video coding efficiency, multi-view video coding always exploits inter-view redundancy. The disparity between two views is caused by the locations and angles of the two respective cameras. Since all cameras capture the same scene from different viewpoints, multi-view video data contains a large amount of inter-view redundancy. To exploit the inter-view redundancy, coding tools utilizing a disparity vector (DV) have been developed for 3D-HEVC (High Efficiency Video Coding) and 3D-AVC (Advanced Video Coding). For example, the DV is used as a temporal inter-view motion vector candidate (TIVC) in advanced motion vector prediction (AMVP) and Merge modes. The DV is also used as a disparity inter-view motion vector candidate (DIVC) in AMVP and Merge modes. Furthermore, the DV is used for inter-view residual prediction (IVRP) and view synthesis prediction (VSP).
Furthermore, Illumination Compensation (IC) is a technique to reduce the intensity differences between views caused by the different light fields of two views captured by different cameras at different locations. In HTM, a linear IC model is disclosed by Liu et al. (“3D-CE2.h: Results of Illumination Compensation for Inter-View Prediction”, Joint Collaborative Team on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, 2nd Meeting: Shanghai, CN, 13-19 Oct. 2012, Document: JCT3V-B0045) to compensate for the illumination discrepancy between different views. Parameters in the IC model are estimated for each Prediction Unit (PU) using the nearest available reconstructed neighbouring pixels. Therefore, there is no need to transmit the IC parameters to the decoder. Whether to apply IC or not is decided at the coding unit (CU) level, and an IC flag is coded to indicate whether IC is enabled at the CU level. The flag is present only for the CUs that are coded using inter-view prediction. If IC is enabled for a CU and a PU within the CU is coded by temporal prediction (i.e., Inter prediction), the PU block is inferred to have IC disabled. The linear IC model used in inter-view prediction is shown in eqn. (1):
$$p(i,j) = a_{IC} \cdot r(i + dv_x,\; j + dv_y) + b_{IC}, \quad (i,j) \in PU_c \tag{1}$$
where $PU_c$ is the current PU, $(i, j)$ is the pixel coordinate in $PU_c$, $(dv_x, dv_y)$ is the disparity vector of $PU_c$, $p(i, j)$ is the prediction of $PU_c$, $r(\cdot,\cdot)$ is the reference picture of the PU from a neighbouring view, and $a_{IC}$ and $b_{IC}$ are parameters of the linear IC model.
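As an illustration only, the linear model of eqn. (1) can be sketched in a few lines of Python. The function name `ic_predict`, the `ref[i][j]` indexing convention, the integer-pel disparity vector and the sample values are all assumptions made for this sketch; the parameters a_IC and b_IC are passed in directly because their estimation from neighbouring pixels is not detailed here.

```python
def ic_predict(ref, pu_pixels, dv, a_ic, b_ic):
    """Apply p(i, j) = a_ic * r(i + dv_x, j + dv_y) + b_ic to each PU pixel.

    ref       -- reference picture from the neighbouring view, indexed as
                 ref[i][j] (hypothetical layout chosen for this sketch)
    pu_pixels -- iterable of (i, j) coordinates belonging to the current PU
    dv        -- disparity vector (dv_x, dv_y), integer-pel for simplicity
    a_ic/b_ic -- scale and offset of the linear IC model (given, not estimated)
    """
    dv_x, dv_y = dv
    return {(i, j): a_ic * ref[i + dv_x][j + dv_y] + b_ic
            for (i, j) in pu_pixels}


# Toy 3x3 reference picture and a 2x2 PU at the top-left corner.
ref = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]]
pu = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Each PU pixel is predicted from the reference pixel displaced by dv.
pred = ic_predict(ref, pu, dv=(1, 1), a_ic=2, b_ic=3)
```

With a_IC = 1 and b_IC = 0 the sketch reduces to plain disparity-compensated prediction, which is the case IC is meant to improve on when the two views differ in brightness.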
Moreover, in order to provide adaptive IC at the slice level, the encoder can decide whether IC should be applied to a current picture and transmit the decision to the decoder. A one-bit flag can be encoded in the slice header of the first slice to indicate whether IC is enabled for the first slice and its subsequent slices in the picture. An example of the IC decision process is as follows.
1) Form pixel intensity histograms of the current picture and the inter-view reference original picture.
2) Calculate the SAD between the two histograms.
3) If the SAD is over a threshold, the IC enable flag is set to 1.
4) Otherwise, the IC enable flag is set to 0.
The pixel intensity distributions of the current and inter-view reference pictures are represented by histograms for each colour, and the similarity of the two distributions is measured by the Sum of Absolute Differences (SAD) of the two histograms. The SAD is then compared with a threshold to determine whether to enable IC for the current picture. The threshold may be determined based on picture characteristics collected from underlying pictures or test pictures. When IC is disabled for a picture, the encoder has no need to determine whether to apply illumination compensation to the CUs in the current picture. No CU-level flags need to be transmitted to the decoder in this case. Accordingly, unnecessary IC decisions can be avoided on both the encoder and decoder sides.
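The four-step decision above can be sketched as follows. This is a minimal illustration rather than the HTM implementation: the function names, the 8-bit sample range, the flat pixel lists and the threshold value are assumptions, and only a single colour is handled (a full encoder would repeat the comparison for each colour).

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each intensity value (step 1)."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    return hist


def histogram_sad(cur_pixels, ref_pixels, levels=256):
    """Sum of Absolute Differences between the two histograms (step 2)."""
    h_cur = histogram(cur_pixels, levels)
    h_ref = histogram(ref_pixels, levels)
    return sum(abs(c - r) for c, r in zip(h_cur, h_ref))


def ic_enable_flag(cur_pixels, ref_pixels, threshold, levels=256):
    """Return 1 if the SAD is over the threshold, otherwise 0 (steps 3-4)."""
    return 1 if histogram_sad(cur_pixels, ref_pixels, levels) > threshold else 0


# Toy pictures given as flat lists of 8-bit samples (hypothetical data).
current = [10, 10, 20, 20]
inter_view_ref = [12, 12, 22, 22]   # same content, slightly brighter

sad = histogram_sad(current, inter_view_ref)
flag_shifted = ic_enable_flag(current, inter_view_ref, threshold=4)
flag_same = ic_enable_flag(current, current, threshold=4)
```

In the toy data the brighter reference moves every sample into a different histogram bin, so the SAD is large and IC is enabled; comparing a picture with itself gives a SAD of zero and leaves IC disabled.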
While IC can provide significant coding gain, it may cause a parsing dependency issue according to the current HEVC-based Test Model (HTM). According to the existing HTM, ic_flag is only signalled for inter CUs in which inter-view prediction is used. The parser has to check whether inter-view reference data is used. If inter-view reference data is used, the parser will parse ic_flag for the current CU. Accordingly, ic_flag should always be parsed if the reference list contains only inter-view reference pictures. On the other hand, it should never be parsed if the reference list contains only inter-time reference pictures. There is no parsing dependency in these two situations.
The parsing problem may arise when the reference list contains both inter-view and inter-time (i.e., temporal) reference pictures. If all PUs in the current CU are coded in non-Merge mode (e.g., Advanced Motion Vector Prediction (AMVP) mode), there is no parsing dependency since all the reference pictures used are explicitly signalled by reference indices for the non-Merge mode. However, according to the existing HTM, the reference picture used for a PU coded using Merge mode is not explicitly signalled. Instead, the reference index is derived from the selected merging candidate. Due to the pruning process in merging candidate list construction, the derived reference picture may depend on the Motion Vectors (MVs) of neighbouring blocks. Since MVs in neighbouring blocks may come from a collocated picture, the derived reference picture may depend on the collocated picture indirectly. If the collocated picture is damaged (e.g., due to transmission error), a parsing problem for ic_flag may occur.
FIGS. 1A and 1B illustrate an example of a parsing issue arising from indirect parsing dependency. In this example, reference picture Ref 0 and reference picture Ref 1 are inter-time and inter-view reference pictures respectively. The current CU is coded in 2N×2N Merge mode and the selected Merge candidate is indicated by Merge index 1. The reference indices associated with the first three candidates are 0, 0, and 1 in this example. The MVs derived from the first two candidates, denoted as MVa and MVb, are equal in this example, i.e., MVa = MVb as shown in FIG. 1A. In addition, MVb is obtained by Temporal Motion Vector Prediction (TMVP) from the collocated picture. In the merging candidate pruning process, the second possible candidate is removed from the candidate list since it is equal to the first one. This process results in candidate list 110 in FIG. 1B. Therefore, Merge index 1 refers to the third original candidate (before the second candidate is removed), which has an inter-view reference. As a result, ic_flag should be parsed for this CU if the collocated picture is correctly received at the decoder. However, if the collocated picture is damaged (e.g., due to transmission error), the candidate associated with the neighbouring block MVb may be decoded incorrectly. This will cause MVa ≠ MVb, and the second candidate will not be removed from the candidate list in this case. This results in candidate list 120 as shown in FIG. 1B. Therefore, Merge index 1 will refer to the second possible candidate in this candidate list, which has an inter-time reference. Consequently, ic_flag will not be parsed for this CU according to the existing HTM, and a parsing problem occurs.
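The scenario of FIGS. 1A and 1B can be reproduced with a small simulation. The sketch below is a simplification under stated assumptions: candidates are modelled as (MV, reference index) pairs, pruning removes any candidate identical to one already kept, and the MV values are invented for illustration. It is not the HTM candidate list construction, only a model of the dependency.

```python
INTER_TIME, INTER_VIEW = "inter-time", "inter-view"

# Ref 0 is an inter-time picture and Ref 1 an inter-view picture, as in the example.
REF_TYPE = {0: INTER_TIME, 1: INTER_VIEW}


def prune(candidates):
    """Remove candidates identical to one already in the list (simplified pruning)."""
    kept = []
    for cand in candidates:
        if cand not in kept:
            kept.append(cand)
    return kept


def parses_ic_flag(candidates, merge_idx):
    """Model the HTM rule: ic_flag is parsed only for an inter-view reference."""
    _mv, ref_idx = prune(candidates)[merge_idx]
    return REF_TYPE[ref_idx] == INTER_VIEW


# Candidates as ((mv_x, mv_y), ref_idx); the MV values are hypothetical.
mv_a = (4, 0)
intact = [(mv_a, 0), ((4, 0), 0), ((7, 0), 1)]    # MVb == MVa (collocated picture received)
damaged = [(mv_a, 0), ((9, 9), 0), ((7, 0), 1)]   # MVb corrupted, so MVb != MVa

list_110 = prune(intact)     # second candidate removed
list_120 = prune(damaged)    # nothing removed

parse_when_intact = parses_ic_flag(intact, merge_idx=1)
parse_when_damaged = parses_ic_flag(damaged, merge_idx=1)
```

The same Merge index selects an inter-view candidate in one list and an inter-time candidate in the other, so the encoder and a decoder with a damaged collocated picture disagree on whether ic_flag is present in the bitstream.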
Accordingly, it is desirable to develop error-resilient illumination compensation, where corresponding syntax parsing is more robust to errors. Furthermore, it is desirable that such error-resilient illumination compensation will not cause any noticeable impact on the system performance.