3D video coding has been developed for conveying video data of multiple views simultaneously captured by multiple cameras. Since the cameras capture the same scene from different viewpoints, multi-view video data contains a large amount of inter-view redundancy. To exploit the inter-view redundancy, various coding tools utilizing disparity vectors (DVs) have been integrated into conventional 3D-HEVC (High Efficiency Video Coding) or 3D-AVC (Advanced Video Coding) codecs, as follows.
Disparity-Compensated Prediction (DCP)
To share the previously encoded texture information of reference views, the well-known technique of disparity-compensated prediction (DCP) has been added as an alternative to motion-compensated prediction (MCP). MCP refers to an inter-picture prediction that uses already coded pictures of the same view in a different access unit (i.e., at a different time), while DCP refers to an inter-picture prediction that uses already coded pictures of other views in the same access unit, as illustrated in FIG. 1. The vector (110) used for DCP is termed the disparity vector (DV), which is analogous to the motion vector (MV) used in MCP. FIG. 1 illustrates three MVs (120, 130 and 140) associated with MCP. Moreover, the DV of a DCP block can also be predicted by a disparity vector predictor (DVP) candidate derived from neighboring blocks or temporal collocated blocks that also use inter-view reference pictures.
Inter-View Motion Prediction (IVMP)
To share the previously encoded motion information of reference views, inter-view motion prediction (IVMP) is employed. To derive candidate motion parameters for a current block in a dependent view, a DV for the current block is first derived, and then the prediction block in the already coded picture in the reference view is located by adding the DV to the location of the current block. If the prediction block is coded using MCP, the associated motion parameters can be used as candidate motion parameters for the current block in the current view. The derived DV can also be directly used as an inter-view DV Merge candidate for the current PU. It is noted that, in the current draft standard, when the derived DV is stored as an inter-view DV Merge candidate, the associated reference index refers to the first inter-view reference picture in each reference picture list. This may introduce misalignment between the derived DV and the associated reference picture, since the inter-view reference picture pointed to by the derived DV may not be the same as the picture identified by the associated reference index, i.e., the first inter-view reference picture in each reference picture list.
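The block-location step described above can be sketched as follows. This is a minimal illustration, not the normative derivation: the `Block` record, the dictionary keyed by block origin, and the 8-sample storage grid are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Block:
    """Hypothetical record of how a block in the reference view was coded."""
    uses_mcp: bool
    motion: Optional[Tuple[int, int, int]] = None  # (mv_x, mv_y, ref_idx), illustrative


def ivmp_candidate(cur_pos: Tuple[int, int],
                   dv: Tuple[int, int],
                   ref_view_blocks: Dict[Tuple[int, int], Block],
                   grid: int = 8) -> Optional[Tuple[int, int, int]]:
    """Return candidate motion parameters for the current block, or None."""
    # Locate the prediction block by adding the DV to the current position.
    x = cur_pos[0] + dv[0]
    y = cur_pos[1] + dv[1]
    # Snap to the grid on which the reference view's blocks are stored
    # (the grid size is an assumption of this sketch).
    key = (x // grid * grid, y // grid * grid)
    blk = ref_view_blocks.get(key)
    # Motion parameters are reused only if the located block is MCP coded.
    if blk is not None and blk.uses_mcp:
        return blk.motion
    return None
```

For example, with a block at `(16, 8)` in the reference view coded using MCP, a current block at `(32, 8)` and a DV of `(-12, 0)` land inside that block and inherit its motion parameters; if the located block is not MCP coded, no candidate is produced.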
Inter-View Residual Prediction (IVRP)
To share the previously encoded residual information of reference views, the residual signal for the current block can be predicted from the residual signal of the corresponding block in a reference view. The corresponding block in the reference view is located by a DV.
View Synthesis Prediction (VSP)
View synthesis prediction (VSP) is a technique to remove inter-view redundancies among video signals from different viewpoints, in which a synthesized signal is used as a reference to predict a current picture.
In the 3D-HEVC test model HTM-7.0, there exists a process to derive a disparity vector predictor, known as NBDV (Neighboring Block Disparity Vector). The disparity vector derived from NBDV is then used to fetch a depth block in the depth image of the reference view. The fetched depth block has the same size as the current prediction unit (PU), and it is used to perform backward warping for the current PU.
In addition, the warping operation may be performed at sub-PU level precision, such as 2×2 or 4×4 blocks. For each sub-PU block, a maximum depth value is selected and converted to a DV, which is then used for warping all the pixels in the sub-PU block. VSP is applied only for coding the texture component.
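The per-sub-PU conversion can be sketched as below. The linear mapping in `depth_to_disparity` and its default parameters are hypothetical placeholders (a real codec derives the conversion from camera parameters), and taking the maximum over every sample of the sub-PU is one reading of "a maximum depth value is selected".

```python
def depth_to_disparity(depth, scale=4, offset=0, shift=4):
    """Hypothetical linear depth-to-disparity mapping (illustrative parameters)."""
    return (scale * depth + offset) >> shift


def sub_pu_disparities(depth_block, sub=4):
    """Return one horizontal disparity per sub-PU block of the fetched depth block.

    depth_block is a list of rows of integer depth samples with the same
    size as the current PU; sub is the sub-PU width and height.
    """
    height, width = len(depth_block), len(depth_block[0])
    result = []
    for y in range(0, height, sub):
        row = []
        for x in range(0, width, sub):
            # Select the maximum depth value inside this sub-PU block.
            max_depth = max(
                depth_block[j][i]
                for j in range(y, min(y + sub, height))
                for i in range(x, min(x + sub, width))
            )
            # Convert it to the DV used to warp every pixel of the sub-PU.
            row.append(depth_to_disparity(max_depth))
        result.append(row)
    return result
```

Every pixel in a sub-PU shares the single disparity computed for it, so a smaller `sub` tracks depth edges more closely at the cost of more conversions.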
In the current implementation, VSP is added as a new merging candidate to signal its use. In this way, a VSP block may be coded in Skip mode without transmitting any residual, or as a Merge block with coded residual information. In both modes, the motion information is derived from the VSP candidate.
When a picture is coded as a B picture and the current block is signaled as VSP predicted, the following steps are applied to determine the prediction direction of VSP:
    Obtain the view index refViewIdxNBDV of the derived disparity vector from NBDV;
    Obtain the reference picture list RefPicListNBDV (either RefPicList0 or RefPicList1) that is associated with the reference picture with view index refViewIdxNBDV;
    Check the availability of an inter-view reference picture with view index refViewIdx which is not equal to refViewIdxNBDV in the reference picture list other than RefPicListNBDV;
        If such a different inter-view reference picture is found, bi-directional VSP is applied. The depth block from view index refViewIdxNBDV is used as the current block's depth information (in case of texture-first coding order), and the two different inter-view reference pictures (one from each reference picture list) are accessed via the backward warping process and further weighted to achieve the final backward VSP predictor; and
        Otherwise, uni-directional VSP is applied with RefPicListNBDV as the reference picture list for prediction.
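The direction decision for B pictures can be sketched as follows. Representing each reference picture list by the view indices of its inter-view reference pictures is an assumption of this example, as is returning `None` when neither list contains view refViewIdxNBDV, a case the steps above do not cover.

```python
def vsp_direction(ref_view_idx_nbdv, list0_views, list1_views):
    """Decide between bi-directional and uni-directional VSP.

    list0_views and list1_views hold the view indices of the inter-view
    reference pictures in RefPicList0 and RefPicList1 (hypothetical
    representation). Returns (mode, X, other_view), where X identifies
    RefPicListNBDV, or None if no list contains ref_view_idx_nbdv.
    """
    lists = (list0_views, list1_views)
    # Find RefPicListNBDV: the list holding the picture with view index
    # refViewIdxNBDV.
    x = next((i for i, views in enumerate(lists)
              if ref_view_idx_nbdv in views), None)
    if x is None:
        return None  # not covered by the described steps (assumption)
    # Look in the other list for an inter-view reference picture whose
    # view index differs from refViewIdxNBDV.
    other = next((v for v in lists[1 - x] if v != ref_view_idx_nbdv), None)
    if other is not None:
        return ("bi", x, other)
    return ("uni", x, None)
```

With view 0 in RefPicList0 and view 1 in RefPicList1, an NBDV pointing to view 0 yields bi-directional VSP; if both lists contain only view 0, the result is uni-directional VSP from the list that was found first.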
When a picture is coded as a P picture and the current prediction block uses VSP, uni-directional VSP is applied. It is noted that the VSP process in the current 3D-HEVC draft differs from that in the HTM-7.0 software. The two VSP operations are described as follows.
VSP in 3D-HEVC Working Draft (HTM-7.0)
Derivation process for a VSP Merge candidate:
    Check whether a reference picture list RefPicListX (either RefPicList0 or RefPicList1) includes an inter-view reference picture RefPicListX[refIdxLX] whose ViewIdx is equal to refViewIdxNBDV. If so, predFlagLXVSP is set to 1, mvLXVSP is set to NBDV and refIdxLXVSP is set to refIdxLX;
    Check whether an inter-view reference picture exists in RefPicListY (Y=1−X) with ViewIdx of RefPicListY[refIdxLY] different from refViewIdxNBDV. If so, predFlagLYVSP is set to 1, mvLYVSP is set to NBDV and refIdxLYVSP is set to refIdxLY.
Decoding process for inter prediction samples when VSP mode flag is true:
    For X in the range from 0 to 1:
        If predFlagLXVSP is equal to 1, the inter-view reference picture or pictures of RefPicListX[refIdxLX] are accessed via the backward warping process.
View synthesis prediction process:
    For X in the range from 0 to 1:
        The texture picture is derived by RefPicListX[refIdxLX]; and
        The disparity vector mvLX is then used to fetch a depth block in the depth image of the reference view with view index refViewIdxNBDV of the current CU.
VSP in HTM-7.0
Derivation process for a view synthesis prediction Merge candidate:
    predFlagL0VSP is set to 1, and predFlagL1VSP is set to 0;
    If there exists an inter-view reference picture in RefPicList0 and the reference picture RefPicList0[refIdxL0] is the first inter-view reference picture in RefPicList0, mvL0VSP is set to NBDV, and refIdxL0VSP is set to refIdxL0; and
    Similarly, if there exists an inter-view reference picture in RefPicList1 and the reference picture RefPicList1[refIdxL1] is the first inter-view reference picture in RefPicList1, mvL1VSP is set to NBDV, and refIdxL1VSP is set to refIdxL1.
Decoding process for inter prediction samples when VSP mode flag is true:
    RefPicListX is the first reference picture list which includes an inter-view reference picture, refIdxLXVSP is associated with the first inter-view reference picture in RefPicListX, and the backward warping process is then performed in this direction; and
    For a B-slice, an additional test determines whether B-VSP is performed. The additional test checks whether an inter-view reference picture in RefPicListY (Y=1−X) has a view index different from the ViewIdx of RefPicListX[refIdxLXVSP].
View synthesis prediction process:
    RefPicListX is the first reference picture list which includes an inter-view reference picture, refIdxLXVSP is associated with the first inter-view reference picture in RefPicListX, and the disparity vector mvLX is then used to fetch a depth block in the depth image RefPicListX[refIdxLXVSP]; and
    The reference index of the texture picture is derived using a process similar to that specified in the decoding process for inter prediction samples when VSP mode flag is true.
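The two Merge-candidate derivations can be put side by side in a short sketch. The data layout is an assumption made for illustration: each reference picture list is a Python list in which an inter-view reference picture is represented by its view index and a temporal reference picture by `None`. The function names and the returned dictionary are likewise hypothetical; only the flag, vector and index assignments follow the descriptions above.

```python
def first_index(ref_list, accept):
    """Index of the first entry of ref_list satisfying accept, or None."""
    return next((i for i, v in enumerate(ref_list) if accept(v)), None)


def vsp_merge_working_draft(nbdv, ref_view_idx_nbdv, ref_lists):
    """Working-draft style: list X must contain view refViewIdxNBDV."""
    cand = {"predFlag": [0, 0], "mv": [None, None], "refIdx": [None, None]}
    for x in (0, 1):
        idx_x = first_index(ref_lists[x], lambda v: v == ref_view_idx_nbdv)
        if idx_x is None:
            continue
        cand["predFlag"][x], cand["mv"][x], cand["refIdx"][x] = 1, nbdv, idx_x
        # The other list contributes only if it holds a different view.
        y = 1 - x
        idx_y = first_index(
            ref_lists[y], lambda v: v is not None and v != ref_view_idx_nbdv)
        if idx_y is not None:
            cand["predFlag"][y], cand["mv"][y], cand["refIdx"][y] = 1, nbdv, idx_y
        break
    return cand


def vsp_merge_htm7(nbdv, ref_lists):
    """HTM-7.0 style: always use the first inter-view picture of each list."""
    cand = {"predFlag": [1, 0], "mv": [None, None], "refIdx": [None, None]}
    for x in (0, 1):
        idx = first_index(ref_lists[x], lambda v: v is not None)
        if idx is not None:
            cand["mv"][x], cand["refIdx"][x] = nbdv, idx
    return cand


# RefPicList0: temporal, view 1, view 0; RefPicList1: temporal, view 0.
ref_lists = ([None, 1, 0], [None, 0])
wd = vsp_merge_working_draft((-7, 0), 0, ref_lists)
htm = vsp_merge_htm7((-7, 0), ref_lists)
```

In this example, with NBDV pointing to view 0, the working-draft style candidate selects reference index 2 in list 0 (view 0), whereas the HTM-7.0 style candidate selects reference index 1 (view 1), because it always takes the first inter-view reference picture regardless of the view the DV refers to.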
The DV is critical in 3D video coding, since inter-view motion prediction, inter-view residual prediction, disparity-compensated prediction (DCP) and view synthesis prediction (VSP) all need the DV to indicate the correspondence between inter-view pictures. The DV derivation utilized in the current test model of 3D-HEVC (HTM-7.0) is described as follows.
DV Derivation in HTM-7.0
Currently, except for the DV for DCP, the DVs used for the other coding tools are derived using either the neighboring block disparity vector (NBDV) or the depth oriented neighboring block disparity vector (DoNBDV) as described below.
Neighboring Block Disparity Vector (NBDV)
First, the temporal neighboring blocks located in a temporal collocated picture, shown in FIG. 2A, are scanned in the following order: the right-bottom block (RB) followed by the center block (BCTR) of the collocated block (210). Once a block is identified as having a DV, the checking process is terminated. It is noted that, in the current design, two collocated pictures are checked. When a DV is derived from a temporal neighboring block of a collocated picture, the DV is directly used as the derived DV for the current CU as soon as a temporal neighboring block is identified as having a DV, regardless of whether an inter-view reference picture with the view index of the inter-view reference picture pointed to by the DV exists in either reference list of the current CU.
If no DV can be found in the temporal neighboring blocks, the spatial neighboring blocks of the current block (220) are checked in a given order (A1, B1, B0, A0, B2) as shown in FIG. 2B.
If no DCP coded block is found in the above-mentioned spatial and temporal neighboring blocks, the disparity information obtained from spatial neighboring DV-MCP blocks is used. FIG. 3 shows an example of a DV-MCP block, whose motion is predicted from a corresponding block in the inter-view reference picture, where the location of the corresponding block is determined according to a disparity vector. In FIG. 3, a corresponding block 320 in an inter-view reference picture for a current block 310 in a dependent view is determined based on a disparity vector 330. The disparity vector used in the DV-MCP block represents a motion correspondence between the current picture and the inter-view reference picture.
To indicate whether an MCP block is DV-MCP coded or not and to save the disparity vector used for the inter-view motion parameter prediction, two variables are added to store the motion vector information of each block:
    dvMcpFlag, and
    dvMcpDisparity (only the horizontal component is stored).
When dvMcpFlag is equal to 1, the dvMcpDisparity is set to the disparity vector used for the inter-view motion parameter prediction. In the AMVP and Merge candidate list construction process, the dvMcpFlag of the candidate is set to 1 only for the candidate generated by inter-view motion parameter prediction in Merge mode and is set to 0 for others. It is noted that, if neither DCP coded blocks nor DV-MCP coded blocks are found in the above mentioned spatial and temporal neighboring blocks, a zero vector can be used as a default disparity vector.
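The overall NBDV search order described in this section can be sketched as follows. The dictionary-based block records are an assumption made for illustration, and so are two details the text does not specify: the order in which spatial neighbors are scanned for DV-MCP blocks (the same A1, B1, B0, A0, B2 order is reused here) and the zero vertical component attached to dvMcpDisparity.

```python
TEMPORAL_ORDER = ("RB", "BCTR")
SPATIAL_ORDER = ("A1", "B1", "B0", "A0", "B2")


def derive_nbdv(collocated_pictures, spatial_neighbors):
    """Return a DV as (horizontal, vertical) following the NBDV search order.

    collocated_pictures: list of dicts mapping "RB"/"BCTR" to block records.
    spatial_neighbors: dict mapping "A1".."B2" to block records.
    A block record is a dict that may hold "dv" (block is DCP coded) or
    "dvMcpFlag"/"dvMcpDisparity" (block is DV-MCP coded).
    """
    # 1. Temporal neighbors: at most two collocated pictures, RB then BCTR.
    for picture in collocated_pictures[:2]:
        for name in TEMPORAL_ORDER:
            block = picture.get(name)
            if block and block.get("dv") is not None:
                return block["dv"]  # terminate at the first DV found
    # 2. Spatial neighbors that are DCP coded.
    for name in SPATIAL_ORDER:
        block = spatial_neighbors.get(name)
        if block and block.get("dv") is not None:
            return block["dv"]
    # 3. Spatial neighbors that are DV-MCP coded (scan order assumed).
    for name in SPATIAL_ORDER:
        block = spatial_neighbors.get(name)
        if block and block.get("dvMcpFlag") == 1:
            # Only the horizontal component is stored; vertical assumed 0.
            return (block["dvMcpDisparity"], 0)
    # 4. Default: zero disparity vector.
    return (0, 0)
```

Note that the temporal stage returns the first DV it finds without checking whether the current CU has a matching inter-view reference picture, which mirrors the behavior described for the current design.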
Depth Oriented Neighboring Block Disparity Vector (DoNBDV)
In the DoNBDV scheme, NBDV is used to retrieve the virtual depth in the reference view to derive a refined DV. To be specific, the refined DV is converted from the maximum depth in the virtual depth block which is located by the DV derived using NBDV.
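A minimal sketch of this refinement is shown below. The depth picture is modeled as a list of rows, the depth-to-disparity conversion is passed in as a caller-supplied function because the text does not define it, and both the clamping at picture boundaries and the zero vertical component of the refined DV are assumptions of this example.

```python
def donbdv(depth_picture, pos, size, nbdv, to_disparity):
    """Refine an NBDV-derived DV using the virtual depth block it locates.

    depth_picture: list of rows of depth samples of the reference view.
    pos, size: (x, y) origin and (width, height) of the current block.
    nbdv: (horizontal, vertical) DV derived by NBDV.
    to_disparity: caller-supplied depth-to-disparity conversion.
    """
    height, width = len(depth_picture), len(depth_picture[0])
    # The virtual depth block is located by displacing the block by NBDV.
    x0, y0 = pos[0] + nbdv[0], pos[1] + nbdv[1]
    max_depth = 0
    for j in range(size[1]):
        for i in range(size[0]):
            # Clamp coordinates to the picture (assumed boundary handling).
            y = min(max(y0 + j, 0), height - 1)
            x = min(max(x0 + i, 0), width - 1)
            max_depth = max(max_depth, depth_picture[y][x])
    # The refined DV is converted from the maximum depth in the block.
    return (to_disparity(max_depth), 0)
```

A call such as `donbdv(depth, (4, 0), (2, 2), (-2, 0), lambda d: d // 2)` reads the 2×2 depth block starting at column 2 and converts its largest sample with the supplied function.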
In HEVC, two different modes for signaling the motion parameters for a block are specified. In the first mode, which is referred to as advanced motion vector prediction (AMVP) mode, the number of motion hypotheses, the reference indices, the motion vector differences, and indications specifying the used motion vector predictors are coded in the bitstream. The second mode is referred to as Merge mode. For this mode, only an indication is coded, which signals the set of motion parameters that are used for the block. It is noted that, in the current design of 3D-HEVC, when the motion hypotheses for AMVP are collected, if the reference picture type of a spatial neighboring block is the same as the reference picture type of the current PU (inter-view or temporal) and the picture order count (POC) of the reference picture of the spatial neighboring block is equal to the POC of the reference picture of the current PU, the motion information of the spatial neighboring block is directly used as the motion hypothesis of the current PU.
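The spatial-neighbor check described for AMVP can be sketched as follows. The `RefPic` record and the function name are hypothetical; the condition itself (same reference picture type and equal POC) follows the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RefPic:
    """Hypothetical description of a reference picture."""
    poc: int
    is_inter_view: bool
    view_idx: int = 0


def amvp_spatial_hypothesis(neighbor_mv: Tuple[int, int],
                            neighbor_ref: RefPic,
                            current_ref: RefPic) -> Optional[Tuple[int, int]]:
    """Reuse the neighbor's vector if type and POC of the references match."""
    same_type = neighbor_ref.is_inter_view == current_ref.is_inter_view
    same_poc = neighbor_ref.poc == current_ref.poc
    # Note: the view index is not part of the check described in the text.
    if same_type and same_poc:
        return neighbor_mv
    return None
```

Because the view index is not compared, two inter-view reference pictures of the same access unit but of different views would pass this check in the sketch, which appears to be one way a DV can end up paired with a non-corresponding inter-view reference picture.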
Under the existing scheme, when the reference view is selectable, similar to an IBP (i.e., I-Picture, B-Picture and P-Picture) configuration, non-correspondence between the DV and the inter-view reference picture may occur in the following 3D coding tools: the derived DV stored as a candidate DV for DCP, VSP and AMVP. An issue also exists when the DV is derived from a temporal neighboring block in the NBDV process. Therefore, it is desirable to develop a new DV derivation process and a new reference picture selection process to overcome these issues.