Three-dimensional (3D) television has been a technology trend in recent years that aims to bring viewers a sensational viewing experience. Various technologies have been developed to enable 3D viewing. Among them, multi-view video is a key technology for 3DTV applications. Traditional video is a two-dimensional (2D) medium that only provides viewers a single view of a scene from the perspective of the camera. In contrast, multi-view video is capable of offering arbitrary viewpoints of dynamic scenes and provides viewers the sensation of realism.
The multi-view video is typically created by capturing a scene using multiple cameras simultaneously, where the multiple cameras are properly located so that each camera captures the scene from one viewpoint. Accordingly, the multiple cameras will capture multiple video sequences corresponding to multiple views. In order to provide more views, more cameras have been used to generate multi-view video with a large number of video sequences associated with the views. Accordingly, the multi-view video will require a large storage space to store and/or a high bandwidth to transmit. Therefore, multi-view video coding techniques have been developed in the field to reduce the required storage space or the transmission bandwidth.
A straightforward approach may be to simply apply conventional video coding techniques to each single-view video sequence independently and disregard any correlation among different views. Such a coding system would be very inefficient. In order to improve the efficiency of multi-view video coding, typical multi-view video coding exploits inter-view redundancy. Therefore, most 3D Video Coding (3DVC) systems take into account the correlation of video data associated with multiple views and depth maps. Some 3D video coding standards are being developed by extending existing standards intended for 2D video. For example, there are emerging 3D coding standards based on AVC (Advanced Video Coding) and HEVC (High Efficiency Video Coding). In these standards, the disparity vector (DV) has been widely used in various coding applications to locate a corresponding block in a reference view.
In the AVC-based 3D coding (3D-AVC), the disparity vector for a current texture block is derived differently for different coding tools. For example, the maximum of the depth block associated with a current texture block is used as the disparity vector for the current texture block when VSP (View Synthesis Prediction) or DMVP (Depth-based Motion Vector Prediction) coding mode is selected. On the other hand, the disparity vector derived based on neighboring blocks is used for disparity-based Skip and Direct modes. In this disclosure, both motion vector (MV) and disparity vector (DV) are considered part of motion information associated with a block. Motion vector prediction may refer to prediction for MV as well as for DV.
In the DV derivation approach based on the maximum of the depth block, the disparity vector for the currently coded texture block Cb is derived from the depth block d(Cb) associated with the current texture block Cb. Depth map samples located at the four corners (i.e., top-left, top-right, bottom-left, bottom-right) of the depth block d(Cb) are compared. The maximal depth map value among the four depth values is converted to a disparity value according to a camera model. In the case of a depth map with reduced resolution, the spatial coordinates of the four corners of the current texture block are downscaled to match the depth map resolution.
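The "maximal out of four corners" derivation described above can be sketched as follows. This is only an illustrative sketch: the function names, the integer `scale` factor used for a reduced-resolution depth map, and the particular camera model (a commonly used inverse-depth quantization with near and far clipping planes) are assumptions, since the text only states that the conversion follows "a camera model".

```python
from typing import Sequence


def max_of_four_corners(depth: Sequence[Sequence[int]],
                        x: int, y: int, width: int, height: int,
                        scale: int = 1) -> int:
    """Return the largest depth sample at the four corners of d(Cb).

    (x, y, width, height) describe the texture block Cb in texture
    coordinates. `scale` is a hypothetical integer ratio between the
    texture and depth resolutions (1 means equal resolution).
    """
    # Downscale the corner coordinates to match the depth map resolution.
    left = x // scale
    top = y // scale
    right = (x + width - 1) // scale
    bottom = (y + height - 1) // scale
    return max(depth[top][left], depth[top][right],
               depth[bottom][left], depth[bottom][right])


def depth_to_disparity(depth_value: int, focal_length: float,
                       baseline: float, z_near: float, z_far: float,
                       bit_depth: int = 8) -> float:
    """Convert a depth sample to a horizontal disparity.

    Assumed camera model (not taken from the text): the depth sample
    linearly quantizes 1/Z between 1/z_far and 1/z_near, and the
    disparity equals focal_length * baseline / Z.
    """
    max_level = (1 << bit_depth) - 1
    inv_z = depth_value / max_level * (1.0 / z_near - 1.0 / z_far) + 1.0 / z_far
    return focal_length * baseline * inv_z
```

Under this assumed model a larger depth sample corresponds to a closer object and therefore to a larger disparity, which is consistent with taking the maximum of the four corner samples.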
In the DV derivation approach based on neighboring blocks, the disparity vector is derived from motion information of neighboring blocks of the current block Cb. If the motion information is not available from neighboring blocks, the disparity vector is derived from the associated depth block d(Cb) according to the DV derivation approach based on the maximum of the depth block. In a system where the depth map in a dependent view is coded before the corresponding texture picture, the associated depth block may correspond to a depth block in the current dependent view. Otherwise, the associated depth block may correspond to a depth block in a reference view that has been coded before the corresponding texture picture.
FIG. 1 illustrates the neighboring blocks used to derive the disparity vector and the usage of the derived disparity vector. The disparity vector is derived from the motion vectors of neighboring blocks A, B, and C (D) of a current block (110), where block D is used when block C is not available. If only one of the neighboring blocks was coded with inter-view prediction (i.e., having a DV), that DV is selected as the derived DV for the current block Cb. If multiple disparity vectors are available from blocks A, B, C (D), the derived DV is determined from the median of the available DVs. If none of neighboring blocks A, B and C (D) has a valid DV, the derived DV is then determined from a converted DV, where the converted DV is obtained by converting a depth value of the depth block associated with the current texture block according to a camera model. The derived DV is then used to locate a corresponding block in the base view. The corresponding block (120) in the base view is determined by offsetting the center point (112) of the current block by the derived DV. The operation is similar to locating a reference block using motion compensation. Accordingly, the operation can be implemented using the existing motion compensation module by treating the derived DV as a motion vector (MV).
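As a minimal sketch of the last step above, the position of the corresponding block can be computed by offsetting the center point of the current block by the derived DV. The function names and the use of integer-sample vectors are assumptions made for illustration; an actual codec would typically carry vectors in sub-sample units.

```python
from typing import Optional, Tuple

Vector = Tuple[int, int]  # (horizontal, vertical), assumed integer-sample units


def pick_c_or_d(block_c: Optional[Vector],
                block_d: Optional[Vector]) -> Optional[Vector]:
    """Use the vector of block C, falling back to block D when C is unavailable."""
    return block_c if block_c is not None else block_d


def locate_corresponding_block(x: int, y: int, width: int, height: int,
                               derived_dv: Vector) -> Tuple[int, int]:
    """Return the base-view position addressed by the derived DV.

    The center point of the current block (x, y, width, height) is
    offset by the derived DV, in the same way a motion vector offsets
    a position during motion compensation.
    """
    center_x = x + width // 2
    center_y = y + height // 2
    return (center_x + derived_dv[0], center_y + derived_dv[1])
```

For example, a 16x16 block at (16, 32) with a derived DV of (-5, 0) addresses position (19, 40) in the base view.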
FIG. 2 illustrates an exemplary flowchart of the neighboring block-based DV derivation for disparity-based Skip and Direct modes. The motion data associated with neighboring texture blocks A, B and C (D) are received as shown in step 210. When the motion information associated with block C is not available, motion information associated with block D is used. The motion data may correspond to inter-view motion data (i.e., DV) or temporal motion data (i.e., MV). The availability of inter-view motion information (i.e., DV) of neighboring blocks A, B and C (D) is checked in step 220. If only one neighboring block is coded with inter-view prediction, its motion information (i.e., DV) is used as the derived DV. If more than one neighboring block is coded with inter-view prediction, the unavailable DV of any neighboring block is replaced by the maximum disparity as shown in step 230. The derived DV is determined from the median of the three candidate DVs as shown in step 240. After the derived DV is determined, a corresponding block in the reference view is located using the derived DV as shown in step 250. The motion vector of the corresponding block is then used as the motion vector predictor for the disparity-based Skip or Direct mode.
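Steps 210 through 240 can be sketched as below. The sketch rests on several assumptions that are not stated in the text: the replacement "max disparity" is taken to be the DV converted from the maximum depth value of d(Cb), the median is applied separately to each vector component, and the case in which no neighboring block has a DV falls back to the depth-derived DV. The function and parameter names are hypothetical.

```python
from statistics import median
from typing import Optional, Tuple

Vector = Tuple[int, int]


def derive_dv(dv_a: Optional[Vector], dv_b: Optional[Vector],
              dv_c: Optional[Vector], dv_d: Optional[Vector],
              depth_dv: Vector) -> Vector:
    """Derive a DV from neighboring blocks A, B and C (D).

    A value of None means the block has no inter-view motion (no DV).
    `depth_dv` is the DV converted from the depth block d(Cb).
    """
    # Step 210: block D stands in for block C when C is not available.
    dv_cd = dv_c if dv_c is not None else dv_d
    candidates = [dv_a, dv_b, dv_cd]

    # Step 220: check which neighbors carry a DV.
    available = [dv for dv in candidates if dv is not None]
    if len(available) == 1:
        return available[0]

    # Step 230: replace each unavailable DV by the depth-derived DV.
    filled = [dv if dv is not None else depth_dv for dv in candidates]

    # Step 240: component-wise median of the three candidates (assumed).
    return (median(v[0] for v in filled), median(v[1] for v in filled))
```

When none of the neighbors has a DV, all three candidates equal `depth_dv`, so the median reduces to the depth-derived DV.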
The disparity vector is also used in other coding tools in 3D coding systems. For example, 3D-AVC also includes a Direction-Separated MVP (DS-MVP) coding tool. In 3D-AVC, the median-based MVP is restricted to identical prediction directions of motion vector candidates. In DS-MVP, all available neighboring blocks are classified according to the direction of their prediction (i.e., temporal or inter-view).
For the inter-view prediction, if the current block Cb uses an inter-view reference picture, all neighboring blocks that do not utilize inter-view prediction are marked as unavailable for MVP. Motion vectors of the neighboring blocks marked as unavailable are replaced with a disparity vector derived from depth data associated with Cb instead of a zero motion vector. The DV derived from the depth data is then included as an MV candidate for the median operation to determine the derived DV. The DV derived from the depth data associated with the current block Cb is derived according to the “maximal out of four corners” approach. The flowchart of the inter-view prediction process is shown in FIG. 3. The steps involved are substantially the same as the steps in FIG. 2 except that step 250 is not used. As shown in FIG. 3, after the derived DV is obtained, the derived DV is used as the MVP (i.e., inter-view prediction) for the current block.
For the Inter prediction, if the current block Cb uses temporal prediction, neighboring blocks that use inter-view reference frames are marked as unavailable for MVP. Motion vectors of the neighboring blocks marked as unavailable are replaced with a motion vector of a corresponding block in a reference view. The corresponding block is derived by applying a derived disparity vector to the coordinates of the current texture block. The derived disparity vector can be determined according to the “maximal out of four corners” approach. If the corresponding block is not coded with Inter prediction (i.e., no MV available), a zero vector is considered. The flowchart of the Inter prediction process is shown in FIG. 4. The motion data associated with neighboring blocks A, B and C (D) is received in step 410. The motion data in this case corresponds to temporal motion (i.e., MV). The availability of the MV of each neighboring block is checked in step 420. If any neighboring block is not temporally predicted (i.e., no MV), the MV is replaced by a derived MV as shown in step 430. The derived MV corresponds to the MV of a corresponding block in a reference view as shown in step 450, and the corresponding block is located by the derived DV determined according to the “maximal out of four corners” approach. The derived MV is determined based on the median of candidate MVs as shown in step 440.
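The two direction-separated cases described above (inter-view and temporal) can be sketched in one function. The data layout, the direction labels and the function name are hypothetical; a component-wise median is assumed, and a neighbor with no motion data at all is treated the same way as a neighbor predicted in the other direction.

```python
from statistics import median
from typing import Optional, Sequence, Tuple

Vector = Tuple[int, int]
# A neighbor is (direction, vector) with direction "inter_view" or
# "temporal", or None when it carries no motion data (assumption).
Neighbor = Optional[Tuple[str, Vector]]


def ds_mvp(current_direction: str, neighbors: Sequence[Neighbor],
           depth_dv: Vector, corresponding_mv: Optional[Vector]) -> Vector:
    """Direction-separated median predictor for the current block Cb.

    `neighbors` holds blocks A, B and C (D). `depth_dv` is the DV
    derived from d(Cb). `corresponding_mv` is the MV of the block that
    the derived DV addresses in the reference view, or None when that
    block is not Inter coded.
    """
    if current_direction == "inter_view":
        # Unavailable candidates are replaced by the depth-derived DV.
        replacement = depth_dv
    else:
        # Unavailable candidates are replaced by the MV of the
        # corresponding block, or by a zero vector when it has no MV.
        replacement = corresponding_mv if corresponding_mv is not None else (0, 0)

    candidates = []
    for neighbor in neighbors:
        if neighbor is not None and neighbor[0] == current_direction:
            candidates.append(neighbor[1])
        else:
            # Marked as unavailable for MVP in this direction.
            candidates.append(replacement)

    return (median(c[0] for c in candidates), median(c[1] for c in candidates))
```

Keeping both directions in one routine makes the only difference between them explicit: the choice of the replacement vector for candidates marked as unavailable.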
Disparity-based Skip/Direct mode is another coding tool in 3D-AVC. In the Skip/Direct modes, motion information is not coded. Instead the motion information is derived at both encoder and decoder sides through an identical process. Motion information for coding of the current block Cb in Skip/Direct modes is derived from motion information of the corresponding block in the base view. The correspondence between the current block Cb and the corresponding block in the base view is established through a disparity vector by applying the DV to the central point of the current block Cb as shown in FIG. 1. A motion partition referenced by this vector in the base view provides motion information (i.e., reference index and motion vectors) for coding of the current block Cb.
The disparity derivation procedure for this mode is referred to as “Neighboring blocks based derivation”, and the process can be illustrated using the flowchart in FIG. 2. If the corresponding block in the base view is not available, the direction-separated MVP (DS-MVP) mentioned earlier is used to derive the DV by setting the reference index to zero.
As described above, DV derivation is critical in 3D and multi-view video coding. The DV derivation process used in the current 3D-AVC is quite complicated. It is desirable to simplify the DV derivation process.