Three-dimensional (3D) television has been a technology trend in recent years that intends to bring viewers a sensational viewing experience. Various technologies have been developed to enable 3D viewing. Among them, multi-view video is a key technology for 3DTV applications. Traditional video is a two-dimensional (2D) medium that only provides viewers a single view of a scene from the perspective of the camera. In contrast, 3D video is capable of offering arbitrary viewpoints of dynamic scenes and provides viewers the sensation of realism.
3D video is typically created by capturing a scene using a video camera with an associated device to capture depth information, or by using multiple cameras simultaneously, where the multiple cameras are properly located so that each camera captures the scene from one viewpoint. The texture data and the depth data corresponding to a scene usually exhibit substantial correlation. Therefore, the depth information can be used to improve coding efficiency or reduce processing complexity for texture data, and vice versa. For example, the corresponding depth block of a texture block reveals similar information corresponding to the pixel-level object segmentation. Therefore, the depth information can help to realize pixel-level segment-based motion compensation. Accordingly, depth-based block partitioning (DBBP) has been adopted for texture video coding in the current 3D-HEVC (3D video coding based on the High Efficiency Video Coding (HEVC) standard).
The current depth-based block partitioning (DBBP) comprises the steps of virtual depth derivation, block segmentation, block partition, and bi-segment compensation. First, virtual depth is derived for the current texture block using a disparity vector from neighboring blocks (NBDV). The derived disparity vector (DV) is used to locate a depth block in a reference view from the location of the current texture block. The reference view may be a base view. The located depth block in the reference view is then used as a virtual depth block for coding the current texture block. The virtual depth block is used to derive block segmentation for the collocated texture block, where the block segmentation can be non-rectangular. A mean value, d, of the virtual depth block is determined. A binary segmentation mask is generated for each pixel of the block by comparing the corresponding virtual depth value with the mean value d. If the upper-left corner virtual depth value is larger than the mean value, all segmentation mask values corresponding to the depth values larger than d are 0, and all the segmentation mask values corresponding to the depth values less than d are 1. FIGS. 1A-B illustrate an example of block segmentation based on the virtual depth block. In FIG. 1A, corresponding depth block 120 in a reference view for current texture block 110 in a dependent view is located based on the location of the current texture block and derived DV 112, which is derived using NBDV according to 3D-HEVC. The mean value of the virtual depth block is determined in step 140. The values of virtual depth samples are compared to the mean depth value in step 150 to generate segmentation mask 160. The segmentation mask is represented in binary data to indicate whether an underlying pixel belongs to segment 1 or segment 2, as indicated by two different line patterns in FIG. 1B.
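The mask generation described above can be illustrated with a minimal Python sketch. It is not the 3D-HEVC reference implementation. The function name segmentation_mask is hypothetical, the depth block is assumed to be a list of rows of integer samples, and two details the text does not specify are assumptions: the mask polarity is reversed when the upper-left sample does not exceed d, and a sample exactly equal to d is grouped with the samples below d.

```python
def segmentation_mask(depth):
    """Return a binary mask for a virtual depth block (list of rows)."""
    count = sum(len(row) for row in depth)
    d = sum(sum(row) for row in depth) / count  # mean value d of the block

    # Rule from the text: if the upper-left sample is larger than d,
    # samples larger than d map to 0 and samples smaller than d map to 1.
    # Assumption: otherwise the polarity is reversed, so the upper-left
    # sample always lands in segment 0.
    invert = depth[0][0] > d
    return [[int((v > d) != invert) for v in row] for row in depth]


if __name__ == "__main__":
    block = [[50, 10],
             [50, 10]]
    print(segmentation_mask(block))
```

With the 2×2 block in the usage example, d is 30 and the upper-left sample (50) exceeds it, so the samples above d receive mask value 0 and the samples below d receive mask value 1.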
In order to avoid the high computational complexity associated with pixel-based motion compensation, DBBP uses block-based motion compensation. Each texture block may use one of 6 non-square partitions consisting of 2N×N, N×2N, 2N×nU, 2N×nD, nL×2N and nR×2N, where the latter four block partitions correspond to AMP (asymmetric motion partition). After a block partition is selected from these block-partition candidates by a block partition selection process, two predictive motion vectors (PMVs) are derived for the partitioned blocks respectively. The PMVs are then utilized for compensating the two to-be-divided segments. According to the current 3D-HEVC, the best block partition is selected by comparing the segmentation mask and the negation of the segmentation mask (i.e., the inverted segmentation mask) with the 6 non-square partition candidates (i.e., 2N×N, N×2N, 2N×nU, 2N×nD, nL×2N and nR×2N). The pixel-by-pixel comparison counts the number of so-called matched pixels between the segmentation masks and the block partition patterns. There are 12 sets of matched pixels that need to be counted, which correspond to the combinations of 2 complementary segmentation masks and 6 block partition types. The block partition process selects the candidate having the largest number of matched pixels. FIG. 2 illustrates an example of the block partition selection process. In FIG. 2, the 6 non-square block partition types are superposed on top of the segmentation mask and the corresponding inverted segmentation mask. The best match between a block partition type and a segmentation mask is selected as the block partition for the DBBP process.
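The matched-pixel counting described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the 3D-HEVC reference code: the names partition_pattern and select_partition are hypothetical, the block is assumed to be square with a side divisible by 4, each partition pattern labels its first part 0 and its second part 1, and ties are resolved in favor of the candidate examined first. Partition names use an ASCII "x" in the code.

```python
PARTITIONS = ["2NxN", "Nx2N", "2NxnU", "2NxnD", "nLx2N", "nRx2N"]


def partition_pattern(name, size):
    """Binary pattern of a partition type: 0 for part one, 1 for part two."""
    quarter, half, three_quarters = size // 4, size // 2, 3 * size // 4
    horizontal = {"2NxN": half, "2NxnU": quarter, "2NxnD": three_quarters}
    vertical = {"Nx2N": half, "nLx2N": quarter, "nRx2N": three_quarters}
    if name in horizontal:
        cut = horizontal[name]  # boundary row between the two parts
        return [[int(y >= cut)] * size for y in range(size)]
    cut = vertical[name]        # boundary column between the two parts
    return [[int(x >= cut) for x in range(size)] for _ in range(size)]


def select_partition(mask):
    """Pick the partition with the most matched pixels over 12 candidates."""
    size = len(mask)
    best_name, best_count = None, -1
    for name in PARTITIONS:
        pattern = partition_pattern(name, size)
        matched = sum(m == p
                      for mask_row, pat_row in zip(mask, pattern)
                      for m, p in zip(mask_row, pat_row))
        # The inverted mask matches exactly where the mask does not,
        # so its count is the complement of the first count.
        for count in (matched, size * size - matched):
            if count > best_count:
                best_name, best_count = name, count
    return best_name, best_count


if __name__ == "__main__":
    mask = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]
    print(select_partition(mask))
```

Because the two masks are complementary, the sketch derives the inverted-mask count from the first count instead of scanning the block a second time; all 12 counts are still considered.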
After a block partition type is selected, two predictive motion vectors can be determined. Each of the two predictive motion vectors is applied to the whole block to form a corresponding prediction block. The two prediction blocks are then merged into one on a pixel-by-pixel basis according to the segmentation mask, and this process is referred to as bi-segment compensation. FIG. 3 illustrates an example of the DBBP process. In this example, the N×2N block partition type is selected and two corresponding motion vectors (MV1 and MV2) are derived for the two partitioned blocks respectively. Each of the motion vectors is used to compensate a whole texture block (310). Accordingly, motion vector MV1 is applied to texture block 320 to generate prediction block 330, and motion vector MV2 is also applied to texture block 320 to generate prediction block 332. The two prediction blocks are merged by applying respective segmentation masks (340 and 342) to generate the final prediction block (350).
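The merging step can be illustrated with a short Python sketch. The function name bi_segment_compensation is hypothetical, the prediction blocks and the mask are assumed to be equally sized lists of rows, and the convention that mask value 0 selects the first prediction block and mask value 1 selects the second is an assumption made for illustration.

```python
def bi_segment_compensation(pred1, pred2, mask):
    """Merge two whole-block predictions pixel by pixel using the mask."""
    # Assumption: mask value 0 takes the sample from pred1 (MV1),
    # mask value 1 takes the sample from pred2 (MV2).
    return [[p2 if m else p1
             for p1, p2, m in zip(row1, row2, mask_row)]
            for row1, row2, mask_row in zip(pred1, pred2, mask)]


if __name__ == "__main__":
    pred_mv1 = [[1, 1], [1, 1]]
    pred_mv2 = [[9, 9], [9, 9]]
    mask = [[0, 1], [1, 1]]
    print(bi_segment_compensation(pred_mv1, pred_mv2, mask))
```

Only the segmentation mask enters the merge; the selected block partition type does not appear in this step.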
While the DBBP process reduces computational complexity by avoiding pixel-by-pixel motion compensation, problems still exist in the steps of block partition and block segmentation. One issue is associated with the mean value calculation: the two steps utilize different mean value calculations. For block partition, the mean value is determined based on the average of all the upper-left corner pixels of the 4×4 sub-blocks in the corresponding depth block. For block segmentation, on the other hand, the mean value is determined according to the average of all pixels of the corresponding depth block. The two different mean value calculations in DBBP inevitably increase the encoding and decoding complexity. Another issue is the high computational complexity involved in the block partition processing, even though this step is only utilized to derive suitable motion vectors from more reliable block partitioning. The block partition type does not play any role in generating the final prediction block after the motion vectors are derived, as evidenced in FIG. 3. A further issue associated with the block partitioning is the large number of partition types due to the use of AMP. The current practice determines whether to utilize AMP partitions directly based on the CU size. The use of AMP may not necessarily provide noticeable improvement in system performance. Therefore, it is desirable to develop means to overcome the issues mentioned here.
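To make the difference between the two mean value calculations concrete, the following Python sketch computes both for the same depth block. The function names mean_all_pixels and mean_subsampled are hypothetical, and the depth block is assumed to be a list of rows whose dimensions are multiples of 4.

```python
def mean_all_pixels(depth):
    """Mean over every sample, as described for block segmentation."""
    count = sum(len(row) for row in depth)
    return sum(sum(row) for row in depth) / count


def mean_subsampled(depth, step=4):
    """Mean over the upper-left sample of each 4x4 sub-block,
    as described for block partition."""
    samples = [depth[y][x]
               for y in range(0, len(depth), step)
               for x in range(0, len(depth[0]), step)]
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Illustrative 8x8 block whose depth grows from left to right.
    block = [[x for x in range(8)] for _ in range(8)]
    print(mean_all_pixels(block), mean_subsampled(block))
```

For the illustrative block, the two functions return different values, so the two steps would compare depth samples against different thresholds and would require two separate mean computations per block.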