Three-dimensional (3D) television has been a technology trend in recent years that is targeted to bring viewers a sensational viewing experience. Multi-view video is a technique to capture and render 3D video. The multi-view video is typically created by capturing a scene using multiple cameras simultaneously, where the multiple cameras are properly located so that each camera captures the scene from one viewpoint. The multi-view video, with a large number of video sequences associated with the views, represents a massive amount of data. Accordingly, the multi-view video will require a large storage space to store and/or a high bandwidth to transmit. Therefore, multi-view video coding techniques have been developed in the field to reduce the required storage space and the transmission bandwidth. In three-dimensional and multi-view coding systems, the texture data as well as depth data are coded.
Simplified depth coding (SDC) and a depth lookup table (DLT) are adopted into the HEVC (High Efficiency Video Coding) based Test Model (HTM). For each depth coding unit (CU), if SDC is selected, one of three different prediction modes, i.e., DC, Planar and DMM-1, can be selected. After the prediction, instead of being coded as quantized transform coefficients, the SDC-coded residuals are represented by one or two constant residual values depending on whether the depth block is divided into one or two segments. Moreover, the DLT is used to map coded depth values in SDC to valid depth values of the original depth map.
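The role of the DLT can be illustrated with a minimal sketch. It assumes that the table is simply the sorted list of depth values that actually occur in a depth map, so that a compact index can stand in for each valid depth value. The function name `build_dlt` and the sample data are hypothetical, and the sketch does not describe how HTM constructs or signals the table.

```python
def build_dlt(depth_map):
    """Build a depth lookup table from the values present in a depth map.

    Assumption for illustration only: the table is the sorted set of
    occurring depth values. Returns (index -> depth, depth -> index).
    """
    values = sorted({v for row in depth_map for v in row})
    index_of = {v: i for i, v in enumerate(values)}
    return values, index_of


# Hypothetical depth map that uses only three distinct depth values.
depth_map = [
    [10, 10, 50],
    [50, 120, 120],
]
index_to_depth, depth_to_index = build_dlt(depth_map)

# A coded index is mapped back to a valid depth value of the original map.
coded_index = depth_to_index[50]
restored_depth = index_to_depth[coded_index]
```

Because only occurring values are listed, every index maps back to a depth value that is valid for the original depth map.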
A two-step approach is applied to obtain the prediction values in the SDC prediction stage. First, the normal Intra-prediction procedure using neighboring reconstructed samples is invoked to get all the prediction samples in the coded block. DC, Planar and DMM-1 are the three possible prediction modes in this step. Second, the average value of the prediction samples in a segment is calculated as the prediction value for this segment. For the DC and Planar modes, there is only one segment in the coded block. For the DMM-1 (Depth Modelling Mode 1) mode, there are two segments in the coded block, as defined by the DMM-1 mode. As a simplification, a sub-sampling method has been disclosed by Zheng et al. (“CE6.H related: Reference samples sub-sampling for SDC and DMM,” Document of Joint Collaborative Team on 3D Video Coding Extension Development, JCT3V-C0154, January 2013), which uses only one out of every four prediction samples to compute the average value. This method can significantly reduce the summation operations in the averaging process.
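The second step of the approach described above, with and without sub-sampling, can be sketched as follows. The sketch assumes that the prediction samples of the first step are already available as a two-dimensional list, that "one out of every four" means keeping the top-left sample of each 2×2 group, and that the average is a rounded integer mean. The function name `segment_average` and the sample block are hypothetical and are not taken from HTM.

```python
def segment_average(pred, subsample=False):
    """Average the prediction samples of a single-segment block.

    pred      : 2D list of prediction samples from normal Intra prediction
    subsample : if True, use only every second row and column, i.e. one
                sample out of each 2x2 group (assumed sub-sampling pattern)
    Returns (rounded integer mean, number of samples summed).
    """
    step = 2 if subsample else 1
    total = 0
    count = 0
    for y in range(0, len(pred), step):
        for x in range(0, len(pred[y]), step):
            total += pred[y][x]
            count += 1
    return (total + count // 2) // count, count


# Hypothetical 4x4 prediction block with a smooth gradient.
pred = [[100 + x + y for x in range(4)] for y in range(4)]

full_avg, full_count = segment_average(pred)                # 16 samples summed
sub_avg, sub_count = segment_average(pred, subsample=True)  # 4 samples summed
```

For this block the full average is 103 and the sub-sampled average is 102, which shows that sub-sampling cuts the number of summed samples to one quarter at the cost of a small deviation in the prediction value.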
FIG. 1 illustrates an example of the two-step SDC prediction approach for the DC mode. The neighboring reconstructed depth values (112) of the current depth block (110) are used as reference samples to form the prediction samples for the current block. The average is derived from the prediction samples. The two-step approach introduces a high computational overhead for generating the prediction samples and calculating the average value over these samples. Furthermore, a high bit-width is required by the averaging process. In order to reduce the number of prediction samples involved in averaging, the sub-sampling method by Zheng et al. retains one out of four adjacent samples (120). A prediction value P is then derived for the block (130) to be coded or decoded. For decoding, the derived prediction value is added to the received residue to form the reconstructed block (140). In SDC, as many as 64×64/4 (i.e., 1024) prediction samples may be summed together; thus 18 bits are required by the accumulator for 8-bit samples, which is larger than that required by normal Intra prediction.
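The 18-bit accumulator requirement stated above follows from the worst case in which every summed sample takes its maximum value. A short check is given below. The function name `accumulator_bits` is hypothetical, and the calculation assumes unsigned samples.

```python
def accumulator_bits(num_samples, sample_bits):
    """Bits needed to hold the worst-case sum of unsigned samples."""
    max_sample = (1 << sample_bits) - 1   # e.g. 255 for 8-bit samples
    max_sum = num_samples * max_sample    # every sample at its maximum
    return max_sum.bit_length()


# 64x64 block, one out of four samples retained, 8-bit samples.
bits_subsampled = accumulator_bits(64 * 64 // 4, 8)   # 1024 * 255 = 261120

# The same block without sub-sampling, for comparison.
bits_full = accumulator_bits(64 * 64, 8)              # 4096 * 255 = 1044480
```

The sub-sampled case yields 18 bits, consistent with the figure above. Without sub-sampling, the same worst-case analysis yields 20 bits.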
FIG. 2 illustrates an example of the two-step SDC prediction approach for the DMM-1 mode. The neighboring reconstructed samples (212) of the current depth block (210) are used to form the prediction. A sub-sampled prediction block (220) is used for prediction. The average values (P0 and P1) for the two segments are derived to form the prediction block (230). For decoding, the respective residues (R0 and R1) are received and added to the corresponding prediction values to form the reconstructed block (240), as shown in FIG. 2.
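The two-segment case described above can be sketched as follows. The sketch assumes that the DMM-1 partition is available as a binary mask (0 or 1 per sample), that the sub-sampling keeps the top-left sample of each 2×2 group, and that reconstructed values are clipped to the 8-bit range. The function name `reconstruct_two_segments`, the mask, and the clipping are illustrative assumptions; the derivation of the DMM-1 partition pattern itself is not shown.

```python
def reconstruct_two_segments(pred, mask, residuals, step=2, max_value=255):
    """Derive P0/P1 from sub-sampled prediction samples and reconstruct.

    pred      : 2D list of prediction samples
    mask      : 2D list of segment labels (0 or 1), same shape as pred
    residuals : (R0, R1), one constant residue per segment
    step      : sub-sampling step in both directions (assumed pattern)
    Returns ([P0, P1], reconstructed block).
    """
    sums = [0, 0]
    counts = [0, 0]
    for y in range(0, len(pred), step):
        for x in range(0, len(pred[y]), step):
            seg = mask[y][x]
            sums[seg] += pred[y][x]
            counts[seg] += 1
    if 0 in counts:
        # Not handled in this sketch: a segment missed by the sub-sampling.
        raise ValueError("a segment received no sub-sampled prediction sample")
    means = [(sums[s] + counts[s] // 2) // counts[s] for s in (0, 1)]
    # Add the constant residue of each segment and clip (clipping is assumed).
    values = [min(max(means[s] + residuals[s], 0), max_value) for s in (0, 1)]
    recon = [[values[seg] for seg in row] for row in mask]
    return means, recon


# Hypothetical 4x4 block split vertically into two segments.
pred = [[50, 50, 200, 200] for _ in range(4)]
mask = [[0, 0, 1, 1] for _ in range(4)]
means, recon = reconstruct_two_segments(pred, mask, residuals=(3, -5))
```

Here P0 = 50 and P1 = 200, and each row of the reconstructed block is [53, 53, 195, 195], i.e., each segment is filled with its prediction value plus its residue.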
It is desirable to develop a process for deriving the prediction value for each segment that can reduce the required operations or ease the bit-depth requirement for summing a large number of samples.