In 3D video, depth data is usually represented as a set of depth maps, one for each frame of the texture video. The intensity of each point of a depth map describes the distance from the camera of the part of the visual scene represented by that point. Alternatively, a disparity map may be used, whose values are inversely proportional to those of the depth map and can be used to derive the depth maps.
In 3D video coding, a depth map for each view needs to be encoded besides the conventional video data. These depth maps show different signal characteristics compared to video data as they contain piecewise smooth regions bounded by strong edges. As depth maps are often estimated from texture data or are pre-processed, their histogram might be relatively sparse. As a result, a Depth Lookup Table (DLT) was proposed [F. Jäger, “3D-CE6.h Results on Simplified Depth Coding with an optional Depth Lookup Table,” Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V) of ITU-T VCEG and ISO/IEC MPEG, Shanghai, China, JCT3V-B0036, 2012] to exploit the histogram characteristics by only signaling difference indexes of the DLT instead of signaling the residual depth values themselves. By this approach the bit depth of these residual values can be reduced, which consequently results in higher coding efficiency.
The DLT is constructed at the encoder by analyzing the histogram of the original, uncompressed depth map. This DLT is afterwards transmitted to the decoder to allow for the mapping of indexes to actual depth values. Histogram values of the depth maps may vary over time and therefore there is a requirement for an update mechanism. Moreover, in a multi-view coding scenario, multiple depth maps may have different depth map histograms and in these cases such an update mechanism is also beneficial to the overall coding performance.
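The index mapping described above can be illustrated with a minimal Python sketch. It is not taken from the cited proposal or from any reference software: the function names (build_dlt, nearest_index, encode_index_residual, decode_index_residual) are hypothetical, and mapping a predicted value to the nearest DLT entry is an assumption made only for this illustration.

```python
import bisect

def build_dlt(depth_map):
    """Collect the depth values that occur in the map, in ascending order."""
    return sorted({value for row in depth_map for value in row})

def nearest_index(dlt, value):
    """Index of the DLT entry closest to value (assumption: ties go to the lower entry)."""
    pos = bisect.bisect_left(dlt, value)
    if pos == 0:
        return 0
    if pos == len(dlt):
        return len(dlt) - 1
    before, after = dlt[pos - 1], dlt[pos]
    return pos if (after - value) < (value - before) else pos - 1

def encode_index_residual(dlt, original, predicted):
    """Signal the difference of DLT indexes instead of the difference of depth values."""
    return dlt.index(original) - nearest_index(dlt, predicted)

def decode_index_residual(dlt, predicted, index_residual):
    """Map the predicted value to an index, add the residual, look up the depth value."""
    return dlt[nearest_index(dlt, predicted) + index_residual]

# Toy 8-bit depth map with a sparse histogram (only three distinct values).
depth_map = [[50, 50, 120],
             [50, 120, 200]]
dlt = build_dlt(depth_map)

value_residual = 200 - 50                           # residual in the depth domain
index_residual = encode_index_residual(dlt, 200, 50)  # residual in the index domain
reconstructed = decode_index_residual(dlt, 50, index_residual)
bits_per_index = (len(dlt) - 1).bit_length()        # bits needed to address any DLT entry
```

In this toy case the depth-domain residual is 150, while the index-domain residual is 2, and any index fits into 2 bits rather than the 8 bits of the depth values.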
In the latest specification of the 3D extension for High Efficiency Video Coding [G. Tech, K. Wegner, Y. Chen, S. Yea, “3D-HEVC test model 2,” Document of Joint Collaborative Team on 3D Video Coding Extension Development, JCT3V-B1005, October, 2012], the DLT is sent only once per sequence in the Sequence Parameter Set (SPS), separately for each view. This approach keeps the overhead for the DLT signaling relatively low.
It was also proposed to signal the DLT in the slice header of each I-Slice of the base view [I. Lim, H. C. Wey, and D. S. Park, “3D-CE6.h Related: Improved depth lookup table (DLT),” Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V) of ITU-T VCEG and ISO/IEC MPEG, Geneva, Switzerland, JCT3V-00093, 2013]. In this approach, the DLT values are updated more regularly in the temporal direction to allow for histogram changes over time. In this case, all the dependent views inherit the base view's DLT as it is assumed that the depth map histogram over all views is the same.
Another method to signal DLT values, called range constrained bit map (RCBM) coding 800 as depicted in FIG. 8, was proposed in [Kai Zhang, Jicheng An, Shawmin Lei, “3D-CE6.h related: An efficient coding method for DLT in 3DVC”, Document of Joint Collaborative Team on 3D Video Coding Extension Development, JCT3V-00142, January, 2013]. The method 800 signals the range of depth values that are present in a DLT (see FIG. 8): min_dlt_value and diff_max_dlt_value are coded as unsigned integers to constrain the value range of the DLT. The smallest value in the DLT is min_dlt_value, and the largest is MaxDltValue, which is equal to min_dlt_value + diff_max_dlt_value. The binary string bit_map_flag then signals, for each depth value within the range, whether that value is present in the DLT. If a bit in bit_map_flag is equal to 1, the depth value corresponding to that position in the binary string occurs in the DLT; otherwise, the depth value does not occur in the DLT.
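The RCBM signaling described above can be sketched in Python as follows. This is an illustration only: the function names rcbm_encode and rcbm_decode are hypothetical, and the sketch assumes one flag per depth value in the inclusive range from min_dlt_value to MaxDltValue, including both end points. The cited proposal may treat the end points differently, and entropy coding of the syntax elements is not modeled.

```python
def rcbm_encode(dlt):
    """Return (min_dlt_value, diff_max_dlt_value, bit_map_flag) for a sorted, non-empty DLT."""
    min_dlt_value = dlt[0]
    diff_max_dlt_value = dlt[-1] - dlt[0]
    present = set(dlt)
    # Assumption: one flag per depth value in [min_dlt_value, MaxDltValue], end points included.
    bit_map_flag = [1 if value in present else 0
                    for value in range(min_dlt_value, min_dlt_value + diff_max_dlt_value + 1)]
    return min_dlt_value, diff_max_dlt_value, bit_map_flag

def rcbm_decode(min_dlt_value, diff_max_dlt_value, bit_map_flag):
    """Rebuild the DLT from the signaled range and the bit map."""
    assert len(bit_map_flag) == diff_max_dlt_value + 1
    return [min_dlt_value + offset
            for offset, flag in enumerate(bit_map_flag) if flag == 1]

# Toy DLT whose values all lie in the narrow range 50..56.
example_dlt = [50, 52, 53, 56]
min_value, diff_max, flags = rcbm_encode(example_dlt)
decoded_dlt = rcbm_decode(min_value, diff_max, flags)
```

For this toy DLT, only 7 flags are needed for the range 50 to 56, instead of one flag for each of the 256 possible 8-bit depth values.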
Prior-art encoding methods for DLT signaling do not fully exploit the characteristics of the signal; consequently, the coding efficiency of the DLT can be further increased.
Signaling the DLT only once per sequence and for each view separately results in a very low overhead for the DLT values, but is relatively inflexible in terms of temporal and spatial (inter-view) updating.
The alternative solution, which signals the DLT in the slice header of I-Slices for the base view and inherits that DLT for the dependent views, cannot update the lookup table in the temporal direction more often than at the I-Slices and does not allow for an inter-view update of the DLT. The assumption that the DLT values are always the same for all coded views is in many cases too restrictive and results in reduced depth map quality in the dependent views. If the depth map of a dependent view shows histogram characteristics that differ from those of the base view, the reconstruction of that depth map cannot reach all original depth values, because the non-optimal DLT is simply copied.
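The limitation of a plainly copied DLT can be illustrated with a small Python sketch. The helper name unreachable_values and the depth values used are hypothetical and chosen only for illustration.

```python
def unreachable_values(inherited_dlt, dependent_view_values):
    """Depth values of the dependent view that have no entry in the inherited DLT."""
    return sorted(set(dependent_view_values) - set(inherited_dlt))

# Hypothetical histograms: the dependent view contains two values the base view lacks.
base_view_dlt = [50, 120, 200]
dependent_values = [50, 90, 200, 210]

missing = unreachable_values(base_view_dlt, dependent_values)
```

Here the values 90 and 210 occur in the dependent view but have no index in the inherited DLT, so an index-based reconstruction cannot reproduce them exactly.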