Digital images and videos with a bit depth greater than eight are increasingly desirable in many application fields such as medical image processing, digital cinema workflows in production and postproduction, and home theater related applications. Bit depth scalability is potentially useful because conventional eight-bit and high bit depth digital imaging systems are expected to coexist for some time in the future.
There are several ways to handle the coexistence of an 8-bit video and a 10-bit video. In a first prior art solution, only a 10-bit coded bit-stream is transmitted, and the 8-bit representation for standard 8-bit display devices is obtained by applying tone mapping methods to the 10-bit representation. In a second prior art solution, a simulcast bit-stream is transmitted that includes an 8-bit coded representation and a 10-bit coded representation. The decoder chooses which bit depth to decode. For example, a 10-bit capable decoder can decode and output a 10-bit video, while a normal decoder supporting only 8 bits can output only an 8-bit video.
The first prior art solution is inherently non-compliant with the 8-bit profiles of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”). The second prior art solution is compliant with all the current standards but requires more overhead. A scalable solution, however, can offer a good tradeoff between bit reduction and backward standard compatibility. Scalable video coding (SVC), also known as the scalable extension of the MPEG-4 AVC Standard, is considering support for bit depth scalability.
There are at least three advantages of bit depth scalable coding over post-processing or simulcast. One advantage is that bit depth scalable coding enables 10-bit video in a backward-compatible manner with the High Profiles of the MPEG-4 AVC Standard. A second advantage is that bit depth scalable coding enables adaptation to different network bandwidths or device capabilities. A third advantage is that bit depth scalable coding provides low complexity, high efficiency and high flexibility.
MPEG-4 AVC SVC Extension
In the current version of the SVC extension of the MPEG-4 AVC Standard, single-loop decoding is supported to reduce decoding complexity. The complete decoding, including motion-compensated prediction and deblocking, of the inter-coded macroblocks is only required for the current spatial or coarse-grain scalability (CGS) layer. This is realized by constraining the inter-layer intra texture prediction to those parts of the lower layer picture that are coded with intra macroblocks. To extend inter-layer intra texture prediction for bit depth scalability, inverse tone mapping is used. SVC also supports inter-layer residue prediction. Since tone mapping is typically used in the pixel domain, it is very hard to find a corresponding inverse tone mapping in the residue domain. In third and fourth prior art approaches, bit shift is used for inter-layer residue prediction.
In the joint draft 8 (JD8) of the scalable video coding (SVC) extension of the MPEG-4 AVC Standard, hereinafter also referred to as the third prior art approach, a technique referred to as smooth reference prediction (SRP) is proposed. A one-bit syntax element smoothed_reference_flag is sent when the syntax elements residual_prediction_flag and base_mode_flag are both set. When smoothed_reference_flag=1, the following steps are taken at the decoder to obtain the reconstructed video block:
1. The prediction block P is obtained using the enhancement layer reference frames and up-sampled motion vectors from the base layer.
2. The corresponding base layer residual block rb is up-sampled and U(rb) is added to P to form P+U(rb).
3. A smoothing filter with tap [1,2,1] is applied, first in the horizontal direction and then in the vertical direction, to obtain S(P+U(rb)).
4. The enhancement layer residual block re is added to the result of the immediately preceding step (3) to obtain the reconstruction block R=S(P+U(rb))+re.
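The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the normative JD8 process: the function names (smooth_1d, smooth_2d, upsample_nearest, srp_reconstruct) are hypothetical, the [1,2,1] filter is assumed to be normalized by 4 with rounding and edge-sample replication, nearest-neighbor repetition stands in for the up-sampling operator U, and the prediction block P is assumed to be supplied by motion compensation performed elsewhere.

```python
def smooth_1d(samples):
    """Apply a [1,2,1] filter along one dimension.

    Assumptions: normalization by 4 with rounding, and edge samples
    replicated at the block boundary.
    """
    n = len(samples)
    out = []
    for i in range(n):
        left = samples[max(i - 1, 0)]
        right = samples[min(i + 1, n - 1)]
        out.append((left + 2 * samples[i] + right + 2) >> 2)
    return out


def smooth_2d(block):
    """Step 3: filter horizontally first, then vertically, giving S(.)."""
    horiz = [smooth_1d(row) for row in block]
    cols = [smooth_1d(list(col)) for col in zip(*horiz)]
    return [list(row) for row in zip(*cols)]


def upsample_nearest(block, factor):
    """Stand-in for U(.): repeat each sample 'factor' times in both directions."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def srp_reconstruct(pred, base_residual, enh_residual, factor=1):
    """Compute R = S(P + U(rb)) + re for one block.

    pred          -- P, the enhancement layer prediction block (step 1)
    base_residual -- rb, the base layer residual block
    enh_residual  -- re, the enhancement layer residual block
    """
    u_rb = upsample_nearest(base_residual, factor)            # step 2: U(rb)
    summed = [[p + u for p, u in zip(p_row, u_row)]           # step 2: P + U(rb)
              for p_row, u_row in zip(pred, u_rb)]
    smoothed = smooth_2d(summed)                              # step 3: S(P + U(rb))
    return [[s + e for s, e in zip(s_row, e_row)]             # step 4: add re
            for s_row, e_row in zip(smoothed, enh_residual)]
```

For a flat block the smoothing filter leaves the samples unchanged, so a prediction of 100, an up-sampled base layer residual of 4, and an enhancement layer residual of 1 reconstruct to 105 at every position.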
Further, a fourth prior art approach proposes techniques for inter-layer residue prediction for BDS (Bit Depth Scalability). The fourth prior art approach converts the inverse tone mapping problem from the residue domain to the pixel domain for inter-layer residue prediction. If inter-layer residue prediction is used, then inverse tone mapping is applied to the sum of the tone mapped motion compensated prediction and the up-sampled residue from the base layer. When inter-layer residue prediction is used, the following steps are taken at the decoder to obtain the reconstructed video block:
1. The prediction block P is obtained using the enhancement layer reference frames and then P is tone mapped into the base layer bit depth to obtain T(P).
2. The corresponding base layer residual block rb is spatially up-sampled and U(rb) is added to T(P) to form T(P)+U(rb).
3. A filter is used to obtain S(T(P)+U(rb)).
4. Inverse tone mapping is then applied to obtain T⁻¹(S(T(P)+U(rb))).
5. The enhancement layer residual block re is added to the result of the immediately preceding step (4) to obtain the reconstruction block R=T⁻¹(S(T(P)+U(rb)))+re.
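The five steps above can be illustrated with a short Python sketch. The text does not specify the tone mapping operator, its inverse, or the filter S, so everything concrete here is an assumption made for illustration: tone mapping is modeled as a right shift from 10 bits to 8 bits, inverse tone mapping as the matching left shift, the filter S defaults to the identity, the base layer residual is assumed to be already up-sampled, and clipping to the valid sample range is omitted. The names tone_map, inverse_tone_map and bds_reconstruct are hypothetical.

```python
def tone_map(block, shift=2):
    """Assumed T(.): drop the 'shift' least significant bits (10-bit to 8-bit)."""
    return [[v >> shift for v in row] for row in block]


def inverse_tone_map(block, shift=2):
    """Assumed inverse of T(.): scale back up by a left shift (8-bit to 10-bit)."""
    return [[v << shift for v in row] for row in block]


def bds_reconstruct(pred, base_residual_up, enh_residual, smooth=None, shift=2):
    """Reconstruct one block by inverse tone mapping S(T(P) + U(rb)) and adding re.

    pred             -- P, enhancement layer prediction at the high bit depth
    base_residual_up -- U(rb), base layer residual, already up-sampled
    enh_residual     -- re, enhancement layer residual at the high bit depth
    smooth           -- optional callable implementing S(.); identity if None
    """
    t_p = tone_map(pred, shift)                                # step 1: T(P)
    summed = [[p + u for p, u in zip(p_row, u_row)]            # step 2: T(P) + U(rb)
              for p_row, u_row in zip(t_p, base_residual_up)]
    filtered = smooth(summed) if smooth is not None else summed  # step 3: S(.)
    restored = inverse_tone_map(filtered, shift)               # step 4: back to high bit depth
    return [[r + e for r, e in zip(r_row, e_row)]              # step 5: add re
            for r_row, e_row in zip(restored, enh_residual)]
```

With P = [400, 404], U(rb) = [3, -1] and re = [1, 2], the sketch computes T(P) = [100, 101], a base layer sum of [103, 100], an inverse tone mapped result of [412, 400], and a reconstruction of [413, 402]. The example shows why the residue is added in the base layer bit depth: the base layer residual only has meaning relative to a tone mapped prediction.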
However, all of the preceding prior art solutions are deficient. For example, the third prior art approach cannot handle different bit depths in the enhancement and base layers, due to the lack of tone mapping and inverse tone mapping operations. Moreover, with respect to the fourth prior art approach, there is room for improvement in the accuracy of the enhancement layer prediction.