The video coding standard H.264/AVC provides various video coding modes and selects among them dynamically according to rate-distortion optimization (RDO). Its extension for Scalable Video Coding (SVC) provides different layers and, for spatial scalability, supports either direct encoding of the enhancement layer (EL) or inter-layer prediction. In direct encoding of the EL, a mode called I_N×N, redundancy between layers is not used: the EL is purely intra coded.
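As an illustration of RDO-based mode selection, the following minimal sketch picks the candidate with the lowest Lagrangian cost J = D + λ·R, which is the usual formulation of rate-distortion optimization. The function name, the mode labels and the distortion and rate figures are hypothetical and chosen only for the example; they are not taken from the standard or from any reference encoder.

```python
def select_mode(candidates, lam):
    """Return (name, cost) of the candidate with the lowest cost J = D + lam * R.

    candidates: iterable of (name, distortion, rate_bits) tuples (hypothetical values).
    lam: Lagrange multiplier trading distortion against rate.
    """
    best_name, best_cost = None, float("inf")
    for name, distortion, rate_bits in candidates:
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Illustrative numbers only: direct EL coding gives lower distortion but
# costs more bits than inter-layer prediction in this made-up case.
example_candidates = [
    ("direct_EL", 120.0, 300),
    ("inter_layer_prediction", 150.0, 180),
]
```

With these made-up figures, a large multiplier favours the cheaper mode and a small multiplier favours the mode with lower distortion, which is the trade-off that dynamic mode selection exploits.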
Inter-layer prediction is used in two coding modes, namely I_BL if the base layer (BL) is intra-coded, and residual prediction if the BL is inter-coded, so that BL and EL residuals are generated. With residual prediction, an EL residual is predicted from the BL residual.
For intra-coded EL macroblocks (MBs), SVC supports two types of coding modes, namely the original H.264/AVC I_N×N coding (spatial prediction, base_mode_flag=0) and I_BL, a special SVC coding mode for scalability in which an EL MB is predicted from a collocated BL MB.
For inter-coding, the first step is generating BL and EL differential images called residuals. Residual inter-layer prediction is done for encoding the difference between the BL residual and the EL residual.
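The residual inter-layer prediction described above can be sketched as follows. This is a simplified illustration under stated assumptions: residuals are plain 2-D lists of integers, the spatial ratio between layers is an integer factor, and nearest-neighbour upsampling stands in for the upsampling filter of the actual codec. All function names are hypothetical.

```python
def upsample_nearest(block, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor.

    Stand-in for the real inter-layer upsampling filter (assumption).
    """
    out = []
    for row in block:
        wide = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(wide))
    return out


def predict_el_residual(el_residual, bl_residual, factor=2):
    """Return the difference between the EL residual and the upsampled BL residual."""
    bl_up = upsample_nearest(bl_residual, factor)
    return [[e - b for e, b in zip(erow, brow)]
            for erow, brow in zip(el_residual, bl_up)]


def reconstruct_el_residual(difference, bl_residual, factor=2):
    """Decoder side: add the upsampled BL residual back to the difference."""
    bl_up = upsample_nearest(bl_residual, factor)
    return [[d + b for d, b in zip(drow, brow)]
            for drow, brow in zip(difference, bl_up)]
```

When the EL residual resembles the upsampled BL residual, the difference contains mostly small values, which generally cost fewer bits to encode than the EL residual itself.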
In recent years, a color depth higher than the conventional eight bits has become increasingly desirable in many fields, such as scientific imaging, digital cinema, computer games with high-quality video, and professional studio and home theatre applications. Accordingly, the state-of-the-art video coding standard H.264/AVC has included Fidelity Range Extensions (FRExt), which support up to 14 bits per sample and up to 4:4:4 chroma sampling.
For a scenario with two different decoders, or clients that request different bit depths, e.g. 8 bit and 12 bit for the same raw video, the existing H.264/AVC solution is to encode the 12-bit raw video to generate a first bitstream, and then convert the 12-bit raw video to an 8-bit raw video and encode it to generate a second bitstream. If the video is to be delivered to different clients who request different bit depths, it has to be delivered twice, e.g. the two bitstreams are stored together on one disk. This is inefficient with regard to both the compression ratio and the operational complexity.
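The two-bitstream approach described above can be sketched as follows. The text does not specify how the 12-bit video is converted to 8 bit, so the sketch assumes a plain right shift that drops the least significant bits; a real system might use rounding or tone mapping instead. The `encode` argument is a hypothetical stand-in for an H.264/AVC encoder, and all names are illustrative.

```python
def to_lower_bit_depth(samples, src_bits=12, dst_bits=8):
    """Reduce sample bit depth by dropping the least significant bits (assumed method)."""
    if dst_bits > src_bits:
        raise ValueError("dst_bits must not exceed src_bits")
    shift = src_bits - dst_bits
    return [s >> shift for s in samples]


def simulcast(raw_12bit, encode):
    """Encode the same video twice, once per bit depth.

    encode: hypothetical callable (samples, bit_depth) -> bitstream.
    Returns the first (12-bit) and second (8-bit) bitstreams.
    """
    first = encode(raw_12bit, 12)
    second = encode(to_lower_bit_depth(raw_12bit, 12, 8), 8)
    return first, second
```

Because the two encodings are independent, any redundancy between the 8-bit and 12-bit versions is not exploited, and both bitstreams have to be stored or transmitted in full.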
The European Patent application EP06291041 discloses a scalable solution that encodes the whole 12-bit raw video once to generate one bitstream containing an H.264/AVC compatible BL and a scalable EL. Due to redundancy reduction, the overhead of the whole scalable bitstream relative to the above-mentioned first bitstream is small compared to the additional second bitstream. If an H.264/AVC decoder is available at the receiving end, only the BL sub-bitstream is decoded, and the decoded 8-bit video can be viewed on a conventional 8-bit display device; if a bit depth scalable decoder is available at the receiving end, both the BL and the EL sub-bitstreams may be decoded to obtain the 12-bit video, which can be viewed on a high quality display device that supports color depths of more than eight bits.