A video coding standard known as High Efficiency Video Coding (HEVC or ITU-T H.265) has been developed, which may provide substantially higher compression efficiency than other coding standards such as H.264/AVC (MPEG-4 Part 10 Advanced Video Coding). Additionally, the scalable extension of HEVC (SHVC) provides a layered HEVC-based video coding scheme that is comparable to the scalable video coding standard SVC, which is based on a base layer and enhancement layers. In such a scheme, a decoder decodes the base layer, generates the output frame, and upscales this frame to the resolution of the enhancement layer so that it can be used for further decoding of the enhancement layer. The upscaled base layer frame is then used as a reference frame when decoding an enhancement layer frame in a second loop, resulting in the reconstruction of the high resolution frame. Due to the decoding dependency of the enhancement layer, a delay is introduced in the decoding scheme that scales with the number of enhancement layers.
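The sequential dependency described above can be illustrated with a minimal sketch. This is a toy model and not SHVC itself: the function names `upsample_nearest` and `decode_multi_loop`, the nearest-neighbour upscaling, the fixed factor of 2 and the purely additive residual are all assumptions made for illustration. It only shows that each enhancement layer loop has to wait for the upscaled output of the layer below it.

```python
def upsample_nearest(frame, factor=2):
    """Upscale a 2-D list of samples by repeating each sample.

    Hypothetical stand-in for the inter-layer resampling step.
    """
    out = []
    for row in frame:
        wide = [sample for sample in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def decode_multi_loop(base_frame, residuals):
    """Reconstruct the highest layer from a decoded base frame.

    `residuals` holds one residual frame per enhancement layer (assumed
    already entropy-decoded). Each iteration needs the reconstruction of
    the previous layer, so the loops cannot start independently.
    Returns the reconstructed frame and the number of decoding loops run.
    """
    recon = base_frame
    loops = 1  # the base layer loop
    for residual in residuals:
        reference = upsample_nearest(recon)  # depends on the lower layer
        recon = [
            [ref + res for ref, res in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, residual)
        ]
        loops += 1
    return recon, loops


if __name__ == "__main__":
    base = [[1, 2], [3, 4]]
    one_layer = [[[0] * 4 for _ in range(4)]]
    frame, loops = decode_multi_loop(base, one_layer)
    print(loops, len(frame), len(frame[0]))
```

In this sketch the loop count equals one plus the number of enhancement layers, which mirrors the statement that the delay scales with the number of enhancement layers.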
US2015/0103886 describes an example of a video coding system that is based on SHVC. This design has the same disadvantage: due to the decoding dependencies between the base layer and the enhancement layers, multiple decoding loops (two, or more when there is more than one enhancement layer) have to be processed sequentially. As a result, even if parallelization is implemented by decoding the base layer and the enhancement layers in two different processes, both processes operate with a delay of at least one frame, or even multiple frames depending on the coding hierarchy. This results in delays when a user wants to switch to a higher resolution, or when tuning in to a broadcast stream, where the decoder first needs to decode the base layer and then the enhancement layers.
Additionally, in SHVC, different resolution versions of the original high resolution video signal are generated by a sequence of downsampling steps that reduce the high resolution video signal to different low resolution versions. Similarly, when reconstructing the original high resolution video signal, a number of sequential upsampling steps is required that scales with the number of enhancement layers that need to be added to the base layer. After each upsampling step the buffer occupancy increases. Hence, a different buffer size needs to be defined for each layer, and the dependencies between layers require up- or downsampling of the resolution.
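The growth in buffer size per layer can be made concrete with a small calculation. The sketch below rests on assumptions that are not taken from the text above: 8-bit 4:2:0 sampling (1.5 bytes per pixel), a scaling factor of 2 in each dimension between successive layers, and a hypothetical 960x540 base layer. The function names are likewise illustrative.

```python
def frame_bytes(width, height):
    """Bytes for one 8-bit 4:2:0 frame: full-size luma plus two
    quarter-size chroma planes (assumed format)."""
    return width * height * 3 // 2


def layer_buffer_sizes(base_width, base_height, num_enhancement_layers, factor=2):
    """Return (width, height, bytes) for the base layer and each
    enhancement layer, assuming the same scaling factor per layer."""
    sizes = []
    width, height = base_width, base_height
    for _ in range(num_enhancement_layers + 1):
        sizes.append((width, height, frame_bytes(width, height)))
        width, height = width * factor, height * factor
    return sizes


if __name__ == "__main__":
    for w, h, size in layer_buffer_sizes(960, 540, 2):
        print(f"{w}x{h}: {size} bytes per frame")
```

Under these assumptions each additional layer quadruples the size of one frame buffer, so every layer needs its own buffer dimensioning.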
More generally, a multi-loop video coding design such as SHVC introduces high implementation complexity and high memory consumption, since the decoder needs to keep the decoded frames in memory for as long as they are needed for decoding dependent frames of the enhancement layers. This complexity makes codec designs such as SHVC less attractive for the fast development of a hardware implementation, which is required for industry acceptance.
Hence, from the above it follows that there is a need in the art for improved spatial scalable coding schemes that reduce complexity and/or delays at both the encoding and the decoding side. In particular, there is a need in the art for improved spatial scalable coding schemes that allow a high level of parallelization of the decoder operations.