The present invention is concerned with low delay coding of pictures.
In the current HEVC design, Slices, Entropy Slices (formerly Light Weight Slices), Tiles and WPP (Wavefront Parallel Processing) are included as tools for parallelization.
For the parallelization of video encoders and decoders, picture-level partitioning has several advantages compared to other approaches. In previous video codecs, like H.264/AVC [1], picture partitions were only possible with regular slices, at a high cost in terms of coding efficiency. For scalable parallel H.264/AVC decoding, it is necessary to combine macroblock-level parallelism for picture reconstruction with frame-level parallelism for entropy decoding. This approach, however, provides only a limited reduction in picture latency and entails high memory usage. In order to overcome these limitations, new picture partition strategies have been included in the HEVC codec. The current reference software version (HM-6) contains four different approaches: regular or normal slices, entropy slices, wavefront parallel processing (WPP) sub-streams and tiles. Typically, those picture partitions comprise a set of Largest Coding Units (LCUs), or, synonymously, Coding Tree Units (CTUs), as defined in HEVC, or even a subset of those.
FIG. 1 shows a picture 898 exemplarily partitioned into regular slices 900, one per row 902 of LCUs or macroblocks in the picture. Regular or normal slices (as defined in H.264 [1]) have the largest coding penalty, as they break entropy decoding and prediction dependencies.
Entropy slices, like slices, break entropy decoding dependencies but allow prediction (and filtering) to cross slice boundaries.
In WPP the picture partitions are row interleaved, and both entropy decoding and prediction are allowed to use data from blocks in other partitions. In this way coding losses are minimized while at the same time wavefront parallelism can be exploited. The interleaving, however, violates bitstream causality as a prior partition needs a next partition to decode.
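The row-interleaved dependency pattern described above can be sketched as a simple schedule. The following is an illustrative model only, not part of the described codec: it assumes one decoding thread per LCU row, one time step per LCU, and that an LCU may start once its left neighbour and the LCU `lag - 1` positions to the right in the row above are finished. The default `lag=2` reflects the usual wavefront offset and is an assumption not stated in the text; the function name `wavefront_schedule` is hypothetical.

```python
def wavefront_schedule(rows, cols, lag=2):
    """Return the earliest start step of every LCU under wavefront order.

    Assumptions (illustrative): one thread per row, one step per LCU,
    and row r may process column c only after row r-1 has finished
    column c + lag - 1 (clamped to the last column).
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Left neighbour in the same row must be finished first.
            left = start[r][c - 1] + 1 if c > 0 else 0
            if r > 0:
                # Dependency on the row above, clamped at the picture edge.
                dep_col = min(c + lag - 1, cols - 1)
                above = start[r - 1][dep_col] + 1
            else:
                above = 0
            start[r][c] = max(left, above)
    return start


if __name__ == "__main__":
    for row in wavefront_schedule(3, 4):
        print(row)
```

Under these assumptions, each row starts two steps after the row above it, so a picture of 3 rows by 4 LCUs finishes in 8 steps instead of the 12 needed for strictly sequential decoding.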
FIG. 2 exemplarily shows a picture 898 divided up into two rows 904a, 904b of horizontally partitioning tiles 906. Tiles define horizontal boundaries 908 and vertical boundaries 910 that partition a picture 898 into tile columns 912a, 912b, 912c and tile rows 904a, 904b. Similar to regular slices 900, tiles 906 break entropy decoding and prediction dependencies, but do not necessitate a header for each tile.
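As a minimal sketch of such a tile grid, the following maps an LCU position to the tile that contains it. The boundary positions, the picture size in LCUs, and the function name `tile_index` are hypothetical values chosen for illustration; they are not taken from FIG. 2 beyond its layout of three tile columns and two tile rows.

```python
from bisect import bisect_right


def tile_index(lcu_x, lcu_y, col_bounds, row_bounds):
    """Return the raster-order index of the tile containing LCU (lcu_x, lcu_y).

    col_bounds and row_bounds list, in ascending order, the LCU column and
    row at which each interior vertical or horizontal boundary begins.
    """
    tile_col = bisect_right(col_bounds, lcu_x)
    tile_row = bisect_right(row_bounds, lcu_y)
    return tile_row * (len(col_bounds) + 1) + tile_col


if __name__ == "__main__":
    # Hypothetical picture of 9 x 4 LCUs: vertical boundaries before LCU
    # columns 3 and 6, one horizontal boundary before LCU row 2.
    cols, rows = [3, 6], [2]
    for y in range(4):
        print([tile_index(x, y, cols, rows) for x in range(9)])
```

With two vertical boundaries and one horizontal boundary, the grid contains six tiles, indexed 0 to 5 in raster order.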
For each of these techniques, the number of partitions can be freely chosen by the encoder. In general, having more partitions leads to higher compression losses. In WPP, however, the loss propagation is not so high, and therefore the number of picture partitions can even be fixed to one per row. This also leads to several advantages. First, for WPP bitstream causality is guaranteed. Second, decoder implementations can assume that a certain amount of parallelism is available, which also increases with the resolution. And, finally, none of the context selection and prediction dependencies have to be broken when decoding in wavefront order, resulting in relatively low coding losses.
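The statement that the available parallelism grows with the resolution follows directly from fixing one partition per LCU row. A small worked sketch, assuming an LCU size of 64 luma samples (an assumption; the text does not fix the LCU size) and a hypothetical helper name `wpp_partitions`:

```python
def wpp_partitions(picture_height, lcu_size=64):
    """Number of LCU rows, i.e. WPP partitions at one partition per row.

    Uses ceiling division because a partially covered bottom row still
    counts as a full LCU row.
    """
    return (picture_height + lcu_size - 1) // lcu_size


if __name__ == "__main__":
    for height in (720, 1080, 2160):
        print(height, wpp_partitions(height))
```

Under this assumption, a picture 1080 samples high yields 17 partitions and a picture 2160 samples high yields 34.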
However, until now none of the parallel coding concepts has succeeded in achieving high compression efficiency while keeping the delay low. This is also true for the WPP concept. Slices are the smallest units of transportation in the coding pipeline, and several WPP substreams still have to be transported serially.