Scalable video coding refers to a coding structure in which one bitstream can contain multiple representations of the content at different bitrates, resolutions or frame rates. In these cases the receiver can extract the desired representation depending on its characteristics. Alternatively, a server or a network element can extract the portions of the bitstream to be transmitted to the receiver depending on, e.g., the network characteristics or the processing capabilities of the receiver. A scalable bitstream typically consists of a base layer providing the lowest quality video available and one or more enhancement layers that enhance the video quality when received and decoded together with the lower layers. In order to improve coding efficiency for the enhancement layers, the coded representation of an enhancement layer typically depends on the lower layers.
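As an illustration of how a server or network element might extract a representation, the following minimal sketch parses the two-byte H.265/HEVC NAL unit header and keeps only the NAL units at or below a target layer and temporal sub-layer. It is a simplification under stated assumptions: every layer with an identifier not greater than the target is assumed to be needed, and parameter sets and SEI messages receive no special handling. The names parse_nal_header, extract_layers and make_nal are hypothetical helpers introduced here for illustration only.

```python
from typing import List, NamedTuple


class NalHeader(NamedTuple):
    nal_unit_type: int
    nuh_layer_id: int
    temporal_id: int


def parse_nal_header(nal: bytes) -> NalHeader:
    """Parse the two-byte H.265/HEVC NAL unit header.

    Bit layout: forbidden_zero_bit (1), nal_unit_type (6),
    nuh_layer_id (6), nuh_temporal_id_plus1 (3).
    """
    if len(nal) < 2:
        raise ValueError("NAL unit shorter than its two-byte header")
    b0, b1 = nal[0], nal[1]
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    temporal_id = (b1 & 0x07) - 1
    return NalHeader(nal_unit_type, nuh_layer_id, temporal_id)


def make_nal(nal_unit_type: int, nuh_layer_id: int, temporal_id: int) -> bytes:
    """Build a header-only NAL unit (hypothetical helper for the example)."""
    b0 = (nal_unit_type << 1) | (nuh_layer_id >> 5)
    b1 = ((nuh_layer_id & 0x1F) << 3) | (temporal_id + 1)
    return bytes([b0, b1])


def extract_layers(nal_units: List[bytes], max_layer_id: int,
                   max_temporal_id: int) -> List[bytes]:
    """Keep NAL units whose layer and temporal identifiers do not exceed
    the targets. Assumption: all lower layers are required by the target."""
    kept = []
    for nal in nal_units:
        header = parse_nal_header(nal)
        if (header.nuh_layer_id <= max_layer_id
                and header.temporal_id <= max_temporal_id):
            kept.append(nal)
    return kept
```

For example, the header bytes 0x40 0x01 parse as nal_unit_type 32 (a video parameter set) at layer 0 and temporal sub-layer 0, so such a NAL unit is retained for any target.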
The bitstream format of H.264/AVC or H.265/HEVC does not include an indication of an end of an access unit. Consequently, the end of an access unit may have to be concluded based on the detection of the start of the next access unit. In low-latency applications, the data from which the start of the next access unit can be concluded may be received significantly later, e.g. after a delay of one picture interval.
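The dependency on the next access unit can be sketched as follows. The sketch assumes, purely for illustration, that every access unit begins with an access unit delimiter NAL unit (nal_unit_type 35 in H.265/HEVC); such delimiters are optional in practice, and the actual rules for detecting the first NAL unit of an access unit are more involved. The function split_access_units is a hypothetical helper. It reports, for each access unit, the index of the NAL unit whose arrival allowed the end of that access unit to be concluded.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

AUD_NUT = 35  # access unit delimiter nal_unit_type in H.265/HEVC


def split_access_units(
    nal_types: Iterable[int],
) -> Iterator[Tuple[List[int], Optional[int]]]:
    """Group NAL unit types into access units.

    Assumption: each access unit starts with an access unit delimiter.
    Yields (access_unit, trigger_index), where trigger_index is the index
    of the NAL unit that revealed the end of the access unit, or None if
    the access unit was closed only by the end of the stream.
    """
    current: List[int] = []
    for index, nal_type in enumerate(nal_types):
        if nal_type == AUD_NUT and current:
            # The previous access unit is known to be complete only now,
            # when the first NAL unit of the next access unit has arrived.
            yield current, index
            current = []
        current.append(nal_type)
    if current:
        yield current, None
```

In this model the trigger index always belongs to the following access unit, which is the source of the delay in low-latency applications: the decoder cannot be told that an access unit is complete until data of the next one has been received.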
In the multi-layer H.265/HEVC extensions, such as SHVC and MV-HEVC, an access unit is not required to contain a picture unit for every layer. In other words, a picture unit at layer A may be present in one access unit and absent from another access unit. It is therefore not possible to conclude from the layer identifier value of a picture unit whether it is the last picture unit of an access unit.
A further problem arises from the fact that a multi-layer bitstream may be subject to layer extraction in the sender and/or in one or more gateways or the like. An indication of an end of an access unit should be resilient to layer extraction so that decoders can conclude an end of an access unit reliably even if the bitstream has been subject to layer extraction. Particularly, if the highest layer(s) are removed from the bitstream, decoders should still have means to conclude an end of an access unit.
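The two problems above can be illustrated with a toy model in which each picture unit is represented as a pair (access unit index, layer identifier). The sketch compares a naive rule, "a picture unit at the highest layer of the bitstream ends the access unit", against the ground truth. The model and the function names are assumptions made for illustration; a real decoder does not have the access unit index available.

```python
from typing import List, Tuple

PictureUnit = Tuple[int, int]  # (access unit index, layer identifier)


def naive_last_indices(units: List[PictureUnit],
                       highest_layer_id: int) -> List[int]:
    """Indices flagged as 'last in access unit' by the naive rule."""
    return [i for i, (_, layer) in enumerate(units)
            if layer == highest_layer_id]


def true_last_indices(units: List[PictureUnit]) -> List[int]:
    """Ground truth: index of the final picture unit of each access unit."""
    last = {}
    for i, (au, _) in enumerate(units):
        last[au] = i
    return sorted(last.values())


# Access unit 1 carries no picture unit at layer 1.
units = [(0, 0), (0, 1), (1, 0), (2, 0), (2, 1)]

# The same bitstream after the highest layer has been removed.
base_only = [u for u in units if u[1] == 0]
```

With these inputs, true_last_indices(units) gives [1, 2, 4] whereas naive_last_indices(units, 1) gives [1, 4], so the end of access unit 1 is missed. After extraction, naive_last_indices(base_only, 1) is empty although true_last_indices(base_only) is [0, 1, 2], so the naive rule never detects an end of an access unit.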