This section is intended to provide a background or context to the invention that is recited in the claims. The description herein may include concepts that could be pursued, but are not necessarily ones that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, what is described in this section is not prior art to the description and claims in this application and is not admitted to be prior art by inclusion in this section.
Scalable video coding refers to a coding structure where one bitstream can contain multiple representations of the content at different bitrates, resolutions or frame rates. In these cases the receiver can extract the desired representation depending on its characteristics. Alternatively, a server or a network element can extract the portions of the bitstream to be transmitted to the receiver depending on e.g. the network characteristics or processing capabilities of the receiver. A scalable bitstream typically consists of a base layer providing the lowest quality video available and one or more enhancement layers that enhance the video quality when received and decoded together with the lower layers. In order to improve coding efficiency for the enhancement layers, the coded representation of an enhancement layer typically depends on the lower layers.
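As an illustrative sketch only, the extraction of a representation from a scalable bitstream may be modeled as keeping the coded units of the target layer and of all lower layers on which it depends. The names CodedUnit, layer_id and extract_representation are hypothetical and are not taken from any standard or from the claims; a strictly linear layer dependency is assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CodedUnit:
    """Hypothetical coded unit of a scalable bitstream."""
    layer_id: int   # 0 = base layer, 1.. = enhancement layers (assumption)
    payload: bytes


def extract_representation(bitstream: List[CodedUnit],
                           target_layer: int) -> List[CodedUnit]:
    """Keep the target layer and every lower layer, preserving bitstream order.

    Assumes each enhancement layer depends only on the layers below it.
    """
    return [unit for unit in bitstream if unit.layer_id <= target_layer]


# Example bitstream: base layer plus two enhancement layers, interleaved.
example_bitstream = [
    CodedUnit(0, b"base-0"), CodedUnit(1, b"enh1-0"), CodedUnit(2, b"enh2-0"),
    CodedUnit(0, b"base-1"), CodedUnit(1, b"enh1-1"), CodedUnit(2, b"enh2-1"),
]

# A receiver, server or network element limited to the first enhancement layer.
reduced = extract_representation(example_bitstream, target_layer=1)
```

In this sketch the same function serves either the receiver or an intermediate network element; only the choice of target_layer differs.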
In order to support a client switching between different qualities and resolutions during a streaming session, encoded random access point pictures at the segment boundaries may be utilized. Conventionally, only instantaneous random access point (RAP) pictures, such as the instantaneous decoding refresh (IDR) picture, that start a so-called closed group of pictures (GOP) prediction structure have been used at segment boundaries of dynamic adaptive streaming over HTTP (DASH) representations. Support for intra pictures starting open GOPs, e.g. clean random access (CRA) pictures, has been improved in H.265/HEVC when compared to older standards, as a decoding process starting from a CRA picture has been normatively specified. When the decoding starts from a CRA picture, some pictures, referred to as random access skipped leading (RASL) pictures, following the CRA picture in decoding order but preceding the CRA picture in output order may not be decodable. Consequently, if open GOPs were used at segment boundaries in DASH, representation switching might result in the inability to decode the RASL pictures and hence a picture rate glitch in the playback. For example, if a prediction hierarchy of 8 pictures were used and the picture rate were 25 Hz, the video would be frozen for about one third of a second.
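The freeze duration in the example above follows from dividing the number of pictures in the prediction hierarchy by the picture rate. The short sketch below reproduces that arithmetic; the function name playback_freeze_seconds is hypothetical, and the sketch assumes that the whole hierarchy is counted as the stall interval, which is the reading consistent with the figure given in the text.

```python
def playback_freeze_seconds(pictures_in_hierarchy: int,
                            picture_rate_hz: float) -> float:
    """Approximate playback stall when the pictures of one prediction
    hierarchy cannot be output (assumption: the whole hierarchy is counted)."""
    if picture_rate_hz <= 0:
        raise ValueError("picture rate must be positive")
    return pictures_in_hierarchy / picture_rate_hz


# Values from the example: hierarchy of 8 pictures at 25 Hz.
freeze = playback_freeze_seconds(8, 25.0)  # 0.32 s, i.e. about one third of a second
```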