In many video coding systems, previously decoded pictures can be used to predict the image data of later pictures, so that only the difference needs to be encoded. As known in the art, this prediction greatly reduces the size of the coded data. The order in which pictures are coded or decoded need not be the same as the order in which they are output from the decoder. A picture order count (POC) may be coded into a bitstream and used in decoding to establish the output order of pictures as well as to adapt certain decoding processes, such as motion vector scaling and weights for weighted prediction. Furthermore, reference pictures may be identified by their POC values, for example in a reference picture set syntax structure, which identifies the reference pictures that may be used as reference for inter prediction of the current picture or subsequent pictures. Because POC values may be used to identify pictures in this way, they should be coded into the bitstream for each picture robustly enough that accidental data losses, e.g. packet losses during transmission, or intentional removal of pictures, such as removal of a temporal scalability layer, do not affect the decoding of the remaining pictures. Consequently, POC values should be coded for each picture with a relatively large number of bits.
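By way of illustration only, the two uses of POC mentioned above (establishing output order and scaling motion vectors) can be sketched as follows. This is a minimal sketch under stated assumptions, not the process of any particular coding standard: the names Picture, output_order and scale_mv are hypothetical, and the scaling uses plain division, whereas practical codecs typically use clipped fixed-point arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Picture:
    label: str  # hypothetical name, for readability only
    poc: int    # picture order count carried in the bitstream


def output_order(decoded):
    """Return decoded pictures sorted by POC, i.e. in output order."""
    return sorted(decoded, key=lambda pic: pic.poc)


def scale_mv(mv, tb, td):
    """Scale a motion vector by the ratio of two POC distances.

    tb and td are POC differences (assumed names). Simplified assumption:
    plain division instead of the fixed-point arithmetic used in practice.
    """
    if td == 0:
        raise ValueError("POC distance td must be non-zero")
    return (mv[0] * tb / td, mv[1] * tb / td)


# Hypothetical decoding order: the P picture is decoded before the
# B pictures that precede it in output order.
decoding_order = [
    Picture("I", 0),
    Picture("P", 4),
    Picture("B", 2),
    Picture("b1", 1),
    Picture("b3", 3),
]

pocs_in_output_order = [pic.poc for pic in output_order(decoding_order)]
print(pocs_in_output_order)
print(scale_mv((8, -4), tb=1, td=2))
```

In this sketch the decoder receives the pictures in the order I, P, B, b1, b3 and outputs them in increasing POC order; the motion vector (8, -4) is halved because the target POC distance is half the original one.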
Many video coding systems support temporal scalability. In other words, a subset of a coded video bitstream may be formed by excluding coded pictures, such that the subset bitstream provides a lower picture rate than the original bitstream. Temporal scalability can be used, for example, for bitrate adaptation in transmission systems and for so-called trick modes, e.g. fast-forward play. Pictures in a temporally scalable video bitstream are typically organized in layers, and a layer identifier, such as temporal_id in the H.264/AVC coding standard, is included in the bitstream. Temporal scalability can then be realized by including only certain layers in the subset bitstream. Thus, temporal scalability is conventionally available only at the granularity of whole layers.
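As a non-limiting illustration, forming a subset bitstream from temporal layers can be sketched as below. The sketch assumes the common convention that a picture in a given temporal layer is not predicted from pictures in higher layers, so that dropping the higher layers leaves the remainder decodable. The names CodedPicture and extract_temporal_subset are hypothetical, and the eight-picture dyadic layer assignment is an assumed example rather than a structure taken from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedPicture:
    poc: int          # picture order count
    temporal_id: int  # temporal layer identifier carried in the bitstream


def extract_temporal_subset(bitstream, max_temporal_id):
    """Keep only pictures whose temporal layer does not exceed the target."""
    return [pic for pic in bitstream if pic.temporal_id <= max_temporal_id]


# Assumed dyadic hierarchy over eight pictures, listed by POC for clarity:
# layer 0 holds POC 0 and 4, layer 1 holds POC 2 and 6, layer 2 holds the rest.
layer_of_poc = {0: 0, 1: 2, 2: 1, 3: 2, 4: 0, 5: 2, 6: 1, 7: 2}
bitstream = [CodedPicture(poc, tid) for poc, tid in layer_of_poc.items()]

half_rate = [pic.poc for pic in extract_temporal_subset(bitstream, 1)]
quarter_rate = [pic.poc for pic in extract_temporal_subset(bitstream, 0)]
print(half_rate)
print(quarter_rate)
```

With this assumed layer assignment, keeping layers 0 and 1 yields every second picture, and keeping only layer 0 yields every fourth picture, which shows that the achievable picture rates are limited to those defined by the layer boundaries.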
There is, therefore, a need for solutions that improve the reference picture handling process without undermining coding efficiency, improve compression for picture order count values, and provide more flexible signaling for temporal structure and scalability of video bitstreams.