H.264, also referred to as Moving Picture Experts Group-4 (MPEG-4) Advanced Video Coding (AVC), is the state-of-the-art video coding standard. It consists of a block-based hybrid video coding scheme that exploits temporal and spatial prediction.
High Efficiency Video Coding (HEVC) is a new video coding standard currently being developed in the Joint Collaborative Team on Video Coding (JCT-VC). JCT-VC is a collaborative project between MPEG and the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). Currently, a Working Draft (WD) is defined that includes large macroblocks (abbreviated LCUs for Largest Coding Units) and a number of other new tools, and that is considerably more efficient than H.264/AVC.
At a receiver, a decoder receives a bit stream representing pictures, i.e. video data packets of compressed data. The compressed data comprises payload and control information. The control information comprises, e.g., information about which reference pictures should be stored in a decoded picture buffer (DPB), also referred to as a reference picture buffer. This information is a relative reference to previously received pictures. Further, the decoder decodes the received bit stream and displays the decoded picture. In addition, the decoded pictures are stored in the decoded picture buffer according to the control information. These stored reference pictures are used by the decoder when decoding subsequent pictures.
A working assumption for the processes of decoded picture buffer operations in the working draft of HEVC is that they will be inherited from H.264/AVC to a very large extent. A simplified flow chart of the scheme as it is designed in H.264/AVC is shown in FIG. 1.
Before the actual decoding of a picture, the frame_num in the slice header is parsed to detect a possible gap in frame_num if the Sequence Parameter Set (SPS) syntax element gaps_in_frame_num_value_allowed_flag is 1. The frame_num indicates the decoding order. If a gap in frame_num is detected, “non-existing” frames are created and inserted into the decoded picture buffer.
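The gap check described above can be illustrated with a minimal Python sketch. It assumes that frame_num counts reference frames modulo a maximum value; the function name and the parameter names prev_ref_frame_num and max_frame_num are chosen for this illustration only and are not taken from the text above.

```python
def missing_frame_nums(prev_ref_frame_num, frame_num, max_frame_num):
    """Return the frame_num values skipped between the previous
    reference frame and the current picture (hypothetical helper).

    frame_num is assumed to wrap around modulo max_frame_num.
    An empty list means that no gap was detected.
    """
    missing = []
    if frame_num == prev_ref_frame_num:
        # Same frame_num as the previous reference frame: no gap.
        return missing
    expected = (prev_ref_frame_num + 1) % max_frame_num
    while expected != frame_num:
        # Each skipped value would correspond to one "non-existing"
        # frame inserted into the decoded picture buffer.
        missing.append(expected)
        expected = (expected + 1) % max_frame_num
    return missing


if __name__ == "__main__":
    print(missing_frame_nums(3, 6, 16))   # gap without wrap-around
    print(missing_frame_nums(14, 1, 16))  # gap across the wrap-around
```

Each value in the returned list would correspond to one “non-existing” frame to be created before the current picture is decoded.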
Regardless of whether there was a gap in frame_num or not, the next step is the actual decoding of the current picture. If the slice headers of the picture contain Memory Management Control Operations (MMCO) commands, an adaptive memory control process is applied after decoding of the picture to obtain relative reference to the pictures to be stored in the decoded picture buffer. Otherwise, a sliding window process is applied to obtain relative reference to the pictures to be stored in the decoded picture buffer. As a final step, the “bumping” process is applied to deliver the pictures in correct order.
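The choice between the two marking processes can be sketched as follows. This is a simplified Python model, not the normative H.264/AVC process: pictures are identified by plain integers, MMCO arguments name pictures and long-term frame indices directly, only MMCO types 1, 2, 3 and 6 are modeled, and the names mark_decoded_picture, sliding_window and max_ref_frames are assumptions made for this illustration.

```python
def sliding_window(short_term, long_term, max_ref_frames):
    """Drop the oldest short-term pictures until one slot is free
    for the current picture."""
    while short_term and len(short_term) + len(long_term) >= max_ref_frames:
        short_term.pop(0)


def mark_decoded_picture(short_term, long_term, pic, max_ref_frames, mmco=()):
    """Update the reference marking after decoding picture `pic`.

    short_term: list of picture ids, oldest first
    long_term:  dict mapping long-term frame index -> picture id
    mmco:       simplified commands as tuples (op, *args)
    """
    current_is_long_term = False
    if mmco:
        # Adaptive memory control: follow the explicit commands.
        for op, *args in mmco:
            if op == 1:    # short-term picture -> "unused for reference"
                short_term.remove(args[0])
            elif op == 2:  # long-term picture -> "unused for reference"
                del long_term[args[0]]
            elif op == 3:  # short-term picture -> long-term reference
                pic_id, lt_idx = args
                short_term.remove(pic_id)
                long_term[lt_idx] = pic_id
            elif op == 6:  # current picture -> long-term reference
                long_term[args[0]] = pic
                current_is_long_term = True
            else:
                raise NotImplementedError(f"MMCO {op} is not modeled")
    else:
        # No MMCO in the slice headers: implicit sliding window.
        sliding_window(short_term, long_term, max_ref_frames)
    if not current_is_long_term:
        short_term.append(pic)


if __name__ == "__main__":
    st, lt = [0, 1, 2, 3], {}
    mark_decoded_picture(st, lt, 4, max_ref_frames=4)
    print(st, lt)  # sliding window evicted the oldest short-term picture
    mark_decoded_picture(st, lt, 5, max_ref_frames=4, mmco=[(3, 1, 0), (1, 2)])
    print(st, lt)  # picture 1 became long-term, picture 2 was removed
```

In this model, the sliding window depends only on how many short-term and long-term pictures are currently stored, which is why the long-term count matters for every later picture.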
A problem with H.264/AVC is its vulnerability to losses of pictures that contain MMCO of type 2, 3, 4, 5 or 6 as described in Table 1 below.
TABLE 1
Memory management control operation values for H.264/AVC

memory_management_control_operation    Memory Management Control Operation
0    End memory_management_control_operation syntax element loop
1    Mark a short-term reference picture as “unused for reference”
2    Mark a long-term reference picture as “unused for reference”
3    Mark a short-term reference picture as “used for long-term reference” and assign a long-term frame index to it
4    Specify the maximum long-term frame index and mark all long-term reference pictures having long-term frame indices greater than the maximum value as “unused for reference”
5    Mark all reference pictures as “unused for reference” and set the MaxLongTermFrameIdx variable to “no long-term frame indices”
6    Mark the current picture as “used for long-term reference” and assign a long-term frame index to it
Loss of a picture that does not contain MMCO, or a picture that contains MMCO of type 0 or 1, is of course severe to the decoding process. Pixel values of the lost picture will not be available and may affect future pictures for a long period of time due to incorrect inter prediction. There is also a risk that reference picture lists for a few pictures following the lost picture will be wrong, for example if the lost picture contained MMCO that marked one short-term reference picture as “unused for reference” that otherwise would have been included in the reference picture list of the following picture. However, the decoding process can generally recover from such a loss through usage of constrained intra blocks, intra slices or by other means.
But if a picture containing MMCO of type 2, 3, 4, 5 or 6 is lost, there is a risk that the number of long-term pictures in the DPB is different from what it would have been had the picture been received, resulting in an “incorrect” sliding window process for all the following pictures. That is, the encoder and decoder will contain a different number of short-term pictures, resulting in out-of-sync behavior of the sliding window process. This loss cannot be recovered through usage of constrained intra blocks, intra slices or similar techniques (not even an open Group Of Picture (GOP) Intra picture). The only way to ensure recovery from such a loss is through an Instantaneous Decoder Refresh (IDR) picture or through an MMCO that cancels the effect of the lost MMCO. What makes the situation even worse is that a decoder will not necessarily know that the sliding window process is out-of-sync and thus cannot report the problem to the encoder or request an IDR picture even in applications where a feedback channel is available.
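The out-of-sync behavior described above can be reproduced with a small Python simulation. It is a deliberately reduced model under stated assumptions: pictures are plain integers, only MMCO types 1 and 3 are modeled, a lost picture is simply skipped (frame_num gap handling and error concealment are ignored), and the names simulate, PICTURES and max_ref_frames are hypothetical.

```python
def simulate(pictures, max_ref_frames, lost=frozenset()):
    """Return (short_term, long_term) after processing `pictures`.

    pictures: list of (picture id, list of (mmco op, target picture id))
    lost:     ids of pictures that never reach the decoder
    """
    short_term = []  # picture ids, oldest first
    long_term = []   # picture ids marked "used for long-term reference"
    for pic, mmco in pictures:
        if pic in lost:
            continue  # neither the picture nor its MMCO commands arrive
        if mmco:
            for op, target in mmco:
                if op == 1:    # short-term -> "unused for reference"
                    short_term.remove(target)
                elif op == 3:  # short-term -> long-term reference
                    short_term.remove(target)
                    long_term.append(target)
                else:
                    raise NotImplementedError(f"MMCO {op} is not modeled")
        else:
            # Sliding window: free one slot for the current picture.
            while short_term and len(short_term) + len(long_term) >= max_ref_frames:
                short_term.pop(0)
        short_term.append(pic)
    return short_term, long_term


# Picture 3 converts picture 0 to long-term and drops picture 1.
PICTURES = [
    (0, []), (1, []), (2, []),
    (3, [(3, 0), (1, 1)]),
    (4, []), (5, []),
]

encoder_view = simulate(PICTURES, max_ref_frames=3)
decoder_view = simulate(PICTURES, max_ref_frames=3, lost={3})

if __name__ == "__main__":
    print("encoder:", encoder_view)
    print("decoder:", decoder_view)
```

In this run the encoder ends with two short-term pictures and one long-term picture, while the decoder that lost picture 3 ends with three short-term pictures and no long-term picture. In the model, the mismatch persists for every later picture processed by the sliding window, since nothing in the subsequent pictures restores the missing long-term marking.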
One way to reduce the risk of losing important MMCO information is to use dec_ref_pic_marking_repetition Supplemental Enhancement Information (SEI) messages. However, the encoder will not know if the decoder is capable of making use of dec_ref_pic_marking_repetition SEI messages. Further, there is a risk that the dec_ref_pic_marking_repetition SEI message is also lost.
There is, thus, a need for efficient reference picture signaling and buffer management that does not suffer from the shortcomings and limitations of prior art solutions.