The present invention relates to video coding systems and, in particular, to management of synchronization between an encoder and a decoder in such systems.
In video coding systems, a video encoder may code a source video sequence into a coded representation that has a smaller bit rate than does the source video and thereby may achieve data compression. The encoder may code processed video data according to any of a variety of different coding techniques to achieve compression. One common technique for data compression is predictive coding (e.g., temporal/motion predictive coding). For example, some frames in a video stream may be coded independently (I-frames) and other frames (e.g., P-frames or B-frames) may be coded differentially using other frames as prediction references. P-frames may be coded with reference to a single previously coded frame (called a “reference frame”) and B-frames may be coded with reference to a pair of previously coded reference frames. The resulting compressed sequence (bit stream) may be transmitted to a decoder via a channel. To recover the video data, the bit stream may be decompressed at the decoder by inverting the coding processes performed by the encoder, yielding a recovered video sequence.
Such coding schemes require that the encoder and decoder synchronize their operations to properly associate coded frames with the reference frames on which they rely. In most existing video coding standards, each coded frame is assigned an index to indicate its display order. In H.264 and the emerging HEVC standard, ITU-T document JCTVC-J1003_d7 (herein, “HEVC”), such an index is called the “picture order count” (POC) and it is signaled in the slice header of every slice. POC is also used for other purposes. For example, in HEVC, POC is proposed for use as a frame identifier in the reference picture buffer.
To reduce the overhead of signaling POC, coding protocols usually specify that only the least significant bits (LSB) of the POC are to be signaled in the bit stream. The decoder then uses the following logic to determine the most significant bits (MSB) of the POC.
TABLE 1

if( ( pic_order_cnt_lsb < prevPicOrderCntLsb ) &&
    ( ( prevPicOrderCntLsb − pic_order_cnt_lsb ) >= ( MaxPicOrderCntLsb/2 ) ) )
    PicOrderCntMsb = prevPicOrderCntMsb + MaxPicOrderCntLsb
else if( ( pic_order_cnt_lsb > prevPicOrderCntLsb ) &&
    ( ( pic_order_cnt_lsb − prevPicOrderCntLsb ) > ( MaxPicOrderCntLsb/2 ) ) )
    PicOrderCntMsb = prevPicOrderCntMsb − MaxPicOrderCntLsb
else
    PicOrderCntMsb = prevPicOrderCntMsb

Under this scheme, a decoder may detect whether the difference of the POC LSBs of a current coded frame and the previously-coded frame is more than half of the maximal POC LSB value, in order to determine the MSB.
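The logic of Table 1 may be sketched in Python as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, MaxPicOrderCntLsb is assumed to be a power of two so that the halving is exact, and the full POC is assumed to be the sum of the derived MSB and the signaled LSB, as in H.264.

```python
def derive_poc_msb(poc_lsb, prev_poc_lsb, prev_poc_msb, max_poc_lsb):
    """Return PicOrderCntMsb for the current frame, following Table 1.

    Names are illustrative; max_poc_lsb is assumed to be a power of two.
    """
    half = max_poc_lsb // 2
    # LSB dropped by at least half the range: assume it wrapped forward.
    if poc_lsb < prev_poc_lsb and (prev_poc_lsb - poc_lsb) >= half:
        return prev_poc_msb + max_poc_lsb
    # LSB rose by more than half the range: assume it wrapped backward.
    if poc_lsb > prev_poc_lsb and (poc_lsb - prev_poc_lsb) > half:
        return prev_poc_msb - max_poc_lsb
    # Otherwise the frame lies in the same MSB period as the previous one.
    return prev_poc_msb


def full_poc(poc_lsb, prev_poc_lsb, prev_poc_msb, max_poc_lsb):
    """Combine the derived MSB with the signaled LSB (assumed MSB + LSB)."""
    msb = derive_poc_msb(poc_lsb, prev_poc_lsb, prev_poc_msb, max_poc_lsb)
    return msb + poc_lsb
```

For example, with MaxPicOrderCntLsb equal to 16, a previous frame having LSB 15 and MSB 0 followed by a frame having LSB 0 is treated as a forward wrap, so the derived MSB becomes 16 and the full POC becomes 16.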
The above method works well when the decoder correctly receives all the frames from the encoder. In video communication over error-prone networks, however, video frames can be lost. If more than MaxPicOrderCntLsb/2 consecutive frames are lost during transmission, the above logic will no longer work correctly and can result in a mismatch in POC values between the encoder and the decoder. Modern coding environments include error-prone networks, such as wireless networks, where intermittent losses of network connectivity can arise and can lead to the loss of significant portions of a coded video sequence. Accordingly, the inventors have identified a need for a coding protocol that provides for reliable synchronization of picture order counts between an encoder and a decoder even when communication errors cause losses of coded video data.
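The loss scenario described above may be illustrated with a small simulation. The following Python sketch rests on several assumptions that are not drawn from any standard text: frames are sent in display order with POC increasing by one per frame, MaxPicOrderCntLsb is 16, the decoder takes its "previous" values from the last frame it actually received, and the function names are hypothetical.

```python
def derive_poc_msb(poc_lsb, prev_poc_lsb, prev_poc_msb, max_poc_lsb):
    """MSB derivation in the style of Table 1 (illustrative names)."""
    half = max_poc_lsb // 2
    if poc_lsb < prev_poc_lsb and (prev_poc_lsb - poc_lsb) >= half:
        return prev_poc_msb + max_poc_lsb
    if poc_lsb > prev_poc_lsb and (poc_lsb - prev_poc_lsb) > half:
        return prev_poc_msb - max_poc_lsb
    return prev_poc_msb


def simulate(num_frames, lost, max_poc_lsb=16):
    """Return (encoder POC, decoder POC) pairs for every frame that arrives.

    `lost` holds the encoder POCs of frames dropped by the channel.
    """
    prev_lsb, prev_msb = 0, 0
    received = []
    for poc in range(num_frames):
        if poc in lost:
            continue  # frame never reaches the decoder
        lsb = poc % max_poc_lsb  # only the LSBs are signaled
        msb = derive_poc_msb(lsb, prev_lsb, prev_msb, max_poc_lsb)
        received.append((poc, msb + lsb))
        prev_lsb, prev_msb = lsb, msb
    return received


if __name__ == "__main__":
    # Drop ten consecutive frames (POC 4 through 13) out of twenty.
    for sent, decoded in simulate(20, lost=range(4, 14)):
        status = "ok" if sent == decoded else "MISMATCH"
        print(sent, decoded, status)
```

Under these assumptions, a short burst loss (for example, frames 4 through 6) leaves the decoder's POC values in agreement with the encoder's, whereas the ten-frame loss shown causes the decoder to misinterpret the jump in LSB as a backward wrap, and the decoder's POC values remain out of step with the encoder's for the frames that follow.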