The present disclosure relates to predictive video coders and, in particular, to techniques for reliable initiation of video coding sessions in the face of unpredictable network bandwidth and/or channel errors.
Predictive video coding systems exploit spatial and temporal redundancy in video content by coding image content of one frame with reference to image content of other previously-coded frames. For example, when a new frame is coded as a “P frame,” the new frame relies on a single previously-coded frame (called a “reference frame”) as a basis of prediction. Alternatively, when a new frame is coded as a “B frame,” the new frame relies on a pair of previously-coded reference frames as a basis of prediction. Coding new frames predictively conserves bandwidth because the new frame can be coded with fewer bits than would be required to code the new frame without such prediction references.
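The bandwidth saving described above can be illustrated with a minimal sketch. It is not a real codec: frames are modeled as short lists of sample values, the P-frame prediction is simply a copy of one reference, the B-frame prediction is the average of two references, and the sum of absolute values stands in for bit cost. The names `predict_p`, `predict_b`, `residual`, and `cost` are hypothetical and chosen only for illustration; practical coders use motion-compensated block prediction, transforms, and entropy coding.

```python
def residual(frame, prediction):
    """Per-sample difference between a frame and its prediction."""
    return [s - p for s, p in zip(frame, prediction)]


def predict_p(ref):
    # P frame: a single previously-coded reference frame is the prediction.
    return list(ref)


def predict_b(ref_a, ref_b):
    # B frame: a pair of reference frames, here simply averaged (assumption).
    return [(a + b) // 2 for a, b in zip(ref_a, ref_b)]


def cost(values):
    # Crude stand-in for bit cost: larger magnitudes cost more to code.
    return sum(abs(v) for v in values)


ref0 = [100, 102, 104, 106]   # earlier reference frame
ref1 = [104, 106, 108, 110]   # second reference frame
new = [102, 104, 106, 108]    # frame to be coded

intra_cost = cost(new)                                   # no prediction reference
p_cost = cost(residual(new, predict_p(ref0)))            # one reference
b_cost = cost(residual(new, predict_b(ref0, ref1)))      # two references
print(intra_cost, p_cost, b_cost)
```

In this toy data the residuals left after prediction are much smaller than the raw samples, which is the sense in which a predictively coded frame needs fewer bits than the same frame coded without a reference.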
The reference frames belong to the same coding session as the frames being coded. Thus, a given frame may be coded predictively with respect to a first set of reference frames and may be transmitted from an encoder to a decoder. The same frame may itself be designated as a reference frame, in which case the frame may serve as a prediction reference for other frames that are coded later. An encoder and a decoder each locally store copies of decoded reference frames for use in predictive coding.
Reference frames are not available within a video coding system when a new coding session is first established. Conventionally, many coding protocols require a first frame to be coded as an instantaneous decoder refresh (“IDR”) frame. Because an IDR frame is coded without reference to any other frame, a frame coded as an IDR frame typically requires a greater number of bits to code than the same frame coded as a P or B frame. Moreover, an IDR frame resets decoder states, which forces the coding system to code all subsequently-processed frames predictively with respect to the IDR frame or to another frame that depends on the IDR frame.
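The effect of an IDR frame on stored reference frames can be sketched as follows. This is a simplified model under stated assumptions: the class `ReferenceBuffer` and its methods are hypothetical, every decoded frame is treated as a reference frame, and frames are tracked by identifier only.

```python
class ReferenceBuffer:
    """Toy model of the decoded reference frames held by a decoder."""

    def __init__(self):
        self.frames = {}  # frame id -> frame type

    def can_decode(self, frame_type, ref_ids):
        # An IDR frame needs no references; any other frame needs all of its
        # references to be present in the buffer.
        if frame_type == "IDR":
            return True
        return all(r in self.frames for r in ref_ids)

    def store(self, frame_id, frame_type):
        # An IDR frame resets decoder state: earlier references are discarded.
        if frame_type == "IDR":
            self.frames.clear()
        self.frames[frame_id] = frame_type


buf = ReferenceBuffer()
start_ok = buf.can_decode("P", [0])     # no references exist at session start
buf.store(0, "IDR")
buf.store(1, "P")
b_ok = buf.can_decode("B", [0, 1])      # both references are now available
buf.store(2, "IDR")                     # second IDR frame resets the buffer
after_reset = sorted(buf.frames)
stale_ok = buf.can_decode("P", [1])     # frame 1 is no longer a valid reference
print(start_ok, b_ok, after_reset, stale_ok)
```

After the second IDR frame is stored, only that frame remains in the buffer, so later frames can refer only to it or to frames coded after it.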
Modern video coders communicate over networks that have variable bandwidth available to carry coded data and that are susceptible to communication errors. The unpredictable nature of the networks can significantly delay the establishment of new coding sessions. If an encoder codes and transmits a first IDR frame that is lost during transmission, the decoder cannot decode the IDR frame. The encoder may not be notified of the failed transmission until after it has coded (and perhaps transmitted) many other frames using the IDR frame as a basis of prediction. If the encoder then codes another frame as an IDR frame, the second IDR frame will reset the decoder, and the decoder will not be able to make use of any frames that were coded prior to the second IDR frame, even those that were received successfully. The encoder and decoder may therefore exchange a significant amount of coded video data that the decoder cannot use until it receives and successfully decodes an IDR frame. Thus, there may be a perceptible delay before video data is rendered after initialization of a video coding session, which operators of coding systems may find annoying.
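The start-up problem described above can be made concrete with a small simulation. It is a sketch under simplifying assumptions, not a model of any particular protocol: the function `simulate` is hypothetical, each P frame refers to exactly one earlier frame, and a lost frame is simply absent at the decoder.

```python
def simulate(stream, lost_ids):
    """Return the ids of frames the decoder is able to decode and render.

    stream   -- list of (frame_id, frame_type, ref_id) in coding order;
                ref_id is None for IDR frames
    lost_ids -- set of frame ids dropped by the channel
    """
    refs = set()       # ids of reference frames held by the decoder
    rendered = []
    for frame_id, frame_type, ref_id in stream:
        if frame_id in lost_ids:
            continue                    # never reached the decoder
        if frame_type == "IDR":
            refs.clear()                # IDR frame resets decoder state
        elif ref_id not in refs:
            continue                    # received, but its reference is missing
        refs.add(frame_id)
        rendered.append(frame_id)
    return rendered


stream = [
    (0, "IDR", None),
    (1, "P", 0),
    (2, "P", 1),
    (3, "P", 2),
    (4, "IDR", None),   # encoder restarts after learning of the loss
    (5, "P", 4),
]

clean = simulate(stream, lost_ids=set())
lossy = simulate(stream, lost_ids={0})
wasted = [f for f, _, _ in stream if f not in lossy and f != 0]
print(clean, lossy, wasted)
```

When the first IDR frame is lost, frames 1 through 3 arrive intact but cannot be decoded, and rendering begins only at the second IDR frame. That gap corresponds to the perceptible start-up delay discussed above.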