In video coding standards, a bit stream is compliant if it can be decoded, at least conceptually, by a mathematical model of a decoder that is connected to the output of an encoder. Such a model decoder is known as the hypothetical reference decoder (HRD) in the H.263 coding standard, and the video buffering verifier (VBV) in the MPEG coding standard. In general, a real decoder device (or terminal) comprises a decoder buffer, a decoder, and a display unit. If a real decoder device is constructed according to the mathematical model of the decoder, and a compliant bit stream is transmitted to the device under specific conditions, then the decoder buffer will not overflow or underflow, and decoding will be performed correctly.
Previous reference (model) decoders assume that a bit stream will be transmitted through a channel at a given constant bit rate and will be decoded (after a given buffering delay) by a device having a given buffer size. These models are therefore inflexible and do not address the requirements of many of today's important video applications, such as broadcasting live video or streaming pre-encoded video on demand over network paths with various peak bit rates to devices with various buffer sizes.
In previous reference decoders, the video bit stream is received at a given constant bit rate (usually the average rate of the stream in bits per second) and is stored in the decoder buffer until the buffer reaches some desired level of fullness. For example, at least the data corresponding to one initial frame of video information is needed before decoding can reconstruct an output frame therefrom. This desired level is denoted as the initial decoder buffer fullness and, at a constant bit rate, is directly proportional to the transmission or start-up (buffer) delay. Once this fullness is reached, the decoder removes (in essence, instantaneously) the bits for the first video frame of the sequence, and decodes the bits to display the frame. The bits for the following frames are likewise removed, decoded, and displayed instantaneously at subsequent time intervals.
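By way of illustration only, the constant-bit-rate buffer behavior described above may be sketched as follows. This is a minimal sketch under stated assumptions, not the model defined by any particular standard: the function name `simulate_cbr_decoder_buffer`, the `BufferResult` type, and all parameter names are hypothetical, frames are assumed to be removed at a fixed frame rate, and delivery is assumed to stop once every bit of the stream has arrived.

```python
from dataclasses import dataclass, field


@dataclass
class BufferResult:
    compliant: bool
    reason: str            # "ok", "underflow", or "overflow"
    startup_delay: float   # seconds; initial_fullness / rate at a constant rate
    fullness_after_removal: list = field(default_factory=list)


def simulate_cbr_decoder_buffer(frame_bits, rate, buffer_size,
                                initial_fullness, frame_rate):
    """Hypothetical sketch of a fixed-rate reference decoder buffer.

    frame_bits       -- size of each coded frame, in bits
    rate             -- constant channel bit rate, in bits per second
    buffer_size      -- decoder buffer capacity, in bits
    initial_fullness -- fullness (bits) at which decoding starts
    frame_rate       -- frames removed per second
    """
    if initial_fullness > buffer_size:
        raise ValueError("initial fullness cannot exceed the buffer size")

    total = sum(frame_bits)
    bits_per_interval = rate / frame_rate
    # At a constant rate, the start-up delay is proportional to the
    # initial fullness.
    startup_delay = initial_fullness / rate

    delivered = min(initial_fullness, total)
    fullness = delivered
    trace = []

    for bits in frame_bits:
        # Fullness only rises between removals, so checking just before
        # each removal catches the peak.
        if fullness > buffer_size:
            return BufferResult(False, "overflow", startup_delay, trace)
        # The whole frame must be present when the decoder removes it.
        if bits > fullness:
            return BufferResult(False, "underflow", startup_delay, trace)

        fullness -= bits          # instantaneous removal and decoding
        trace.append(fullness)

        # Bits keep arriving at the constant rate until the stream ends.
        arriving = min(bits_per_interval, total - delivered)
        delivered += arriving
        fullness += arriving

    return BufferResult(True, "ok", startup_delay, trace)
```

For example, calling `simulate_cbr_decoder_buffer([300, 100, 100, 100], rate=1000, buffer_size=400, initial_fullness=300, frame_rate=10)` reports whether that stream is decodable for that single combination of rate, buffer size, and initial fullness. Changing any one of the three parameters requires rerunning the check, which reflects the fixed-parameter nature of such a model.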
Such a reference decoder operates at a fixed bit rate, buffer size, and initial delay. However, in many contemporary video applications (e.g., video streaming through the Internet or ATM networks), the peak bandwidth varies according to the network path. For example, the peak bandwidth differs based on whether the connection to the network is by modem, ISDN, DSL, cable, and so forth. Moreover, the peak bandwidth may also fluctuate over time according to network conditions, e.g., based on network congestion, the number of users connected, and other known factors. Still further, the video bit streams are delivered to a variety of devices with different buffer capabilities, including handsets, Personal Digital Assistants (PDAs), PCs, pocket-sized computing devices, television set-top boxes, DVD-like players, and the like, and are created for scenarios with different delay requirements, e.g., low-delay streaming, progressive download, and the like.
Existing reference decoders do not adjust for such variables. At the same time, encoders typically do not, and cannot, know in advance what the variable conditions will be for a given recipient. As a result, resources and/or delay time are often wasted unnecessarily, or the fixed bit rate, buffer size, and initial delay are unsuitable for the recipient in many instances.