Video playback capabilities vary widely among device classes. At one end of the spectrum are TV displays that render 120 high-definition frames per second (fps). At the other end are mobile phones that may be capable of showing only 15 low-resolution frames per second. In addition to the limitations of the rendering device, the bandwidth of the network used to distribute the content may also impose a ceiling on frame rate and image resolution. For example, the very best residential U.S. broadband connections, such as Verizon's FIOS, can easily carry compressed streams for high-quality, 60 fps high-definition video, whereas many mobile operators still have 2.5G or slower data networks that permit just 15 fps with only 176×144 pixels (QCIF) per frame.
Rather than providing a separate compressed stream for every situation, it is often preferable to include some extra information in a single stream so that each display system can decode the level of quality appropriate to its capabilities. By way of example, a mobile handset might extract 20 fps QCIF video, while a PC reading the same stream might decode some extra information and thereby produce 30 fps of 720×480 progressive (480p) video.
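The selection each display system makes can be illustrated with a minimal sketch. It assumes a hypothetical stream description in which layers are ordered from lowest to highest quality and each layer depends on all layers below it. The names `Layer`, `Device`, and `select_layer`, and the bit rates shown, are illustrative assumptions and are not taken from any standard or from the system described here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Layer:
    """One decodable quality level of a hypothetical scalable stream."""
    name: str
    width: int
    height: int
    fps: int
    cumulative_kbps: int  # bit rate of this layer plus every layer below it


@dataclass(frozen=True)
class Device:
    """Display and network limits of a hypothetical receiver."""
    max_width: int
    max_height: int
    max_fps: int
    bandwidth_kbps: int


def select_layer(layers: Sequence[Layer], device: Device) -> Optional[Layer]:
    """Return the highest layer the device can receive and display.

    Layers are assumed to be ordered from lowest to highest quality.
    Because each layer builds on the ones below it, the search stops
    at the first layer that exceeds any limit of the device.
    """
    best = None
    for layer in layers:
        fits = (
            layer.width <= device.max_width
            and layer.height <= device.max_height
            and layer.fps <= device.max_fps
            and layer.cumulative_kbps <= device.bandwidth_kbps
        )
        if not fits:
            break
        best = layer
    return best


# Resolutions and frame rates follow the example in the text;
# the bit rates are made-up placeholders for illustration only.
LAYERS = [
    Layer("QCIF", 176, 144, 20, cumulative_kbps=128),
    Layer("480p", 720, 480, 30, cumulative_kbps=2000),
]

handset = Device(max_width=176, max_height=144, max_fps=20, bandwidth_kbps=200)
pc = Device(max_width=1920, max_height=1080, max_fps=60, bandwidth_kbps=10000)

print(select_layer(LAYERS, handset).name)  # base layer only
print(select_layer(LAYERS, pc).name)       # base layer plus enhancement
```

Under these assumptions both receivers read the same stream: the handset stops at the base layer, while the PC also consumes the enhancement information. A receiver that cannot handle even the base layer gets `None`.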
Bitstreams that carry these extra layers are said to be scalable, and the general concept is referred to as scalability. It is desirable to keep the overhead of the extra layers to a minimum. Ideally, the scalable bitstream should contain no more bits than the maximum-quality content alone requires. In fact, as shall be shown later, it is possible to have the size of the scalable bitstream very close to what the lower quality requires without much reduction in the fidelity of the highest-quality content.
In current practice, the H.264 encoding standard is fortunately usable in a wide variety of situations. However, the standard does not in itself support scalability in a sufficiently broad manner. Thus, one of the challenges for scalable systems is to ensure that standard H.264 decoders can play the stream at some level of quality, even if the higher quality levels are available only to enhanced decoders. This requirement of standards compliance can also be applied to any other standard, including legacy standards such as MPEG-2 and future standards not yet specified.
In addition to the bandwidth and display constraints, processing power is also a significant factor. Mobile devices such as cellular phones are inherently limited in their computing ability by battery life, size, and cost. Also, most television systems (including set-top boxes) are designed as embedded systems with a large amount of dedicated hardware and only a minimal amount of programmable processing capability. Therefore, any scalable bitstream must either be supported in hardware, which takes several years of lead time, or require only a small amount of computation.
Satisfying the constraints of bandwidth, processing power, display capability, and standards compliance simultaneously is a challenging problem that has not been satisfactorily addressed in the prior art.