New digital video and audio “scalable” coding techniques, which are directed to general improvements in coding efficiency, have a number of new structural characteristics. Specifically, an important new characteristic is scalability. In scalable coding, an original or source signal is represented using two or more hierarchically structured bitstreams. The hierarchical structure implies that decoding of a given bitstream depends on the availability of some or all other bitstreams that are lower in the hierarchy. Each bitstream, together with the bitstreams it depends on, offers a representation of the original signal at a particular temporal, fidelity (e.g., in terms of signal-to-noise ratio (SNR)), or spatial resolution (for video).
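By way of illustration only, the hierarchical dependency between bitstreams may be sketched as follows. The layer names, the `Layer` class, and the `layers_needed` function are hypothetical and are not drawn from any standard or from the referenced applications; the sketch merely assumes that each bitstream lists the lower bitstreams it directly depends on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One bitstream in a scalable hierarchy (hypothetical model)."""
    name: str
    depends_on: tuple = ()  # names of the layers this one directly requires


def layers_needed(target, layers):
    """Return the names of all layers required to decode `target`,
    ordered so that every layer appears after the layers it depends on."""
    needed = []

    def visit(name):
        if name in needed:
            return
        for dep in layers[name].depends_on:
            visit(dep)
        needed.append(name)

    visit(target)
    return needed


# Illustrative hierarchy: a base layer plus temporal, spatial, and
# fidelity (SNR) enhancement layers. The structure is an assumption.
LAYERS = {
    "base": Layer("base"),
    "temporal_1": Layer("temporal_1", ("base",)),
    "spatial_1": Layer("spatial_1", ("base",)),
    "snr_1": Layer("snr_1", ("spatial_1",)),
}

if __name__ == "__main__":
    print(layers_needed("snr_1", LAYERS))
```

In this sketch, a recipient that wishes to decode the `snr_1` layer must also receive `base` and `spatial_1`, whereas a recipient that decodes only `base` requires no other bitstream.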
It is understood that the term ‘scalable’ does not refer to magnitude or scale in terms of numbers, but rather to the ability of the encoding technique to offer a set of different bitstreams corresponding to efficient representations of the original or source signal at different ‘scales’ of resolution or of other qualities in general. The forthcoming ITU-T H.264 Annex G specification, which is referred to as Scalable Video Coding (SVC), is an example of a video coding standard that offers video coding scalability in all of the temporal, spatial, and fidelity dimensions. SVC is an extension of the H.264 standard (also known as Advanced Video Coding (AVC)). An example of an earlier standard, which also offered all three types of scalability, is ISO MPEG-2 (also published as ITU-T H.262). ITU-T G.729.1 (also known as G.729EV) is an example of a standard offering scalable audio coding.
Scalability was introduced in video and audio coding as a solution to distribution problems in streaming and broadcasting, and with a view to allowing a given communication system to operate with varying access networks (e.g., clients connected with different bandwidths), network conditions (e.g., bandwidth fluctuation), and client devices (e.g., a personal computer that uses a large monitor vs. a handheld device with a much smaller screen).
Scalable video coding techniques, which are specifically designed for interactive video communication applications such as videoconferencing, are described in commonly assigned International patent application PCT/US06/028365. Further, commonly assigned International patent application PCT/US06/028365 describes the design of a new type of server, called a Scalable Video Communication Server (SVCS). An SVCS can advantageously use scalable coded video for high-quality and low-delay video communication, and has significantly reduced complexity compared to traditional switching or transcoding Multipoint Control Units (MCUs). Similarly, commonly assigned International patent application PCT/US06/62569 describes a Compositing Scalable Video Coding Server (CSVCS), which has the same benefits as an SVCS but produces a single coded output bit stream. The scalable video coding design and the SVCS/CSVCS architecture can be used in further advantageous ways, which are described, for example, in commonly assigned International patent applications PCT/US06/028367, PCT/US06/027368, PCT/US06/061815, PCT/US07/62357, and PCT/US07/63335. These applications describe the use of scalable coding techniques and the SVCS/CSVCS architecture for effective trunking between servers, reduced jitter buffer delay, error resilience and random access, “thinning” of scalable video bitstreams to improve coding efficiency with reduced packet loss, and rate control, respectively. Further, commonly assigned U.S. Provisional Patent Application Ser. No. 60/786,997 describes techniques for transcoding between scalable video coding formats and other formats, whereas commonly assigned U.S. Provisional Patent Application Ser. No. 60/884,148 describes further improvements in error resilience in video communication systems that use scalable video coding.
Consideration is now being given to improved video and audio communication systems that use scalable video or audio coding. In particular, with a view to improving such systems, attention is directed toward managing the scalability information communicated from a source of a video or audio bit stream to a recipient, either directly or through one or more servers. The source may be a transmitting endpoint that encodes and transmits live video over a communication network, a streaming server that transmits pre-coded video, or a software module that provides access to a file stored in a mass storage or other access device. Similarly, the recipient may be a receiving endpoint that obtains the coded video or audio bit stream over a communication network, or directly from a mass storage or other access device.