New ‘scalable’ digital video and audio coding techniques, which generally aim to improve coding efficiency, have a number of new structural characteristics (e.g., scalability). In scalable coding, an original or source signal is represented using two or more hierarchically structured bitstreams. The hierarchical structure implies that decoding of a given bitstream depends on the availability of some or all of the other bitstreams that are lower in the hierarchy. Each bitstream, together with the bitstreams it depends on, offers a representation of the original signal at a particular temporal resolution, fidelity (e.g., in terms of signal-to-noise ratio (SNR)), or spatial resolution (for video).
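The dependency relationship described above can be illustrated with a minimal sketch. It is not taken from any coding standard or from the referenced applications: the layer names, the `Layer` type, and the `layers_needed` helper are hypothetical and serve only to show that decoding a given bitstream requires every bitstream below it in the hierarchy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One bitstream in a scalable hierarchy (hypothetical model)."""
    name: str
    depends_on: tuple = ()  # names of the layers directly required to decode this one


# Assumed example hierarchy: a base layer plus temporal and spatial enhancements.
LAYERS = {
    "base": Layer("base"),
    "temporal_enh": Layer("temporal_enh", ("base",)),
    "spatial_enh": Layer("spatial_enh", ("base",)),
    "spatial_temporal_enh": Layer(
        "spatial_temporal_enh", ("temporal_enh", "spatial_enh")
    ),
}


def layers_needed(target, layers=LAYERS):
    """Return every layer name required to decode `target`, dependencies first."""
    needed = []

    def visit(name):
        if name in needed:
            return
        for dep in layers[name].depends_on:
            visit(dep)
        needed.append(name)

    visit(target)
    return needed


if __name__ == "__main__":
    for name in LAYERS:
        print(name, "requires", layers_needed(name))
```

In this sketch a receiver that wants only the base representation needs a single bitstream, while one that wants the highest representation needs all four.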
It is understood that the term ‘scalable’ does not refer to a numerical magnitude or scale, but refers to the ability of the encoding technique to offer a set of different bitstreams corresponding to efficient representations of the original or source signal at different ‘scales’ of resolution or other signal qualities. The ITU-T H.264 Annex G specification, which is referred to as Scalable Video Coding (SVC), is an example of a video coding standard that offers video coding scalability in all of the temporal, spatial, and fidelity dimensions. SVC is an extension of the H.264 standard (also known as Advanced Video Coding or AVC). An example of an earlier standard, which also offered all three types of scalability, is ISO MPEG-2 (also published as ITU-T H.262). ITU-T G.729.1 (also known as G.729EV) is an example of a standard offering scalable audio coding.
The concept of scalability was introduced in video and audio coding as a solution to distribution problems in streaming and broadcasting, and to allow a given communication system to operate with varying access networks (e.g., clients connected with different bandwidths), under varying network conditions (e.g., bandwidth fluctuations), and with various client devices (e.g., a personal computer that uses a large monitor vs. a handheld device with a much smaller screen).
Scalable video coding techniques, which are specifically designed for interactive video communication applications such as videoconferencing, are described in commonly assigned International patent application PCT/US06/028365. Further, commonly assigned International patent application PCT/US06/028365 describes the design of a new type of server called the Scalable Video Communication Server (SVCS). An SVCS can advantageously use scalable coded video for high-quality and low-delay video communication, and has significantly lower complexity than traditional switching or transcoding Multipoint Control Units (MCUs). Similarly, commonly assigned International patent application PCT/US06/62569 describes a Compositing Scalable Video Coding Server (CSVCS), which has the same benefits as an SVCS but produces a single coded output bitstream. Furthermore, International patent application PCT/US07/80089 describes a Multicast Scalable Video Coding Server (MSVCS), which has the same benefits as an SVCS but utilizes available multicast communication channels. The scalable video coding design and the SVCS/CSVCS architecture can be used in further advantageous ways, which are described, for example, in commonly assigned International patent applications PCT/US06/028367, PCT/US06/027368, PCT/US06/061815, PCT/US07/62357, and PCT/US07/63335. These applications describe the use of scalable coding techniques and the SVCS/CSVCS architecture for effective trunking between servers, reduced jitter buffer delay, error resilience and random access, “thinning” of scalable video bitstreams to improve coding efficiency with reduced packet loss, and rate control, respectively. Further, commonly assigned International patent application PCT/US07/65554 describes techniques for transcoding between scalable video coding formats and other formats.
Consideration is now being given to further improving video communication systems that use scalable video coding. In such systems, a source may be a transmitting endpoint that encodes and transmits live video over a communication network, a streaming server that transmits pre-coded video, or a software module that provides access to a file stored in a mass storage or other access device. Similarly, a receiver may be a receiving endpoint that obtains the coded video or audio bitstream over a communication network, or directly from a mass storage or other access device. An intermediate processing entity in the system may be an SVCS or a CSVCS. Attention is being directed toward improving the efficiency with which receivers and intermediate processing entities switch between temporal layers.