Video transmission over a network such as the Internet has become commonplace. Video conferencing, in particular, is increasingly replacing face-to-face conferencing as a way to avoid the cost and inconvenience associated with travel. While video conferencing provides a closer approximation of in-person meetings than, for example, telephone conferencing, it requires relatively high bandwidth and computing power for optimum video quality.
Unfortunately, not all users of video conferencing and other forms of network-based video transmission have high-speed network connections, and connection speeds may vary among participants in the same conference. Similarly, image processing power typically differs from one user's computer to the next. These and other factors leave video conference participants with varying video processing capabilities. Simply reducing video quality to the “lowest common denominator” for all users is not an optimal solution because it needlessly reduces the image quality at otherwise capable endpoints.
One solution that has been used to transmit bitstreams of varying quality is transcoding, which involves decoding and re-encoding video to achieve a target spatial resolution and/or frame rate. However, transcoding has high computational requirements and, as a result, introduces latency.
Scalable video coding (SVC) has been used to overcome these drawbacks of transcoding. In multi-point video calls in which participating endpoints require different encoded bit rates, spatial resolutions, and/or frame rates, a single SVC bitstream containing multiple spatial and/or temporal layers has been trans-rated into bitstreams with different spatial resolutions and/or frame rates. SVC has also been used to improve error resilience by sending multiple copies of images in a single bitstream.
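The trans-rating described above can be illustrated with a minimal sketch. It assumes a simplified model in which each packet of the multi-layer bitstream carries a spatial layer index and a temporal layer index; the names `Packet`, `build_stream`, and `trans_rate`, the two-layer structure, and the packet sizes are hypothetical and chosen only for illustration. They are not taken from the H.264 SVC syntax or from any particular implementation. The point of the sketch is that a lower-rate bitstream is obtained by discarding packets, with no decoding or re-encoding.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    """One coded unit of a hypothetical multi-layer bitstream."""
    frame: int
    spatial_id: int   # 0 = base resolution, 1 = enhanced resolution
    temporal_id: int  # 0 = base frame rate, 1 = full frame rate
    size_bytes: int


def build_stream(num_frames: int = 4) -> List[Packet]:
    """Build a toy stream with two spatial and two temporal layers.

    Even frames belong to temporal layer 0 and odd frames to temporal
    layer 1. Packet sizes are arbitrary illustrative values.
    """
    stream = []
    for frame in range(num_frames):
        temporal_id = frame % 2
        for spatial_id, size in ((0, 500), (1, 1500)):
            stream.append(Packet(frame, spatial_id, temporal_id, size))
    return stream


def trans_rate(stream: List[Packet], max_spatial: int,
               max_temporal: int) -> List[Packet]:
    """Keep only packets at or below the requested layers.

    Packets are forwarded or dropped unchanged; nothing is decoded.
    """
    return [p for p in stream
            if p.spatial_id <= max_spatial and p.temporal_id <= max_temporal]


def total_bytes(stream: List[Packet]) -> int:
    return sum(p.size_bytes for p in stream)


if __name__ == "__main__":
    source = build_stream()
    # Hypothetical endpoints and the highest layers each can handle.
    endpoints = {"low_end": (0, 0), "mid_range": (0, 1), "high_end": (1, 1)}
    for name, (max_s, max_t) in endpoints.items():
        sub = trans_rate(source, max_s, max_t)
        print(name, len(sub), "packets,", total_bytes(sub), "bytes")
```

In this toy model, each endpoint receives a subset of the same source packets, so the forwarding device only has to inspect layer indices. A real system would read the layer identifiers from the bitstream itself and would need to respect inter-layer dependencies, which this sketch does not model.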
In addition to avoiding the high computational requirements and latency associated with transcoding, SVC does not suffer the video quality loss that transcoding does. While the scalable extension of H.264 (H.264 SVC) has been shown to solve the above-mentioned problems when used with an appropriately designed system, a video source encoded using H.264 SVC is generally not interoperable with the H.264 Baseline profile. Consequently, a device encoding video using H.264 SVC cannot directly interoperate with non-SVC-capable video communication devices that conform to the Baseline profile. Unfortunately, the Baseline profile is the most widely deployed profile in video conferencing: several million non-SVC Baseline devices have been deployed since 2003, when version 1 of the H.264 standard describing the Baseline profile was published. Similar interoperability problems exist between H.264 SVC video and H.264 High profile devices.
The most commonly deployed SVC profile in video conferencing is the H.264 SVC Scalable Baseline profile. A limited form of interoperability is possible between this profile and the Baseline profile because the base layer of an H.264 SVC Scalable Baseline encoded bitstream is conformant with the Baseline profile. To achieve a full range of interoperability, however, a transcoding MCU or gateway is used. This is also the case for interoperability between the H.264 SVC Scalable High profile and the High profile. Unfortunately, transcoding MCUs and gateways decode and re-encode video and therefore suffer from the above-mentioned disadvantages associated with transcoding. In addition, transcoding MCUs and gateways are more complex and more costly than simpler non-transcoding routing devices. It would therefore be beneficial to produce spatially scalable H.264 Baseline profile conformant bitstreams for interoperability with H.264 Baseline profile devices.