Most modern video transmission systems using the Internet and mobile networks use IP (Internet Protocol) for real-time services, such as conversational and streaming services. IP networks typically encompass a wide range of connection qualities and receiving devices. The receiving devices have a variety of capabilities, ranging from cell phones with small screens and restricted processing power to high-end PCs (personal computers) with high definition displays.
To accommodate the varying connection qualities and receiving devices, encoders are typically introduced into the network. An encoder receives the raw video and encodes it into one or more non-scalable formats where each format represents a different quality of video. The higher the quality, the more bandwidth the video requires. Conversely, the lower the quality, the less bandwidth the video requires. The encoded videos are then saved on a media server or on a farm of media servers.
When a media client requests a particular video with a particular quality, the appropriate encoded video stream is retrieved from the media server and transmitted to the media client over the IP network. In this manner each media client can receive a different quality of video stream suitable for its needs. For example, a cell phone media client with a small screen is not capable of displaying a high resolution video, so it is not worthwhile transmitting a high quality video to the cell phone media client. On the other hand, a high-end PC media client with a high definition display would likely want to receive the highest resolution video possible.
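The selection described above can be sketched as follows. This is a minimal illustration, not a description of any particular media server: the `Rendition` type, the three stored formats, their resolutions and bitrates, and the `select_rendition` function are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    """One pre-encoded, non-scalable copy of a video (hypothetical type)."""
    name: str
    width: int
    height: int
    bitrate_kbps: int


# Assumed set of formats stored on the media server; values are illustrative.
RENDITIONS = [
    Rendition("low", 320, 240, 300),
    Rendition("medium", 640, 480, 1000),
    Rendition("high", 1920, 1080, 6000),
]


def select_rendition(screen_width, screen_height, renditions=RENDITIONS):
    """Return the highest-bitrate rendition that fits the client's screen.

    If no rendition fits, fall back to the lowest-bitrate one.
    """
    fitting = [
        r for r in renditions
        if r.width <= screen_width and r.height <= screen_height
    ]
    if fitting:
        return max(fitting, key=lambda r: r.bitrate_kbps)
    return min(renditions, key=lambda r: r.bitrate_kbps)
```

In this sketch the choice is made once, when the request arrives, and a separate full copy of the video must be stored for every entry in `RENDITIONS`.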
However, encoder-based systems are not designed to dynamically adapt to system changes. Once a non-scalable format is selected for a particular media client, that non-scalable format is used for the entire transmission of the video, regardless of any changes in the system. For example, if a user initially requested a high quality video stream, and the network bandwidth between the media server and the media client is subsequently reduced, the media server will not reduce the video quality to accommodate the change. The media server will continue to send the high quality video, which results in packet loss and ultimately degraded quality. In addition, encoder-based systems do not provide the end user with dynamic control of the streaming. For example, the user has no ability to dynamically change the resolution or screen size of the video.
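As a rough illustration of the packet loss that follows a bandwidth drop, consider a deliberately simplified model in which any traffic above the available bandwidth is dropped. The function name and the model itself are assumptions made for this example; real networks also involve buffering, retransmission, and congestion control.

```python
def excess_loss_fraction(stream_kbps, available_kbps):
    """Fraction of a fixed-rate stream that cannot fit in the available bandwidth.

    Simplified assumption: everything above the available bandwidth is lost.
    """
    if stream_kbps <= 0:
        raise ValueError("stream_kbps must be positive")
    if available_kbps >= stream_kbps:
        return 0.0
    return 1.0 - available_kbps / stream_kbps
```

Under this model, a stream fixed at 6000 kbps over a link that has dropped to 4000 kbps loses about one third of its traffic, since 1 - 4000/6000 is approximately 0.33.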
Furthermore, encoder-based systems require the media server or farm of media servers to have sufficient memory to store each video stream in a variety of non-scalable formats.
One way to solve the problems with an encoder-based system is to introduce transcoders in the network between the media server and the media client. The transcoders receive the encoded video stream from the media server and convert the encoded video stream into another non-scalable format based on a variety of factors including the condition of the network between the transcoder and the media client. The conversion typically involves decoding of the original video stream from the media server, and recoding of the decoded video stream using other parameters.
However, not only are transcoders generally quite expensive on a per port basis, but the conversion usually causes a time delay and degradation of the video quality.
To address at least some of the problems with the previous video transmission systems, a new video coding standard, referred to as Scalable Video Coding (SVC), was developed. SVC is an extension of the H.264/MPEG-4 AVC video compression standard. When a raw video stream is SVC encoded, it is encoded into one or more streams, or layers, of differing quality. The layer with the lowest quality, referred to as the base layer, contains the most important part of the video stream. One or more enhancement layers may then be encoded to further refine the quality of the base layer. The enhancement layers are used for improving the spatial resolution (picture size), temporal resolution (frame rate), and the SNR (signal to noise ratio) quality of the base layer.
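One way to picture the layered structure is as a base layer that is always sent, plus enhancement layers that are added while bandwidth allows. The sketch below rests on stated assumptions: the `Layer` type, the layer bitrates, and the strictly linear dependency order (each enhancement layer requires all layers before it) are hypothetical simplifications, and actual SVC streams can have a richer dependency structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One SVC layer (hypothetical type for illustration)."""
    layer_id: int
    kind: str          # "base", "temporal", "spatial", or "snr"
    bitrate_kbps: int  # additional bitrate contributed by this layer


# Assumed layer stack; bitrates are illustrative only.
LAYERS = [
    Layer(0, "base", 200),
    Layer(1, "temporal", 150),
    Layer(2, "spatial", 600),
    Layer(3, "snr", 1000),
]


def select_layers(available_kbps, layers=LAYERS):
    """Return the base layer plus as many enhancement layers as fit.

    The base layer is always included. Enhancement layers are added in
    order, stopping at the first one that would exceed the available
    bandwidth, because later layers are assumed to depend on earlier ones.
    """
    selected = [layers[0]]
    used_kbps = layers[0].bitrate_kbps
    for layer in layers[1:]:
        if used_kbps + layer.bitrate_kbps > available_kbps:
            break
        selected.append(layer)
        used_kbps += layer.bitrate_kbps
    return selected
```

Because the selection is a cheap function of the current bandwidth estimate, it could be re-run whenever that estimate changes, without re-encoding the video or storing multiple complete copies.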