Real-time streaming of multimedia content over data networks, including the Internet, has become an increasingly common application in recent years. A wide range of interactive and non-interactive multimedia applications, such as news-on-demand, live network television viewing, and video conferencing, rely on end-to-end streaming video techniques. Unlike a “downloaded” video file, which may be retrieved first in “non-real” time and viewed or played back later in “real” time, streaming video requires a video transmitter that encodes and transmits a video signal over a data network to a video receiver, which must decode and display the video signal in real time.
Scalable video coding is a desirable feature for many multimedia applications and services that are used in systems employing decoders with a wide range of processing power. Scalability allows processors with low computational power to decode only a subset of the scalable video stream. Another use of scalable video is in environments with a variable transmission bandwidth. In those environments, receivers with low access bandwidth receive, and consequently decode, only a subset of the scalable video stream, where the size of that subset is proportional to the available bandwidth.
Several video scalability approaches have been adopted by leading video compression standards, such as MPEG-2 and MPEG-4. Temporal, spatial and quality (e.g., signal-to-noise ratio (SNR)) scalability types have been defined in these standards. All of these approaches consist of a base layer (BL) and an enhancement layer (EL). The base layer part of the scalable video stream represents, in general, the minimum amount of data needed for decoding that stream. The enhancement layer part of the stream represents additional information, and therefore enhances the video signal representation when decoded by the receiver.
For example, in a variable bandwidth system, such as the Internet, the base layer transmission rate may be established at the minimum guaranteed transmission rate of the variable bandwidth system. Hence, if a subscriber has a minimum guaranteed bandwidth of 256 kbps, the base layer rate may be established at 256 kbps also. If the actual available bandwidth is 384 kbps, the extra 128 kbps of bandwidth may be used by the enhancement layer to improve on the basic signal transmitted at the base layer rate.
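The rate split in the example above can be sketched as follows; the function name and the 256 kbps default are illustrative values chosen to match the example, not part of any standard:

```python
# Illustrative sketch of the base/enhancement rate split described
# above: the base layer is pinned at the guaranteed minimum rate and
# any surplus bandwidth goes to the enhancement layer.

def split_rate(available_kbps, guaranteed_min_kbps=256):
    """Return (base_layer_kbps, enhancement_layer_kbps)."""
    base = guaranteed_min_kbps
    enhancement = max(0, available_kbps - guaranteed_min_kbps)
    return base, enhancement

# With 384 kbps available, the extra 128 kbps feeds the enhancement layer.
print(split_rate(384))  # (256, 128)
```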
For each type of video scalability, a certain scalability structure is identified. The scalability structure defines the relationship among the pictures of the base layer and the pictures of the enhancement layer. One class of scalability is fine-granular scalability. Images coded with this type of scalability can be decoded progressively. In other words, the decoder may decode and display the image with only a subset of the data used for coding that image. As more data is received, the quality of the decoded image is progressively enhanced until the complete information is received, decoded, and displayed.
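A toy sketch of this progressive refinement idea, assuming bit-plane coding with the most significant plane transmitted first (a common fine-granular arrangement); the flat list-of-planes representation is simplified and hypothetical, not the format of any actual codec:

```python
# Toy model of fine-granular refinement: each transform coefficient is
# sent as bit-planes, most significant first, so a decoder can
# reconstruct an approximation from whatever prefix of the data arrived.

def reconstruct(bitplanes, planes_received):
    """Rebuild coefficient values from the first `planes_received` planes."""
    total_planes = len(bitplanes)
    n = len(bitplanes[0]) if bitplanes else 0
    coeffs = [0] * n
    for p in range(min(planes_received, total_planes)):
        weight = 1 << (total_planes - 1 - p)  # MSB plane carries the largest weight
        for i, bit in enumerate(bitplanes[p]):
            coeffs[i] += bit * weight
    return coeffs

# Coefficients 5 and 3 encoded as three bit-planes (MSB first):
planes = [[1, 0], [0, 1], [1, 1]]
print(reconstruct(planes, 1))  # coarse approximation: [4, 0]
print(reconstruct(planes, 3))  # full reconstruction:  [5, 3]
```

Each additional plane received halves the quantization step of the approximation, which is what lets the transmitter truncate the enhancement stream at any point to match the available bandwidth.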
The proposed MPEG-4 standard is directed to video streaming applications based on very low bit rate coding, such as video-phone, mobile multimedia/audio-visual communications, multimedia e-mail, remote sensing, interactive games, and the like. Within the MPEG-4 standard, fine-granular scalability (FGS) has been recognized as an essential technique for networked video distribution. FGS primarily targets applications where video is streamed over heterogeneous networks in real time. It provides bandwidth adaptivity by encoding content once for a range of bit rates and by enabling the video transmission server to change the transmission rate dynamically without in-depth knowledge or parsing of the video bit stream.
During the decoding of scalable video, such as MPEG-2 or MPEG-4 video, the activity of the central processing unit (CPU) that decodes the video bit stream can vary widely over time. The CPU load varies because the decompression process depends on source type (video or film), video content (level of motion, level of detail), and frame type (I, B, P). A film source generally requires a greater amount of processing power than an original video source due to the larger size and greater aspect ratio of film. An I-frame (or intra-coded frame), which is coded independently of other frames and contains the complete image data for a single frame of video, generally requires the greatest amount of processing power. A P-frame (or predicted frame), which encodes only the differences between the current frame and a previous reference frame, generally requires the least amount of processing power. A B-frame (or bidirectional frame), which is predicted from both a previous and a subsequent reference frame, generally requires greater processing power than a P-frame but less than an I-frame.
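The frame-type dependence alone produces a strongly fluctuating load over a typical group of pictures. The relative cost figures below are illustrative placeholders, not measurements; they merely follow the ordering stated above (I heaviest, P lightest, B between):

```python
# Illustrative (not measured) relative decode costs per frame type,
# ordered as described in the text: I > B > P.
RELATIVE_COST = {"I": 3.0, "B": 2.0, "P": 1.0}

def gop_load(pattern):
    """Per-frame load for a group-of-pictures pattern such as 'IBBPBBP'."""
    return [RELATIVE_COST[f] for f in pattern]

load = gop_load("IBBPBBP")
print(load)  # [3.0, 2.0, 2.0, 1.0, 2.0, 2.0, 1.0]
print(max(load) / (sum(load) / len(load)))  # peak-to-average ratio > 1
```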
If a single CPU is executing a software scalable video decoder, such as a software MPEG decoder, in real time concurrently with other programs, such as a digital signal processing application, then the wide variations in CPU load caused by the video decoding operation can detrimentally impact the performance of the CPU. The set of concurrent programs executed by the CPU must be time-multiplexed while meeting the real-time requirements of the scalable video decoding operation. The scheduling of the individual programs (or tasks) is generally under the control of a real-time operating system. However, task scheduling becomes a very complex operation when the CPU load (i.e., CPU cycles, memory bandwidth, and the like) of the tasks varies significantly with time. Task scheduling typically reserves for each task the CPU resources necessary to meet that task's peak requirements. This leads to inefficiency when one of the tasks is a video decoding operation, because of the large difference between the peak requirements and the average requirements of the video decoding operation. The reserved capacity corresponding to that difference is simply wasted during periods of relatively low CPU activity.
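The inefficiency of peak-based reservation can be sketched as follows; the per-frame costs are illustrative units consistent with the frame-type ordering above, not measurements:

```python
# Sketch of why peak-based reservation is wasteful for a decoder whose
# load swings with frame type: the scheduler reserves the peak cost on
# every frame, so the peak-minus-actual headroom sits idle on cheap frames.

def wasted_fraction(frame_costs):
    """Fraction of a peak-based reservation left unused on average."""
    peak = max(frame_costs)
    average = sum(frame_costs) / len(frame_costs)
    return (peak - average) / peak

# Costs for an IBBPBBP pattern with illustrative units I=3, B=2, P=1:
costs = [3.0, 2.0, 2.0, 1.0, 2.0, 2.0, 1.0]
print(wasted_fraction(costs))  # roughly 0.38: over a third of the reservation idles
```

This wasted headroom is exactly what motivates decoders that smooth their own load, for example by varying how much of the enhancement layer they decode per frame.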
Therefore, there is a need in the art for improved decoders and decoding techniques for use in streaming video systems. In particular, there is a need for systems and methods for balancing the load on a central processing unit during the decoding of a scalable video signal. More particularly, there is a need for systems and methods capable of dynamically adjusting or varying the level of decoding of a scalable video signal, such as a scalable MPEG signal, in order to reduce the variations in the CPU load.