The past few years have witnessed a great increase in the popularity of digital and online videos and their applications. With the emergence of fast communication technologies and multimedia applications, digital video codecs are used in many areas and systems, such as in DVDs (Digital Video Discs) employing the MPEG-2 (Moving Picture Experts Group-2) format, in VCDs (Video Compact Discs) employing the MPEG-1 (Moving Picture Experts Group-1) format, in emerging satellite and terrestrial broadcast systems, and on the Internet.
More specifically, this popularity of video applications has allowed for interesting developments in video codecs, which compress and decompress video data. In video data compression, a balance is struck between the video quality and the compression rate, that is, the quantity of data that must be transmitted or, in other words, the bitrate needed to represent a video.
Other factors are also considered, including the complexity of the encoding and decoding algorithms, robustness to data losses and errors, the state of the art of compression algorithm design, and end-to-end delay, for example in a videoconference application.
A plurality of video coding standards exist, each of which is specially designed for a particular type of application. For example, the H.263 standard, published by the ITU (International Telecommunication Union), is a video coding and compression standard for low bitrates, such as in the range of 40-128 kbps (kilobits per second). More specifically, this standard supports video coding in video-conferencing and video-telephony applications.
The H.263 standard specifies the format and content of the encoded stream of data. It therefore sets the requirements that the encoder and decoder must meet, without specifically providing a design or structure for the encoder and decoder themselves. Similar principles apply to other video standards such as MPEG-4.
In video compression, pictures are typically represented by two kinds of pictures, commonly referred to as frames, i.e. the Intra frames and the Inter frames. Furthermore, the Inter frames are separated into two categories, i.e. the P-frames (Predictive frames) and the B-frames (Bi-predictive or Bi-directional frames). The Intra frames represent a whole picture and are therefore bandwidth consuming, since the content of the whole picture must be encoded. In order to compress the video and therefore save bandwidth, only the differences between whole pictures (or Intra frames) are encoded and then transmitted. Those differences are represented by the P-frames and the B-frames. For example, the background usually does not change between two consecutive pictures and therefore does not need to be encoded again. The P-frames are predicted from a previous picture, whereas the B-frames are bi-directional and thus perform a bi-directional prediction, i.e. a prediction using both the previous and the next pictures.
Furthermore, when compressing videos, a picture is divided into macroblocks for processing purposes. Indeed, processing is applied macroblock by macroblock. Each macroblock generally represents a block of 16 by 16 pixels.
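As an illustration only, the division of a picture into macroblocks can be sketched as follows. The function name `split_into_macroblocks` is hypothetical, and the sketch assumes that the picture dimensions are exact multiples of the macroblock size, as is the case for the QCIF picture format (176 by 144 pixels) used in the example.

```python
def split_into_macroblocks(width, height, mb_size=16):
    """Return the (x, y) top-left corner of every macroblock, in raster order.

    Assumption for this sketch: width and height are exact multiples of
    mb_size, so no padding of the picture is needed.
    """
    if width % mb_size or height % mb_size:
        raise ValueError("picture dimensions must be multiples of the macroblock size")
    return [(x, y)
            for y in range(0, height, mb_size)
            for x in range(0, width, mb_size)]


# QCIF picture: 176 / 16 = 11 macroblocks per row, 144 / 16 = 9 rows.
corners = split_into_macroblocks(176, 144)
print(len(corners))  # 99
```

Processing then visits the macroblocks one after the other in this order.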
A video encoder generally includes a motion estimation module, a motion compensation module, a DCT (Discrete Cosine Transform) module, and a quantizing module.
The motion estimation module allows for predicting which areas of a previous frame have been moved into the current frame so that those areas do not need to be re-encoded.
The motion compensation module allows for compensating for the movement of the areas from the previous frame into the current frame.
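By way of a non-limiting sketch, motion estimation and motion compensation can be illustrated with an exhaustive block-matching search using the sum of absolute differences (SAD) as the matching cost. This is one common technique and is not mandated by the standard; the function names, the small block size and the search range are assumptions chosen for the example.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between an n-by-n block of the current
    frame at (cx, cy) and an n-by-n block of the reference frame at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def estimate_motion(cur, ref, cx, cy, n, search):
    """Full search: return the motion vector (dx, dy) with the lowest SAD,
    together with that SAD. Candidates outside the reference frame are ignored."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, cx, cy, cx, cy, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= width - n and 0 <= ry <= height - n:
                cost = sad(cur, ref, cx, cy, rx, ry, n)
                if cost < best_cost:
                    best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


def compensate(ref, cx, cy, mv, n):
    """Motion compensation: fetch the predicted block from the reference frame."""
    dx, dy = mv
    return [[ref[cy + dy + j][cx + dx + i] for i in range(n)] for j in range(n)]


# Previous frame: a bright 2x2 square at columns 2-3, rows 2-3 of an 8x8 picture.
ref = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (2, 3):
        ref[y][x] = 255

# Current frame: the same square moved one pixel to the right.
cur = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (3, 4):
        cur[y][x] = 255

mv, cost = estimate_motion(cur, ref, cx=3, cy=2, n=2, search=2)
prediction = compensate(ref, 3, 2, mv, 2)
residual = [[cur[2 + j][3 + i] - prediction[j][i] for i in range(2)] for j in range(2)]
print(mv, cost, residual)  # (-1, 0) 0 [[0, 0], [0, 0]]
```

In this example the residual is zero, so only the motion vector would need to be transmitted for that block.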
DCTs are generally used for transforming a block of pixels into “spatial frequency coefficients”. They operate on a two-dimensional block of pixels, such as a macroblock. Since DCTs are efficient at compacting the energy (or information) of pictures, a few DCT coefficients are generally sufficient for recreating the original picture.
Also, a quantizing module is provided for quantizing the DCT coefficients. For example, the quantizing module sets the near-zero DCT coefficients to zero and quantizes the remaining non-zero DCT coefficients.
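The energy compaction property can be illustrated by the following minimal sketch, which computes an orthonormal two-dimensional DCT (type II) directly from its definition. The 8 by 8 block size and the smooth test block are assumptions made for the example, and the direct computation is chosen for clarity rather than speed.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block, computed from the definition."""
    n = len(block)

    def alpha(k):
        # Normalisation factors that make the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


# A smooth block (gentle brightness ramp), typical of picture backgrounds.
block = [[100 + 2 * x + 3 * y for y in range(8)] for x in range(8)]
coeffs = dct_2d(block)

pixel_energy = sum(p * p for row in block for p in row)
coeff_energy = sum(c * c for row in coeffs for c in row)
dc_share = coeffs[0][0] ** 2 / coeff_energy
print(round(dc_share, 3))  # share of the energy held by the single DC coefficient
```

For this smooth block, more than 99% of the energy is carried by the first (DC) coefficient alone, while the total energy is the same in the pixel domain and in the coefficient domain.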
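The quantizing step can be sketched as follows, assuming a simple uniform quantizer whose step size is twice the quantization parameter and which truncates toward zero. This is a simplification chosen for illustration and is not the exact reconstruction rule of any particular standard; the function names are hypothetical.

```python
def quantize(coeffs, qp):
    """Map each DCT coefficient to an integer level.

    Assumed rule for this sketch: step = 2 * qp, truncation toward zero,
    so every coefficient smaller in magnitude than the step becomes 0.
    """
    step = 2 * qp
    return [[int(c / step) for c in row] for row in coeffs]


def dequantize(levels, qp):
    """Approximate reconstruction of the coefficients from their levels."""
    step = 2 * qp
    return [[level * step for level in row] for row in levels]


coeffs = [[800.0, 12.0],
          [-7.5, 3.0]]
levels = quantize(coeffs, qp=5)
print(levels)                    # [[80, 1], [0, 0]]
print(dequantize(levels, qp=5))  # [[800, 10], [0, 0]]
```

A larger quantization parameter zeroes more coefficients and therefore lowers the number of bits needed, at the cost of a larger reconstruction error.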
One of the limitations in video coding comes from the capacity of a channel. Indeed, communication channels are limited by the number of bits that they can transmit per second. In many channels, the bitrate is constant, such as in ISDN (Integrated Services Digital Network), POTS (Plain Old Telephone Service), wireless channels, etc.
However, depending on the efficiency of the algorithms used to compress the videos and on the motion complexity of those videos, the bit budget and the bitrate needed for encoding and transmitting the encoded videos may vary or increase. Therefore, a rate control is used to adjust the bitrate required for encoding videos of varying complexity to the bitrate of the channel used to transmit those encoded videos.
The current rate control algorithm used in the H.263 standard is called TMN8 (Test Model Near-Term version 8). Generally stated, this rate control algorithm only ensures that an average bitrate is met.
The paper entitled “Rate Control in DCT Video Coding for Low-Delay Communications”, by Jordi Ribas-Corbera, 1999, hereinafter referred to as Reference 1, discloses an algorithm used by the TMN8 rate control to ensure that the target average bitrate, related to a target frame size, is met by each frame. More specifically, the TMN8 rate control algorithm computes image statistics to determine a suitable QP (Quantization Parameter) value for each macroblock and updates these values within each Inter frame so as to meet the target frame size. Unfortunately, this control is very approximate, and the resulting frame size can often be significantly over or under the target frame size. For Intra frames, a fixed QP is used for the whole video sequence, regardless of the characteristics of the video sequence. Having no control over the size of the Intra frames is generally a factor leading to the desired bitrate being exceeded.
Furthermore, the TMN8 rate control cannot control both an average target bitrate and a maximum bitrate. Indeed, the TMN8 rate control algorithm used in the H.263 video coding standard only uses an average bitrate parameter. However, in many video applications, a maximum bitrate should also be considered in addition to the average bitrate.
TMN8 cannot guarantee that a given target bitrate will not be exceeded, since the encoder has no control over the Intra frame sizes and insufficient control over the Inter frame sizes. When the given target bitrate is exceeded, the encoder skips a certain number of frames so as to compensate for the overflow. However, by so doing, the quality of the communication and of the videos is degraded.
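The frame-skipping behaviour described above can be illustrated with a simplified virtual-buffer model. This sketch is an assumption made for illustration and does not reproduce the exact TMN8 procedure: the buffer is filled by each encoded frame, drained by the channel at a constant number of bits per frame interval, and frames are skipped while the buffer occupancy exceeds a threshold, here assumed equal to one frame interval's worth of channel bits. The function and parameter names are hypothetical.

```python
def simulate_frame_skipping(frame_sizes, bitrate, frame_rate, threshold=None):
    """Return, for each encoded frame, how many following frames are skipped.

    frame_sizes: size in bits of each encoded frame.
    bitrate:     constant channel bitrate in bits per second.
    frame_rate:  source frame rate in frames per second.
    threshold:   buffer occupancy (bits) above which frames are skipped;
                 assumed to default to the channel bits per frame interval.
    """
    bits_per_interval = bitrate / frame_rate
    if threshold is None:
        threshold = bits_per_interval
    occupancy = 0.0
    skipped = []
    for size in frame_sizes:
        # The encoded frame enters the buffer while the channel drains it.
        occupancy = max(occupancy + size - bits_per_interval, 0.0)
        count = 0
        # Skip source frames until the backlog has been transmitted.
        while occupancy > threshold:
            occupancy = max(occupancy - bits_per_interval, 0.0)
            count += 1
        skipped.append(count)
    return skipped


# 64 kbps channel at 10 frames per second: 6400 bits per frame interval.
# The third frame is five times larger than the budget, as a large
# Intra frame might be.
print(simulate_frame_skipping([6400, 6400, 32000, 6400], 64000, 10))
# [0, 0, 3, 0]
```

In this example a single oversized frame causes three consecutive frames to be dropped, which shows how a lack of control over frame sizes translates into a visible loss of temporal quality.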
Therefore, there is a need for overcoming the above-discussed problems related to the limitations of the current rate control in video coding standards, such as the H.263 standard. Accordingly, a method and system for improving the rate control in video coding standards are sought.