The invention relates to frame-layer rate control for video encoders, and more specifically, to video encoding methods and systems with frame-layer rate control.
A wide range of new applications in visual communications, as well as rapidly evolving telecommunication and computer technology, have led to the development of various video coding standards. Some video coding standards, for example, MPEG-1, MPEG-2, and MPEG-4, are designed for non-conversational applications such as storage, streaming, and broadcasting. Other video coding standards, for example, H.261 and H.263, are designed for conversational applications such as video telephony and conferencing. Video coding standards typically comprise building blocks including discrete cosine transform (DCT), motion estimation (ME) or motion compensation (MC), quantization, and variable length coding (VLC). The quantizer step-size used for a frame or a macroblock (MB) impacts the encoded video quality, and an appropriate rate control algorithm should be utilized to determine the quantizer step-size for a given application and coding environment. Thus, rate control has been studied extensively.
Rate control algorithms can generally be classified into two categories, single-pass and multi-pass, according to the number of encoding passes over the video sequence. Single-pass rate control algorithms encode the given video sequence only once, and are used for real-time encoding applications where future frames are not available and long encoding delay is not permitted. Exemplary single-pass algorithms are described in MPEG-2 Test Model 5 (TM5) Doc., Test Model Editing Committee, ISO/IEC JTC1/SC29/WG11/93-255b, Apr. 1993; C. Crecos and J. Jiang, “On-line improvement of the rate-distortion performance in MPEG-2 rate control,” IEEE Trans. Circuits Syst. Video Technol., pp. 519-528, June 2003; T. Chiang and Y.-Q. Zhang, “A new rate control scheme using a new rate-distortion model,” IEEE Trans. Circuits Syst. Video Technol., pp. 246-250, Feb. 1997; F. Pan, Z. Li, K. Lim, and G. Feng, “A study of MPEG-4 rate control scheme and its improvements,” IEEE Trans. Circuits Syst. Video Technol., pp. 440-446, May 2003; J. Ribas-Corbera and S. Lei, “Rate control in DCT video coding for low-delay communications,” IEEE Trans. Circuits Syst. Video Technol., pp. 172-185, Feb. 1999; and Z. He, Y. K. Kim, and S. K. Mitra, “Low-delay rate control for DCT video coding via rho-domain source modeling,” IEEE Trans. Circuits Syst. Video Technol., pp. 928-940, Aug. 2001.
Multi-pass rate control algorithms encode the given video sequence several times to optimally determine the quantizer step-size, and are typically utilized in applications where encoding delay is not a concern or the encoding procedure can be executed offline, for example, video streaming and storage. Exemplary multi-pass algorithms are described in A. Ortega, K. Ramchandran, and M. Vetterli, “Optimal trellis-based buffered compression and fast approximation,” IEEE Trans. Image Processing, pp. 26-40, Jan. 1994; K. Ramchandran, A. Ortega, and M. Vetterli, “Bit allocation for dependent quantization with applications to multi-resolution and MPEG video coders,” IEEE Trans. Image Processing, pp. 533-545, Sept. 1994; L. J. Lin and A. Ortega, “Bit-rate control using piecewise approximated rate-distortion characteristics,” IEEE Trans. Circuits Syst. Video Technol., pp. 446-459, Aug. 1998; and W. Ding and E. Liu, “Rate control of MPEG video coding and recording by rate-quantization modeling,” IEEE Trans. Circuits Syst. Video Technol., pp. 12-20, Feb. 1996.
Rate control algorithms can also be classified into three categories according to how the quantizer step-size is determined: direct buffer-state feedback methods, model-based analytical methods, and operational rate-distortion (R-D) modeling methods. Direct buffer-state feedback methods determine the quantizer step-size based on the level of buffer fullness and activity. Model-based analytical methods utilize several rate and distortion models for rate control; for example, rate is modeled as a quadratic function of the quantizer step-size. In general, direct buffer-state feedback and model-based analytical methods fall into the category of single-pass rate control algorithms. Operational R-D modeling methods may use dynamic programming and Lagrange optimization methods to determine the quantizer step-size of frames within a group of pictures (GOP). Excessive computational complexity prevents most operational R-D modeling methods from being used for real-time rate control. Model-based operational R-D modeling has been proposed to reduce computational complexity, where models are used to predict the R-D characteristics of the input video sequence from a limited number of control points at the expense of accuracy. These methods, however, are still computationally burdensome for real-time encoding and require some encoding delay.
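As an illustration of a model-based analytical method, the following is a minimal sketch of how a quantizer step-size may be obtained from a quadratic rate model. It assumes the commonly cited form R = x1*MAD/Q + x2*MAD/Q^2, which is quadratic in 1/Q; the function names, the parameter names x1 and x2, and the numeric values are hypothetical and are not taken from any of the cited references.

```python
import math


def quadratic_rate(q_step, mad, x1, x2):
    """Bits predicted by the assumed model R = x1*MAD/Q + x2*MAD/Q^2."""
    return x1 * mad / q_step + x2 * mad / (q_step ** 2)


def solve_q_step(target_bits, mad, x1, x2):
    """Invert the assumed quadratic model for the quantizer step-size Q.

    With u = 1/Q the model becomes a*u^2 + b*u - target_bits = 0,
    where a = x2*MAD and b = x1*MAD. The positive root is used.
    """
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    a = x2 * mad
    b = x1 * mad
    if abs(a) < 1e-12:
        # Model degenerates to the first-order form R = x1*MAD/Q.
        if b <= 0:
            raise ValueError("model has no positive solution")
        return b / target_bits
    disc = b * b + 4.0 * a * target_bits
    if disc < 0:
        raise ValueError("model has no real solution")
    u = (-b + math.sqrt(disc)) / (2.0 * a)
    if u <= 0:
        raise ValueError("model has no positive solution")
    return 1.0 / u


if __name__ == "__main__":
    # Hypothetical model parameters and frame statistics.
    bits = quadratic_rate(q_step=10.0, mad=5.0, x1=100.0, x2=2000.0)
    print(bits, solve_q_step(bits, mad=5.0, x1=100.0, x2=2000.0))
```

In a real encoder the resulting step-size would additionally be clipped to the range permitted by the standard and mapped to a quantization parameter; those steps are omitted here.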
H.264 is a video coding standard for both conversational and non-conversational applications. It achieves a significant coding gain over other coding standards by introducing various new coding techniques, such as intra prediction, and various block shapes and multiple reference frames for inter prediction, together with R-D optimized motion estimation and mode decision, which will be called RDO hereafter.
Several single-pass rate control algorithms have been proposed for H.264 encoding. An exemplary partial two-pass algorithm for an H.264 encoder, proposed by S. Ma, W. Gao, P. Gao, and Y. Lu in “Rate control for advance video coding (AVC) standard,” in Proc. Int. Conference, Circuits Syst., pp. 25-28, May 2003, provides MB-layer rate control based on the MPEG-2 TM5 rate control algorithm. Given the target bit-rate for the frame, the quantizer parameter (QP) of the previous MB is used to perform RDO for the current MB, so that the previous QP serves as an estimated QP for the current MB. After RDO, a new QP for the current MB is decided based on the level of buffer fullness and the MB activity. The residual signal of the current MB is quantized with the original estimated QP if the difference between the previous QP and the new QP is below a threshold. Otherwise, RDO is performed again for the current MB using the new QP, and the residual signal is quantized with the new QP.
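The control flow of the partial two-pass MB-layer scheme described above can be sketched as follows. This is only an outline under stated assumptions: the callbacks run_rdo, decide_qp, and quantize are hypothetical placeholders for encoder internals, the default threshold value is arbitrary, and returning the QP actually used so that it can serve as the previous QP for the next MB is an assumption rather than something stated in the cited paper.

```python
def encode_mb_partial_two_pass(mb, prev_qp, run_rdo, decide_qp, quantize,
                               threshold=2):
    """Encode one macroblock with at most two RDO passes.

    run_rdo(mb, qp)         -> mode decision / residual (hypothetical)
    decide_qp(mb, mode)     -> new QP from buffer fullness and MB activity
                               (hypothetical)
    quantize(mb, mode, qp)  -> quantized residual (hypothetical)
    """
    # First pass: the previous MB's QP serves as the estimated QP.
    mode = run_rdo(mb, prev_qp)

    # The new QP can only be decided after RDO has produced the residual.
    new_qp = decide_qp(mb, mode)

    if abs(new_qp - prev_qp) < threshold:
        # QPs are close enough: keep the first-pass result and estimated QP.
        used_qp = prev_qp
    else:
        # QPs differ too much: repeat RDO with the new QP.
        mode = run_rdo(mb, new_qp)
        used_qp = new_qp

    return quantize(mb, mode, used_qp), used_qp
```

The second RDO pass is executed only for MBs whose QP changes by at least the threshold, which is why the scheme is described as partially two-pass.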
A single-pass frame-layer rate control algorithm for an H.264 encoder based on the quadratic rate model has been proposed by Z. G. Li, F. Pan, K. P. Lim, G. N. Feng, X. Lin, S. Rahardja, and D. J. Wu in “Adaptive frame layer rate control for H.264,” in Proc. Int. Conference, Multimedia Expo, pp. 581-584, June 2003. Since the residual signal is not available before RDO due to the interdependency between RDO and rate control, a linear model has been introduced to predict the mean absolute difference (MAD) of the residual signal of the current frame from that of the previous frame. The linear model, however, cannot estimate the MAD of the current frame precisely, particularly for sequences where the amount of motion varies significantly from frame to frame.
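The limitation of linear MAD prediction can be illustrated with a minimal sketch. It assumes the predictor takes the form MAD_pred = a1*MAD_prev + a2 with fixed coefficients; the function names, the coefficient names a1 and a2, and the sample MAD values are hypothetical, and the coefficient update that a practical encoder would perform after each frame is omitted.

```python
def predict_mad(prev_mad, a1=1.0, a2=0.0):
    """Predict the current frame's MAD from the previous frame's MAD."""
    return a1 * prev_mad + a2


def prediction_errors(actual_mads, a1=1.0, a2=0.0):
    """Absolute prediction error for every frame after the first."""
    errors = []
    for prev, cur in zip(actual_mads, actual_mads[1:]):
        errors.append(abs(predict_mad(prev, a1, a2) - cur))
    return errors


if __name__ == "__main__":
    # Hypothetical sequence: steady motion, then an abrupt increase.
    mads = [2.0, 2.0, 8.0, 8.0]
    print(prediction_errors(mads))
```

With these sample values the prediction is exact while the MAD is steady and misses by the full size of the jump at the frame where the motion changes, which is the kind of frame for which the rate model most needs an accurate MAD.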
Many rate control algorithms developed for H.264 fail to address the problems caused by the interdependency between RDO and rate control, because the residual signal is not yet available when the new QP for the MB or frame is determined. Another problem is that the importance of the number of header bits in rate control for H.264 encoders is disregarded. H.264 differs from other video coding standards in that intra prediction and multiple reference frames with variable block sizes are used for motion estimation and mode decision. Since this kind of information must be encoded, header bits account for a large portion of the total number of bits, and their number varies from frame to frame and from MB to MB. Therefore, accurate information on the number of header bits is required in rate control algorithms for H.264, because it cannot be estimated using the level of buffer fullness and the rate model.
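One way to see why the header bit count matters is to consider how a frame's bit budget might be split. The sketch below assumes, as in many model-based schemes, that the rate model describes only the bits spent on the quantized residual, so the header bits must be subtracted from the frame target before the model is applied; the function name and the numbers are hypothetical.

```python
def texture_bit_budget(frame_target_bits, header_bits, min_texture_bits=0):
    """Bits left for the quantized residual after header bits are spent.

    Assumption: the rate model covers residual (texture) bits only.
    """
    return max(frame_target_bits - header_bits, min_texture_bits)


if __name__ == "__main__":
    # Hypothetical frame: a 10000-bit target with 3500 header bits.
    print(texture_bit_budget(10000, 3500))
```

If the header bit count is misjudged, the residual budget and therefore the QP chosen from the rate model are shifted by the same error.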