1. Field
The present invention relates generally to video processing, and more specifically, to methods and systems for providing efficient control of bit rate and quality in video encoders.
2. Background
The development of an efficient encoder control mechanism is a key issue in video coding. The goal of a typical encoder control mechanism is to generate compressed streams at the given target bit rates with minimum distortion.
Internationally adopted video coding standards (such as MPEG1, MPEG2, MPEG4, H.263, and H.264) are based on the hybrid coding architecture. These standards reduce the temporal and spatial redundancies in the video data through motion compensated predictions and transformations, and employ entropy-coding techniques to achieve high compression ratios. The normative specifications of video coding standards provide the bit-stream syntax and the video decoding process to enable interoperability. Apart from the interoperability aspects, the video encoding process in general falls outside the scope of the standardization process. In this respect, the encoder control algorithm and the motion estimation process are the two most important components where the designers of video encoders have the flexibility to apply their ingenuity and come up with low cost and efficient solutions. The coding efficiency and computational load of a video encoder are primarily dependent upon the effectiveness of its operational control algorithm and its motion estimation strategy.
Most video encoding applications require video sequences to be encoded at a prescribed rate with minimum possible distortion. The rate-distortion efficiency of a video encoder largely depends upon its operational control algorithm. The control algorithm has to adjust the numerous coding parameters in a video encoder so as to maximize its coding efficiency, without violating the bit rate limits. The control algorithm is responsible for, amongst other things, dynamically selecting the optimum quantization parameters, picture types, pixel block modes, and pixel block partitions. Problems concerning the control algorithm are made complex by the intricate interaction between the widely varying content and motion in typical video sequences, and by the spatial and temporal dependencies between the different coding parameters. These problems are further compounded by the non-linear sensitivity of the human visual system (HVS) to distortions of different types.
The issue of encoder control became more significant with the arrival of the new H.264/AVC (ISO/IEC 14496-10) video coding standard, since this standard offers many more coding options than the previous standards. The H.264/AVC standard delivers much higher compression efficiency than the earlier standards. However, this higher compression efficiency comes at the cost of much higher computational complexity. The encoder has to select between numerous Inter and Intra macroblock prediction modes to obtain the optimum encoding mode. Such selection is a critical and time-consuming step, and the impressive bit rate reduction of H.264/AVC largely depends on it.
In the H.264/AVC reference encoder software, the selection of the optimum encoding mode is done by an algorithm known as rate-distortion optimization. The basic idea behind this algorithm is to minimize the distortion (D) subject to a constant rate (R), or to minimize R subject to a fixed D. Rate-distortion optimization solves this problem by introducing a Lagrange multiplier λ to convert the constrained optimization problem into an unconstrained optimization problem, and minimizing the Lagrangian function D+λR. An ideal full-scale optimization search over every picture and coding mode would be prohibitively resource intensive. In practice, preliminary experiments are performed using a large number of Lagrange multiplier values to determine approximate relationships between λ and D for different fixed quantization parameters (Q). A set of rate-distortion curves with one curve for each Q is thus obtained. The slope of these curves at a particular (R, D) point determines the value of λ. The optimum rate-distortion relationship is obtained by taking the minimum of the rate-distortion curves and, in turn, generating an approximate relationship between λ and Q.
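The conversion from the constrained to the unconstrained problem described above can be written out as follows. This is only an illustrative sketch: the symbols m (a candidate coding mode), \mathcal{M} (the set of feasible modes), R_T (the target rate), and J (the Lagrangian cost) are notational assumptions introduced here, and the rate constraint is written as an upper bound for concreteness. The slope condition in the last line is the standard relation between λ and a convex rate-distortion curve.

```latex
\begin{aligned}
&\text{Constrained form:}   && \min_{m \in \mathcal{M}} D(m) \quad \text{subject to} \quad R(m) \le R_T \\
&\text{Unconstrained form:} && \min_{m \in \mathcal{M}} J(m), \qquad J(m) = D(m) + \lambda R(m), \quad \lambda \ge 0 \\
&\text{Slope condition:}    && \lambda = -\frac{dD}{dR}
\end{aligned}
```

A larger λ penalizes rate more heavily and so favors cheaper, higher-distortion modes, while a smaller λ favors lower distortion at the expense of more bits.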
A practical rate-distortion optimization process then uses the approximate relationship between λ and Q to select λ, and involves an exhaustive evaluation of all feasible modes to determine the bits and distortion of each mode. The process then evaluates a Lagrangian metric (D+λR) that considers both bit rate and distortion, and selects the mode that minimizes this metric. The resulting bit rate R may violate the limits on the desired bit rate, thus necessitating the use of a rate controller. The usual way to control the bit rate is to vary Q from pixel block to pixel block, increasing Q in the case of buffer overflow and decreasing Q in the case of buffer underflow.
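The mode decision and the buffer-driven adjustment of Q described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the reference software's implementation: the names ModeCost, select_mode, and adjust_q are hypothetical, the distortion and bit counts are assumed to have been measured already for each candidate mode, and the buffer thresholds and unit step size are arbitrary choices. The clamp to the range 0 to 51 reflects the H.264/AVC quantization parameter range.

```python
from dataclasses import dataclass


@dataclass
class ModeCost:
    """Measured cost of coding one pixel block in one candidate mode (hypothetical type)."""
    name: str          # illustrative mode label
    distortion: float  # D, e.g. sum of squared differences
    bits: float        # R, bits needed to code the block in this mode


def select_mode(candidates, lam):
    """Return the candidate that minimizes the Lagrangian metric J = D + lam * R."""
    return min(candidates, key=lambda m: m.distortion + lam * m.bits)


def adjust_q(q, buffer_fullness, low=0.2, high=0.8, q_min=0, q_max=51):
    """Nudge Q by one step based on buffer fullness in [0, 1].

    The thresholds `low` and `high` are assumptions for illustration.
    A nearly full buffer raises Q (coarser quantization, fewer bits);
    a nearly empty buffer lowers Q (finer quantization, more bits).
    """
    if buffer_fullness > high:
        q += 1
    elif buffer_fullness < low:
        q -= 1
    # Keep Q inside the valid quantization parameter range.
    return max(q_min, min(q_max, q))
```

For instance, given three made-up candidates with (D, R) of (900, 40), (400, 60), and (250, 120), select_mode picks the second at λ = 5 and the third at λ = 0.5, which shows how a smaller λ shifts the decision toward lower distortion at a higher bit cost. Note that every candidate must be fully evaluated before the minimum can be taken, which is the source of the computational burden discussed next.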
Despite the above simplifications, the exhaustive evaluation of all feasible modes in rate-distortion optimization presents a major hurdle in the implementation of H.264/AVC compliant encoders, particularly in real-time, load-constrained environments. This fact is very significant in consumer electronics, where the success of a system depends largely on its cost competitiveness, and where DSPs (digital signal processors) and other devices having low or limited computing power are frequently used. In addition, the rate-distortion optimization algorithm of the H.264/AVC reference software does not consider the response of the HVS to distortions of different types. This is an important omission, as more emphasis needs to be given to distortions that are easily detected by the human eye and less emphasis to those that are not easy to perceive. By distributing the available bits judiciously among different parts of the image and by taking into consideration the behavior of the HVS, higher quality encoded video can be generated.
Hence, it would be desirable to provide methods and systems that are capable of providing, amongst other things, a low cost and efficient operational encoder control structure that can be deployed in H.264/AVC based systems.