The demand for digital video products continues to increase. Some examples of applications for digital video include video communication, security and surveillance, industrial automation, and entertainment (e.g., DV, HDTV, satellite TV, set-top boxes, Internet video streaming, digital cameras, video jukeboxes, high-end displays and personal video recorders). Further, video applications are becoming increasingly mobile as a result of higher computation power in handsets, advances in battery technology, and high-speed wireless connectivity.
Video compression is an essential enabler for digital video products. Compression-decompression (CODEC) algorithms enable storage and transmission of digital video. Typically, codecs follow industry standards such as MPEG-2, MPEG-4, and H.264/AVC. At the core of all of these standards is the hybrid video coding technique of block motion compensation (prediction) plus transform coding of the prediction error. Block motion compensation is used to remove temporal redundancy between successive pictures (frames or fields) by prediction from prior pictures, whereas transform coding is used to remove spatial redundancy within each block.
Many block motion compensation schemes assume that between successive pictures, i.e., frames, in a video sequence, an object in a scene undergoes a displacement in the x- and y-directions, and that these displacements define the components of a motion vector. Thus, an object in one picture can be predicted from the object in a prior picture by using the motion vector of the object. To track visual differences from frame to frame, each frame is tiled into blocks often referred to as macroblocks. Block-based motion estimation algorithms are used to generate a set of vectors that describe block motion flow between frames, thereby constructing a motion-compensated prediction of a frame. The vectors are determined using block-matching procedures that try to identify, for each block in the current frame, the most similar block among those that have already been encoded in prior frames.
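As an illustration of the block-matching idea described above, the following is a minimal sketch of an exhaustive (full) search that uses the sum of absolute differences (SAD) as the similarity measure. The choice of SAD, the function names `sad` and `full_search`, and the representation of frames as lists of rows of luma samples are assumptions made for this example; practical encoders typically use faster search strategies.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return ((dx, dy), cost) for the displacement within +/- search_range
    samples that minimizes the SAD for the block at (bx, by) in `cur`.
    Frames are lists of rows; candidates outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

Calling `full_search(cur, ref, bx, by, 16, 8)` would return the motion vector, relative to the block position, of the best-matching 16x16 block within a search window of 8 samples in each direction, together with its SAD.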
Many video codecs (e.g., H.264 video codecs) select from among a variety of coding modes to encode video data as efficiently as possible. In many instances, the best compression mode for a macroblock is determined by selecting the mode with the best compression performance, i.e., with the minimum rate-distortion (R-D) cost:

$$\mathrm{Cost} = \mathrm{Distortion}_{\mathrm{Mode}} + \lambda \cdot \mathrm{Rate}_{\mathrm{Mode}} \tag{1}$$

where $\lambda$ is the Lagrangian multiplier, $\mathrm{Rate}_{\mathrm{Mode}}$ is the bit-rate of a mode, and $\mathrm{Distortion}_{\mathrm{Mode}}$ is the distortion (loss of image quality) for a mode. An accurate R-D cost may be obtained by actually coding a macroblock in all the modes and using information from the coding process to determine the distortion and bit-rate. For example, to determine the bit-rate of a macroblock encoded using a particular mode, the transform of the data in the macroblock is taken, the transformed data is quantized, and then the quantized data is entropy coded to find the bit-rate. However, determination of bit-rates in this manner is computationally complex and may not be suitable for use in real-time video applications with low-power encoders and limited computation resources, such as cellular telephones and video cameras.
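The mode decision based on the cost of equation (1) can be sketched as follows. This is only an illustration: the function names `rd_cost` and `select_mode`, the mode labels, and the numeric distortion and rate values in the usage note are hypothetical and are not taken from any codec.

```python
def rd_cost(distortion, rate, lam):
    """Rate-distortion cost of equation (1): distortion + lambda * rate."""
    return distortion + lam * rate


def select_mode(candidates, lam):
    """Pick the mode with the minimum R-D cost.

    `candidates` maps a mode label to a (distortion, rate) pair, where the
    pair is assumed to have been measured or estimated elsewhere.
    Returns (best_mode, best_cost)."""
    best_mode, best_cost = None, None
    for mode, (distortion, rate) in candidates.items():
        cost = rd_cost(distortion, rate, lam)
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```

For example, with hypothetical candidates `{"intra16x16": (1200, 40), "inter16x16": (900, 95), "skip": (2000, 1)}`, a multiplier of 5 selects `"inter16x16"` (cost 1375), whereas a multiplier of 20 selects `"intra16x16"` (cost 2000). A larger λ penalizes rate more heavily, so the decision shifts toward modes that spend fewer bits.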
To reduce the complexity of determining the bit-rate, techniques for estimating the bit-rate are used. Some known techniques are based on the direct correlation between the spatial information of the data in a macroblock, which is fairly easy to extract, and the actual number of bits required to compress the data. In general, in these techniques, the relationship between the spatial information of the data and the actual bit-rate of the data is modeled by fitting curves in an offline training stage for various quantization parameters and video contents. Finding a one-to-one mapping that yields the bit-rate of the data for the given spatial information may be difficult. Further, even if such a curve is approximated, the curve depends on the content of the training data, which may hinder the generalization of the extracted relationship between the bit-rate and the spatial information to actual data. Other known bit-rate estimation techniques rely on taking the transform of the data in a macroblock and counting the number of non-zero coefficients in the transform domain after applying dead-zone quantization. However, in some applications, even taking the transform and counting the number of non-zero coefficients after quantization can be computationally costly. Accordingly, improvements in bit-rate estimation and rate-distortion cost estimation that further reduce the computational complexity are desirable for real-time, low-power video applications.
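The coefficient-counting step used by the transform-domain techniques mentioned above can be sketched as follows. The sketch assumes that the transform coefficients of a block are already available as a list of rows, and it models the dead-zone quantizer as a uniform quantizer with a rounding offset smaller than one half. The function names and the default offset of 1/6 are illustrative assumptions, not values prescribed by any particular standard, and the mapping from the resulting count to an estimated bit-rate is not shown.

```python
import math


def deadzone_quantize(coeff, step, rounding_offset=1 / 6):
    """Quantize one coefficient with a dead zone around zero.

    A rounding offset below 0.5 widens the interval of inputs that
    quantize to level 0 compared with ordinary rounding."""
    level = int(math.floor(abs(coeff) / step + rounding_offset))
    return -level if coeff < 0 else level


def count_nonzero(coeffs, step, rounding_offset=1 / 6):
    """Count coefficients in a 2-D block that remain non-zero after
    dead-zone quantization with the given step size."""
    return sum(
        1
        for row in coeffs
        for c in row
        if deadzone_quantize(c, step, rounding_offset) != 0
    )
```

With a step size of 10, a coefficient of 6.0 quantizes to level 0 under the assumed offset of 1/6 but to level 1 under an offset of 0.5, which shows how the dead zone reduces the number of non-zero coefficients and hence the count used by such estimators.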