The Moving Picture Experts Group (“MPEG”) has defined a standard bit stream syntax (the MPEG standard) for the coded representation of video. However, the MPEG standard allows for certain flexibility in the design of video encoders, which may, for example, optimize performance by adding sophistication and/or make certain compromises between improving image quality and conserving a low bit rate.
One element of the MPEG bit stream syntax is the quantization step size (“Q”). In typical video coding, the quality of the image and the bit rate of the coded video are inversely related to Q. A higher quantization step size uses fewer bits to encode the image, but the resulting image quality may suffer. A lower quantization step size requires more bits to encode a given video scene and, as a result, produces a higher quality image. In some implementations (e.g., MPEG), the quantization values can differ for individual image blocks within a frame.
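The trade-off described above can be illustrated with a minimal sketch of uniform scalar quantization. The coefficient values are made up for illustration, and the count of nonzero quantized levels is used only as a rough stand-in for bit cost (an assumption; a real encoder entropy-codes the levels). The function names are hypothetical and are not taken from the MPEG standard.

```python
def quantize(coeffs, q):
    """Map each coefficient to an integer level using step size q.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    return [round(c / q) for c in coeffs]


def dequantize(levels, q):
    """Reconstruct approximate coefficients from integer levels."""
    return [level * q for level in levels]


def mean_squared_error(original, reconstructed):
    """Average squared difference between two equal-length sequences."""
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


def summarize(coeffs, q):
    """Return (nonzero level count, reconstruction MSE) for step size q."""
    levels = quantize(coeffs, q)
    nonzero = sum(1 for level in levels if level != 0)
    mse = mean_squared_error(coeffs, dequantize(levels, q))
    return nonzero, mse


# Illustrative (made-up) transform coefficients for one block.
coeffs = [52.0, -23.0, 11.0, -6.0, 3.0, -2.0, 1.0, 0.5]

low_q_nonzero, low_q_mse = summarize(coeffs, 2)     # fine quantization
high_q_nonzero, high_q_mse = summarize(coeffs, 16)  # coarse quantization

print("Q=2 :", low_q_nonzero, "nonzero levels, MSE", low_q_mse)
print("Q=16:", high_q_nonzero, "nonzero levels, MSE", high_q_mse)
```

With the larger step size, fewer levels remain nonzero (fewer bits under the stated assumption) while the reconstruction error grows, which is the behavior the paragraph describes.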
Conventional methods for selecting the values of Q include uniform quantization and adaptive quantization. Uniform quantization uses the same (or nearly the same) Q for each block within a frame. As a result, quantization noise and coding artifacts caused by the compression of data are uniformly distributed throughout the frame, regardless of the activity levels within each frame. In contrast, adaptive quantization permits the variation of Q among different sectors or blocks within a frame so that the quantization noise can be distributed among blocks in a frame in a non-uniform manner. The goal of adaptive quantization is to optimize the visual quality of each video scene and from scene to scene, while maintaining a predefined bit rate. For example, since the human eye is less sensitive to quantization noise and coding artifacts in busy or highly textured parts of images, a higher Q may be used for busy regions of the scene. Conversely, for low-textured scenes, a lower Q is used to improve video quality for that particular scene, at the cost of a higher bit rate.
Although the MPEG standard allows for adaptive quantization, it does not prescribe any particular implementation of adaptive quantization. MPEG-2 Test Model 5 (“TM5”) is one example of an adaptive quantization technique for improving subjective visual quality according to metrics such as spatial frequency response and visual masking response.
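As a rough illustration of the TM5 approach, the sketch below applies the normalized-activity formula commonly associated with TM5, N_act = (2·act + avg_act) / (act + 2·avg_act), to scale a base quantizer per macroblock. It is a simplified sketch under stated assumptions: the activity values are made up, the average is taken over the current list rather than over the previously encoded picture, and the function names are hypothetical.

```python
def tm5_normalized_activity(act, avg_act):
    """TM5-style normalized activity; the result lies between 0.5 and 2.0."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def tm5_mquant(base_q, act, avg_act):
    """Scale a base quantizer by normalized activity, clipped to [1, 31]."""
    mquant = round(base_q * tm5_normalized_activity(act, avg_act))
    return max(1, min(31, mquant))


# Made-up per-macroblock activities (for example, 1 + minimum sub-block variance).
activities = [1.0, 50.0, 400.0, 5000.0]

# Simplification: average over this list instead of the previous picture.
avg_act = sum(activities) / len(activities)

base_q = 10  # assumed quantizer supplied by the rate-control step
mquants = [tm5_mquant(base_q, act, avg_act) for act in activities]

for act, mq in zip(activities, mquants):
    print("activity", act, "-> mquant", mq)
```

Macroblocks with below-average activity receive a quantizer smaller than the base value, and busy macroblocks receive a larger one, bounded to at most a factor of two in either direction.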
One technique for measuring image quality is to apply aspects of the human visual system (“HVS”) to video scenes. For example, human sensitivity to quantization noise and coding artifacts is lower in areas of a video scene having very high or very low brightness (contrast sensitivity). In busy image areas (e.g., areas of high texture, large contrast, and/or signal variance), the sensitivity of the HVS to distortion decreases because the quantization noise and coding artifacts are lost in complex patterns. However, in images with low variation, human sensitivity to contrast and distortion increases.
The artifacts that occur when pictures are coded at low bit rates include blockiness, blurriness, ringing, and color bleeding. In moving video scenes, these artifacts appear as run-time busyness and as dirty uncovered backgrounds. The local variance of a video signal is often noticeable to the HVS on a very small scale, such as from pixel to pixel or among groups of blocks of pixels (referred to herein as “blocks”). As a result, the quantization step size Q may be calculated for each block or other small subunit of area (“sector”) within each video frame. Accordingly, the quantization step size is typically proportional to a measurement of activity within each block or sector.
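A per-block activity measurement of the kind described above can be sketched using pixel variance as the activity metric. This is one possible metric among many, chosen here for illustration. The block size, the tiny synthetic frame, and the mapping from variance to Q (including the `q_min`, `q_max`, and `scale` parameters) are all hypothetical choices, not values prescribed by the MPEG standard.

```python
def split_into_blocks(frame, size):
    """Split a 2-D list of pixels into size x size blocks, in raster order."""
    blocks = []
    for top in range(0, len(frame), size):
        for left in range(0, len(frame[0]), size):
            rows = frame[top:top + size]
            blocks.append([row[left:left + size] for row in rows])
    return blocks


def block_variance(block):
    """Population variance of the pixel values in one block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def q_from_activity(variance, q_min=2, q_max=31, scale=0.05):
    """Hypothetical mapping: Q grows with activity, clamped to [q_min, q_max]."""
    return max(q_min, min(q_max, round(q_min + scale * variance)))


# Synthetic 4x8 frame: a flat left half and a checkerboard right half.
frame = [
    [128] * 4 + [0 if (r + c) % 2 == 0 else 255 for c in range(4)]
    for r in range(4)
]

blocks = split_into_blocks(frame, 4)
variances = [block_variance(b) for b in blocks]
step_sizes = [q_from_activity(v) for v in variances]

for v, q in zip(variances, step_sizes):
    print("variance", v, "-> Q", q)
```

The flat block has zero variance and receives the smallest step size, while the checkerboard block has high variance and receives the largest, consistent with assigning coarser quantization to busier regions.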
In Variable Bit Rate (VBR) and Constant Bit Rate (CBR) video encoding, it is desirable to maintain the quality of the image throughout the frame and stream even as the activity varies from frame to frame. Because the quality of the image is tightly coupled to the quantization value Q used during encoding, it is desirable to control quantization in a manner that provides uniform video quality. Thus, there is a need for techniques that select encoding parameters based on different activity levels and quality measures. Such techniques may be more suitable in certain implementations, for example where image quality is to be increased or maintained while allowing for a simplified encoder design, and would allow for an optimal selection of the activity metric used during the encoding process, as it directly affects the Human Visual System (HVS).