Quantization Vs. Perceptual Quality
In video processing, quantization is a lossy compression technique achieved by compressing a range of values to a single quantum value. When a video frame is quantized in any system, information is lost. For example, typical video encoders (e.g., MPEG-2, MPEG-4, H.264, VC-1, etc.) can compress video streams by discarding information that is intended to be perceptually the least significant—information without which the decoded (decompressed) video can still closely resemble the original video. The difference between the original and the decompressed video resulting from quantization is sometimes referred to as “quantization noise.” The amount of information discarded during encoding depends on how the video stream is quantized. Each video compression format defines a discrete set of quantization settings, and each quantization setting has an abstract identifier, denoted as a quantization parameter (QP). The QP can be arbitrarily defined as, for example, an integer that indexes an array of quantization settings such that quantization noise introduced by a smaller QP value of X is less than the quantization noise introduced by a larger QP value of X+1. The quantization settings indexed by a given QP value can be different for each video codec.
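The QP-indexed array of quantization settings described above can be illustrated with a minimal sketch. The table QP_TO_STEP, its step sizes, and the function names are hypothetical and chosen only for illustration; actual codecs define their own mappings from QP to quantization settings. The sketch uses a uniform scalar quantizer and measures quantization noise as the mean squared difference between original and reconstructed values.

```python
import math

# Hypothetical table: each QP indexes one quantization setting (here, a step size).
# Real codecs define their own, different mappings.
QP_TO_STEP = [1, 2, 4, 8, 16]

def quantize(values, qp):
    """Map each value to the index of its nearest quantum for the given QP."""
    step = QP_TO_STEP[qp]
    return [math.floor(v / step + 0.5) for v in values]

def dequantize(levels, qp):
    """Reconstruct approximate values from quantum indices."""
    step = QP_TO_STEP[qp]
    return [level * step for level in levels]

def quantization_noise(values, qp):
    """Mean squared error between the original and reconstructed values."""
    restored = dequantize(quantize(values, qp), qp)
    return sum((v - r) ** 2 for v, r in zip(values, restored)) / len(values)

samples = list(range(64))  # stand-in for pixel or coefficient values
noise_by_qp = [quantization_noise(samples, qp) for qp in range(len(QP_TO_STEP))]
print(noise_by_qp)  # [0.0, 0.5, 1.5, 5.5, 21.5]
```

For this sample data, the noise introduced by each QP value X is less than the noise introduced by X+1, which matches the ordering assumed in the text.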
If too much information is discarded during quantization, the video stream may appear distorted when it is decompressed during playback. This is the basic relationship between quantization and perceptual quality. Because the QP indicates how much information is discarded when encoding a video stream, the QP may be used as an indicator of perceptual quality. In particular, when the QP value is smaller, more information is retained. As the QP value is increased, however, more information is discarded because a wider range of values is mapped to each quantum value. The bitrate drops as a result, at the cost of some loss of quality and some increase in distortion.
Quantization Vs. Bitrate
In video processing, bitrate refers to the number of bits used per unit of playback time to represent a continuous video after encoding (data compression). Different sections of a video stream can naturally require a different number of bits to be represented even when they consist of the same number of pixels and are encoded with the same QP. A given section of a video stream quantized with a higher QP value of X+1, however, will never require more bits than the same section quantized with a smaller QP value of X, assuming all other encoding parameters are held constant and assuming a higher QP value represents a coarser quantization (more information loss). In practice, this means that the average bitrate requirement decreases when the QP is increased. FIG. 4 shows how the bitrate of an encoded video stream decreases as the QP increases. The bitrate numbers and the QP values in FIG. 4 are only examples. In practice, the numbers and the values can be different, and the correlation between QP values and bitrate can vary for different video sequences.
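As a small worked illustration of the bitrate definition above, the following sketch computes the average bitrate of a run of encoded frames. The function name, the frame sizes, and the frame rate are hypothetical values chosen for the example and are not taken from FIG. 4.

```python
def average_bitrate(frame_sizes_bits, frames_per_second):
    """Bits per second of playback time for a sequence of encoded frames."""
    total_bits = sum(frame_sizes_bits)
    # Playback time is len(frame_sizes_bits) / frames_per_second seconds.
    return total_bits * frames_per_second / len(frame_sizes_bits)

# Hypothetical encoded frame sizes, in bits, played back at 30 frames per second.
sizes = [40_000, 10_000, 10_000]
bitrate = average_bitrate(sizes, 30)
print(bitrate)  # 600000.0
```

Frames of the same pixel dimensions can differ in encoded size, as in this example, so bitrate is meaningful only as an average over some interval of playback time.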
The value of QP can be dynamically changed throughout the video stream by the video encoder. For example, each frame within the video stream can be assigned its own QP value, and that value will be used to quantize all pixels within that frame. Frames assigned higher QP values would therefore undergo coarser quantization and result in fewer encoded bits than similar frames quantized with lower QP values. Some video encoders change the QP value at the frame level, for example, to keep the average bitrate of the encoded stream at a relatively constant level: when the bitrate starts to exceed a predefined level, the QP value(s) for subsequent frame(s) can be increased, and, conversely, when the bitrate falls below the predefined level, the QP value(s) for subsequent frame(s) can be decreased.
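The frame-level adjustment described above can be sketched as a simple feedback rule. This is a minimal illustration and not the rate control of any particular encoder; the function name, the fixed adjustment step, and the QP bounds are assumptions (the range 0 to 51 is used here only as a plausible example range).

```python
# Assumed QP bounds for illustration; the valid range depends on the codec.
QP_MIN, QP_MAX = 0, 51

def next_frame_qp(current_qp, measured_bitrate, target_bitrate, step=1):
    """Pick the QP for the next frame from the measured average bitrate."""
    if measured_bitrate > target_bitrate:
        candidate = current_qp + step   # coarser quantization, fewer bits
    elif measured_bitrate < target_bitrate:
        candidate = current_qp - step   # finer quantization, more bits
    else:
        candidate = current_qp
    # Keep the result inside the allowed QP range.
    return max(QP_MIN, min(QP_MAX, candidate))

print(next_frame_qp(30, 5_500_000, 5_000_000))  # 31
print(next_frame_qp(30, 4_000_000, 5_000_000))  # 29
```

Note that this rule assigns a single QP to the whole frame, regardless of how detail is distributed within it.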
Frame-level QP modification, however, does not take into account the fact that the visual information included in a given frame is rarely distributed equally across the frame. More typically, a video frame has some “flat” regions with relatively constant color and brightness (e.g., sky, grass, water, walls), while other regions have more detail (e.g., a person's face, text, or any other object characterized by abrupt color and/or brightness changes). Regions characterized by different levels of detail or pixel variance may also differ significantly in compressibility, that is, in the degree to which they can be compressed without significant degradation of perceptual quality.
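The distinction between flat and detailed regions can be made concrete by measuring pixel variance per block. The sketch below is illustrative only: the block size, the threshold value, the labels, and the function names are assumptions, and the frame is represented as a plain list of rows of brightness values.

```python
def block_variance(frame, top, left, size):
    """Population variance of the pixel values in one square block."""
    pixels = [frame[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)

def classify_blocks(frame, size, flat_threshold=100.0):
    """Label each full block as 'flat' or 'detailed' (threshold is hypothetical)."""
    rows, cols = len(frame), len(frame[0])
    labels = {}
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            variance = block_variance(frame, top, left, size)
            labels[(top, left)] = "flat" if variance < flat_threshold else "detailed"
    return labels

# 4x8 test frame: constant brightness on the left, a checkerboard on the right.
frame = [[100] * 4 + [0 if (r + c) % 2 == 0 else 255 for c in range(4)]
         for r in range(4)]
labels = classify_blocks(frame, 4)
print(labels)  # {(0, 0): 'flat', (0, 4): 'detailed'}
```

In this example the constant block has zero variance, while the checkerboard block, with its abrupt brightness changes, has a very high variance.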