The present invention is directed to methods and apparatuses for digitally encoding a video signal using an adaptive quantization technique that optimizes perceptual video quality while conserving bits. The present invention could be applied to any video coding application in which quantization can be modified within a video frame.
The MPEG Standard
The Moving Picture Experts Group (“MPEG”) has standardized a syntax for the coded representation of video. Only the bit stream syntax for decoding is specified. This leaves flexibility for designing encoders, which may optimize performance by adding sophistication. The MPEG standard also allows for compromise between optimizing image quality and maintaining a low bit rate.
The MPEG video bit stream syntax provides a tool called the quantization parameter (“QP”) for modulating the step size of the quantizer, or data compressor. In typical video coding, the quality and bit rate of the coded video are determined by the value of the QP selected by the encoder. Coarser quantization encodes a given video scene using fewer bits but reduces image quality. Finer quantization uses more bits to encode a given video scene, with the goal of increasing image quality. Often, the quantization values can be modified within a video frame. For example, in MPEG (1, 2, 4) and H.263, there is a QP for each 16×16 image block (or macroblock) of the video scene.
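By way of illustration only, the relationship between step size, bit cost, and fidelity described above can be sketched with a plain uniform scalar quantizer. This is a simplified assumption and not the quantizer of any MPEG or H.263 standard: the coefficient values, the two step sizes, and the function names `quantize`, `dequantize`, and `mean_squared_error` are hypothetical.

```python
def quantize(coeffs, step):
    """Map each coefficient to an integer level using a uniform step size."""
    return [int(round(c / step)) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficient values from integer levels."""
    return [level * step for level in levels]


def mean_squared_error(original, reconstructed):
    """Average squared difference between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


# Made-up transform coefficients for one block (illustrative values only).
coeffs = [52.0, -31.0, 12.0, 7.0, -3.0, 2.0, 1.0, 0.0]

for step in (4, 16):  # hypothetical fine and coarse step sizes
    levels = quantize(coeffs, step)
    nonzero = sum(1 for level in levels if level != 0)
    error = mean_squared_error(coeffs, dequantize(levels, step))
    print(f"step={step:2d} nonzero_levels={nonzero} mse={error:.2f}")
```

With the larger step, fewer levels remain nonzero, which generally costs fewer bits after entropy coding, while the reconstruction error grows.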
Human Visual System As a Factor for Achieving Subjective Image Quality
Early digital image compression techniques sought to transmit an image at the lowest possible bit rate and yet reconstruct the image with a minimum loss of perceived quality. These early attempts used information theory to achieve a minimum mean squared error (“MMSE”). But the human eye does not perceive quality in the mean squared error sense, and the classical coding theory of MMSE did not necessarily yield results pleasing to the human eye. Nor did classical MMSE theory yield pleasing results when applied to moving video scenes.
For certain wavelengths, the human eye can see a single photon of light in a dark room. This sensitivity of the human visual system (“HVS”) also applies to quantization noise and coding artifacts within video scenes. The sensitivity of the HVS changes from one part of a video image to another. For example, human sensitivity to quantization noise and coding artifacts is less in the very bright and very dark areas of a video scene (contrast sensitivity). In busy image areas containing high texture or having large contrast or signal variance, the sensitivity of the HVS to distortion decreases. In these busy areas, the quantization noise and coding artifacts get lost in complex patterns. This is known as a masking effect. In smooth parts of an image with low variation, human sensitivity to contrast and distortion increases. For instance, a single fleck of pepper is immediately noticeable and out of place in a container of salt. Likewise, a single nonfunctioning pixel in a video monitor may be noticeable and annoying if located in a visually uniform area in the center of the monitor's working area, but hardly noticeable at all if lost in the variegated toolbars near the edges.
The objectionable artifacts that occur when pictures are coded at low bit rates are blockiness, blurriness, ringing, and color bleeding. Blockiness is the artifact related to the appearance of the 8×8 discrete cosine transform grid caused by coarse quantization in low-detail areas. This sometimes causes pixelation of straight lines. Blurriness is the result of loss of spatial detail in medium-textured and high-textured areas. Ringing and color bleeding occur at edges on flat backgrounds where high frequencies are poorly quantized. Color bleeding is specific to strong chrominance edges. In moving video scenes, these artifacts show as run-time busy-ness and as dirty uncovered backgrounds. Significant artifacts among frames can result in run-time flicker if they are repetitive.
The local variance of a video signal is often noticeable to the HVS on a very small scale: from pixel to pixel or from macroblock to macroblock. This means that ideally the quantization step size should be calculated for each macroblock or other small subunit of area (“sector”) in a video frame. Accordingly, the quantization step size should be directly proportional to variance or some other measure of activity in each macroblock or sector.
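As a minimal sketch of one such activity measure, the following computes the pixel variance of each block in a luminance frame. The frame representation (a list of rows of pixel values), the function names `block_variance` and `macroblock_variances`, and the tiny 4×4 test frame with 2×2 blocks are assumptions made for illustration; a real encoder would typically use 16×16 macroblocks.

```python
def block_variance(frame, top, left, size):
    """Variance of the pixel values in one size x size block."""
    pixels = [frame[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def macroblock_variances(frame, size=16):
    """Variance of every complete block, returned as a grid of rows."""
    rows, cols = len(frame), len(frame[0])
    return [[block_variance(frame, top, left, size)
             for left in range(0, cols - size + 1, size)]
            for top in range(0, rows - size + 1, size)]


# Hypothetical frame: flat on the left half, checkerboard on the right half.
frame = [
    [10, 10, 0, 255],
    [10, 10, 255, 0],
    [10, 10, 0, 255],
    [10, 10, 255, 0],
]

variances = macroblock_variances(frame, size=2)
print(variances)  # flat blocks give 0.0; checkerboard blocks give a large value
```

Blocks that do not fit completely inside the frame are skipped in this sketch; padding or cropping the frame is left to the caller.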
Adaptive Versus Uniform Quantization
Previously, inventors have used the following two approaches for selecting the values of the QPs: uniform quantization and adaptive quantization.
The uniform quantization method chooses the same (or nearly the same) QP for all the macroblocks in a frame. As a result, quantization noise and coding artifacts caused by the compression of data are uniformly distributed throughout the frame.
The adaptive quantization approach permits different sectors in a video scene to be coded with varying degrees of data compression and therefore varying degrees of fidelity. This approach varies the value of the QP so that the quantization noise is distributed according to at least one property of the HVS. The goal of adaptive quantization is to optimize the visual quality of each video scene and the visual quality from video scene to video scene, while conserving storage bits by keeping the bit rate low. For example, since the human eye is less sensitive to quantization noise and coding artifacts in busy or highly textured sectors, the QP can be increased, resulting in coarser quantization and a lower bit rate requirement in busy regions. Since the human eye is more sensitive to quantization noise and coding artifacts in flat or low-textured sectors, the QP may be decreased to maintain or improve video quality, resulting in finer quantization but a higher bit rate requirement.
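The general idea of raising the QP in busy sectors and lowering it in flat sectors can be sketched as follows. This is an illustrative assumption and not the method of any cited reference: the function name `adapt_qp`, the variance thresholds, the QP offset `delta`, and the assumed valid QP range of 1 to 31 are all hypothetical choices.

```python
def adapt_qp(base_qp, variance,
             flat_threshold=100.0, busy_threshold=1000.0,
             delta=2, qp_min=1, qp_max=31):
    """Return a per-sector QP derived from a frame-level base QP.

    Flat sectors (low variance) get a smaller QP, hence finer quantization.
    Busy sectors (high variance) get a larger QP, hence coarser quantization.
    All thresholds and the offset are placeholder values.
    """
    if variance < flat_threshold:
        qp = base_qp - delta
    elif variance > busy_threshold:
        qp = base_qp + delta
    else:
        qp = base_qp
    # Keep the result inside the assumed legal QP range.
    return max(qp_min, min(qp_max, qp))


for label, variance in (("flat", 5.0), ("medium", 500.0), ("busy", 5000.0)):
    print(label, adapt_qp(10, variance))
```

A three-way classification is used here only for brevity; finer classes or a continuous mapping from activity to QP would follow the same pattern.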
Although the MPEG standard allows for adaptive quantization, algorithms containing rules for the use of adaptive quantization to improve visual quality are not prescribed in the MPEG standard. As a result, two encoders may use completely different adaptive quantization algorithms and each still produce valid MPEG bit streams. MPEG-2 Test Model 5 (“TM5”) is one such adaptive quantization approach that seeks to provide an improved subjective visual quality according to characteristics of the HVS, such as spatial frequency response and visual masking response.
A common problem with some adaptive quantization approaches is that, although they may improve the visual quality in some regions of a video scene, they may also reduce the quality in others. For example, if the number of extra bits needed to refine the detail in some regions of a video scene is fairly high, the number of allotted bits for the remaining regions can be too small, and the quantization noise and coding artifacts in the latter can become quite noticeable and annoying.
Additionally, some macroblocks may contain smooth textures that are difficult to encode because they are poorly predicted, while others may contain highly textured regions that are well predicted and easy to encode. Known methods do not take this into account when adapting the QP.
Description of the Prior Art
Exemplary previous methods that attempt to adapt quantization for each macroblock so that the visual quality perceived by the HVS is uniform throughout the frame are described by the following nonpatent references: “Motion-Compensated Video Coding With Adaptive Perceptual Quantization,” by Puri and R. Aravind; “Adaptive Quantization Scheme For MPEG Video Coders Based on HVS (Human Visual System),” by Sultan and H. A. Latchman; “Classified Perceptual Coding With Adaptive Quantization,” by S. H. Tan, K. K. Pang, and K. N. Ngan; and “A Simple Adaptive Quantization Algorithm For Video Coding,” by N. I. Choo, H. Lee, and S. U. Lee. The methods described by these references each suffer from at least one drawback. All the methods in the above references classify macroblocks according to texture content, but do not take into account the effect of prediction accuracy on bit rate. Some macroblocks are predicted accurately and require few bits to be encoded, but others of similar texture are not predicted accurately and may require many bits to be encoded. This variability in the bit requirements for similarly textured macroblocks should be one factor in calculating the magnitude of the QP. These methods fail to economize the bit cost and in some sectors of a video scene waste bits without significantly improving video quality. Further, some of the methods are not appropriate for one-pass video coding. And several of the methods use uncommon means for measuring the texture of macroblocks. This complicates the design of hardware encoders and is difficult to implement in programmable LSI chips.
Patent references directed to adaptive quantization do not describe a satisfactory method that saves bit cost and is easy to implement.
U.S. Pat. No. 4,710,812 to Murakami et al., entitled “Interframe Adaptive Vector Quantization Encoding Apparatus and Video Encoding Transmission Apparatus,” and U.S. Pat. No. 5,861,923 to Yoon, entitled “Video Signal Encoding Method and Apparatus Based on Adaptive Quantization Technique,” for example, do not take into account the number of bits required by each class of macroblock. The methods in these two references can easily produce drops in image quality, for example, by reducing the quantization step size in flat macroblocks. If there are no high-textured macroblocks and there are many flat macroblocks, the flat macroblocks will consume many bits, and then few bits will be left over for the medium-textured macroblocks, thereby producing noticeable quantization noise and coding artifacts.
U.S. Pat. No. 5,481,309 to Juri et al., entitled “Video Signal Bit Rate Reduction Apparatus Having Adaptive Quantization,” and U.S. Pat. No. 5,231,484 to Gonzales et al., entitled “Motion Video Compression System With Adaptive Bit Allocation and Quantization,” both suggest adapting the quantization step size for sectors of a video scene that are less sensitive to the human eye. But in these two references, video quality can be lost because the methods do not adapt the QP based on prediction error energy.
U.S. Pat. No. 5,990,957 to Ryoo, entitled “Video Signal Bit Amount Control Using Adaptive Quantization,” is directed to a method in which the QP is adapted according to some aspects of human visual sensitivity. But the technique requires a pre-analysis and is not suitable for one-pass encoders.
The present invention is directed to a one-pass method for digitally encoding a video signal using an adaptive quantization technique that optimizes perceptual video quality while conserving bits. The present invention could be applied to any video coding application in which quantization can be modified within a video frame.
The present invention encodes a video frame by increasing quantization in sectors of the video frame where quantization noise and coding artifacts are less noticeable to the human visual system and decreasing quantization in sectors where quantization noise and coding artifacts are more noticeable to the human visual system. Surplus bits obtained from increasing quantization are preferably used to perform the step of decreasing quantization in flat sectors. In a preferred embodiment, uniform quantization is maintained if increasing quantization and decreasing quantization would require more bits than maintaining the uniform quantization.
In another variation, the present invention predicts whether there are sufficient busy sectors to make adaptive quantization of a particular video frame effective by determining whether the number of bits that would be required to encode the flat sectors using a decreased quantization parameter could be supplied by the predicted surplus bits provided by encoding all busy sectors of the video frame using an increased quantization parameter.
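The bit-budget test summarized above can be sketched as follows, by way of example only. The sector records, the function names `should_adapt` and `toy_bits`, the QP offset `delta`, and in particular the rate model (bits taken as proportional to a sector's prediction error energy divided by the QP) are hypothetical placeholders; this summary does not specify a rate model, and an encoder would substitute its own bit estimate.

```python
def toy_bits(sector, qp):
    """Placeholder rate model: estimated bits fall as the QP rises."""
    return sector["energy"] / qp


def should_adapt(sectors, estimate_bits, base_qp, delta=2):
    """Return True if the predicted surplus from busy sectors covers flat sectors.

    surplus: bits saved by coding busy sectors at base_qp + delta.
    extra:   bits needed to code flat sectors at base_qp - delta.
    If the surplus cannot pay for the extra bits, uniform quantization is kept.
    """
    surplus = sum(estimate_bits(s, base_qp) - estimate_bits(s, base_qp + delta)
                  for s in sectors if s["kind"] == "busy")
    extra = sum(estimate_bits(s, base_qp - delta) - estimate_bits(s, base_qp)
                for s in sectors if s["kind"] == "flat")
    return surplus >= extra


# Hypothetical frames described by sector class and prediction error energy.
mixed_frame = [{"kind": "busy", "energy": 2400.0},
               {"kind": "flat", "energy": 240.0}]
flat_only_frame = [{"kind": "flat", "energy": 240.0},
                   {"kind": "flat", "energy": 240.0}]

print(should_adapt(mixed_frame, toy_bits, base_qp=10))      # busy sector funds the flat one
print(should_adapt(flat_only_frame, toy_bits, base_qp=10))  # no busy sectors, no surplus
```

Passing the rate model in as a function keeps the decision logic separate from whatever bit estimate a particular encoder uses.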
The foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention, taken in conjunction with the accompanying drawings.