The present invention is directed to methods and systems for processing digital image signals and, more specifically, to post-processing methods and systems that reduce quantization noise and coding artifacts in decoded digital images and videos.
An image or video sequence may be converted to electronic signals and stored or transmitted digitally. Encoding an image usually includes estimating approximate visual attributes for small blocks of the image. Encoding also usually includes a data compression (“quantization”) scheme so that the image may be stored and transmitted using as few bits as possible. The encoding approximations and quantization may degrade the visual quality of the decoded image, so a compromise is usually sought between image quality and bit requirements.
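By way of illustration only, the trade-off between bit requirements and reconstruction error may be sketched with a uniform scalar quantizer. The uniform quantizer, the function names `quantize` and `dequantize`, and the sample values are assumptions made for this example and are not taken from any particular coding standard.

```python
def quantize(values, step):
    """Map each value to the index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct approximate values from quantization indices."""
    return [i * step for i in indices]


# Hypothetical sample values standing in for image data or coefficients.
samples = [12.3, -4.7, 0.9, 30.2]

for step in (1, 8):
    indices = quantize(samples, step)
    reconstructed = dequantize(indices, step)
    worst = max(abs(a - b) for a, b in zip(samples, reconstructed))
    # A larger step yields smaller indices (fewer bits) but a larger error.
    print(step, indices, reconstructed, round(worst, 2))
```

In this sketch the reconstruction error of each sample is bounded by half the step size, so a coarser step reduces the range of indices to be stored at the cost of a larger worst-case error.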
The Joint Photographic Experts Group (“JPEG”) and the Moving Picture Experts Group (“MPEG”) have established standards for the digital representation of images and videos. The MPEG standards, for instance, specify only the bit stream syntax for decoding. This leaves flexibility for the encoder design, which may use standard quantization schemes. An external processor may be used to reduce quantization noise and coding artifacts from a digital signal decompressed by a decoder (“post-processing”). The post-processor is typically a filter that refines and improves displayed images without altering existing quantization schemes, which may be designed into the hardware.
Post-processing systems may now be integrated into new designs for systems that use new quantization standards. Existing quantization schemes, however, may be enhanced by using post-processing filters that can be added to existing systems.
Early digital image compression techniques sought to transmit an image at the lowest possible bit rate and yet reconstruct the image with a minimum loss of perceived quality. These early attempts used information theory to minimize the mean squared error between the original and reconstructed images (the minimum mean squared error, or “MMSE,” criterion). But the human visual system (“HVS”) does not perceive quality in the MMSE sense, and the classical coding theory of MMSE did not necessarily yield results pleasing to the HVS. Classical MMSE theory likewise did not yield pleasing results when applied to moving video scenes.
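As a minimal sketch of why the mean squared error may be a poor proxy for perceived quality, the following example computes the same error value for two differently distributed distortions. The function name `mse` and the 4×4 test images are hypothetical and chosen only for illustration.

```python
def mse(original, decoded):
    """Mean squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            total += (o - d) ** 2
            count += 1
    return total / count


# A flat 4x4 image with intensity 100 everywhere.
original = [[100] * 4 for _ in range(4)]

# Distortion A: every pixel is off by one intensity level.
uniform_error = [[101] * 4 for _ in range(4)]

# Distortion B: a single pixel is off by four levels, the rest are exact.
single_outlier = [row[:] for row in original]
single_outlier[1][2] = 104

print(mse(original, uniform_error))   # 1.0
print(mse(original, single_outlier))  # 1.0
```

Both decoded images have the same mean squared error, although a uniform shift of one level and a single outlying pixel in a flat area would not be expected to look equally objectionable to a viewer.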
For certain wavelengths, the human eye can see a single photon of light in a dark room. This sensitivity of the HVS also applies to quantization noise and coding artifacts within video scenes. The sensitivity of the HVS changes from one part of a video image to another. For example, human sensitivity to quantization noise and coding artifacts is less in the very bright and very dark areas of a video scene (contrast sensitivity). In busy image areas containing high texture or having large contrast or signal variance, the sensitivity of the HVS to distortion decreases. In these busy areas, the quantization noise and coding artifacts get lost in complex patterns. This is known as a masking effect. In smooth parts of an image with low variation, human sensitivity to contrast and distortion increases. For instance, a single fleck of pepper is immediately noticeable and out of place in a container of salt. Likewise, a single contrasting pixel out of place near a strong visual edge in an image may be noticeable and annoying.
The local variance of a video signal is often noticeable to the HVS on a very small scale: from pixel to pixel or from macroblock to macroblock. This means that filters that remove quantization noise and coding artifacts must filter digital data on a scale representing very small blocks of an image. Filters that remove ringing artifacts and sharpen edges of a visual image must perform calculations and operate on each macroblock or, ideally, on each pixel in a digital image.
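One possible way to measure local variance on a per-pixel scale is sketched below. The 3×3 window, the clipping of the window at the image border, and the function name `local_variance` are assumptions made for this example; they are not prescribed by the discussion above.

```python
def local_variance(img, radius=1):
    """Variance of the intensities in a (2*radius+1)-wide window around each
    pixel. The window is clipped at the image border (an assumption)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(window) / len(window)
            out[y][x] = sum((v - mean) ** 2 for v in window) / len(window)
    return out


# Hypothetical 4x6 image: a flat left half and a checkerboard right half.
img = [[50] * 3 + [(0 if (x + y) % 2 else 100) for x in range(3)]
       for y in range(4)]

var = local_variance(img)
print(var[0][0])  # flat area: variance is zero
print(var[1][4])  # busy area: variance is large
```

A map of this kind could be used to distinguish smooth areas, where distortion is more visible, from busy areas, where distortion tends to be masked.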
In order to convert an image to a digital representation suitable for quantization, an encoder must divide an image into small blocks, each having visual characteristics (such as detail complexity, brightness, and contrast with neighboring blocks) that may be expressed quantitatively for numerical processing. An image is usually partitioned into nonoverlapping blocks having 8 pixels on each side. One scheme for creating a digital representation of an image is discrete cosine transformation (“DCT”). DCT is one method of numerically approximating the visual attributes of an 8×8 pixel block of an image. Partitioning an image into small blocks before applying DCT (“block DCT”) reduces computational intensity and memory requirements. This simplifies hardware design. Accordingly, many of the image compression standards available use block DCT coding.
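The block DCT described above may be sketched as follows. This is an illustrative example only: the orthonormal scaling of the DCT, the helper names `dct_1d`, `dct_2d`, and `split_blocks`, and the requirement that the image dimensions be multiples of 8 are assumptions, and a practical codec would use a fast transform rather than this direct summation.

```python
import math

N = 8  # block size in pixels


def dct_1d(v):
    """Orthonormal DCT-II of a sequence, computed by direct summation."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def split_blocks(img, n=N):
    """Yield ((row, col), block) for nonoverlapping n-by-n blocks.
    Assumes the image height and width are multiples of n."""
    for y in range(0, len(img), n):
        for x in range(0, len(img[0]), n):
            yield (y, x), [row[x:x + n] for row in img[y:y + n]]


# Hypothetical 16x16 image of constant intensity 128.
image = [[128] * 16 for _ in range(16)]
blocks = list(split_blocks(image))
coeffs = dct_2d(blocks[0][1])
print(len(blocks))             # number of 8x8 blocks
print(round(coeffs[0][0], 6))  # DC coefficient of the first block
```

For a flat block, all of the energy falls into the single DC coefficient and the remaining 63 coefficients are zero up to rounding error, which indicates why smooth image areas may be represented with very few bits.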
For low bit rate digital storage and transmission of an image, quantization noise and coding artifacts may appear in the displayed image due to the approximations introduced by block DCT and a quantization scheme, if present. In moving video scenes, these artifacts appear as run-time snow and as dirty uncovered backgrounds. Significant artifacts among frames can result in run-time flicker if they are repetitive.
The objectionable artifacts that occur when pictures are coded at low bit rates are color bleeding, blurriness, blockiness, and ringing. Color bleeding is specific to strong chrominance edges. Blurriness is the result of loss of spatial detail in medium-textured and high-textured areas. The two most prevalent coding artifacts are blockiness, which results in loss of edge sharpness, and ringing, an intermittent distortion near visual object boundaries.
Blockiness is the artifact related to the appearance of the 8×8 DCT grid caused by coarse quantization in low-detail areas. This sometimes causes pixelation of straight lines and a loss of edge sharpness. Blockiness occurs when adjacent blocks in an image are processed separately from each other, and coding approximations assigned to each block cause a visual contrast between neighboring blocks that had visual continuity in the original image. For instance, if neighboring blocks lie in an area of the image where intensity is changing, the decoded intensity assigned to each block may not capture the original intensity gradient.
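The loss of an intensity gradient across a block boundary may be illustrated in one dimension. The sketch below assumes an extreme case in which quantization is so coarse that only the average (DC) value of each 8-sample block survives; the function name `dc_only` and the 16-sample ramp are hypothetical.

```python
def dc_only(signal, n=8):
    """Replace each nonoverlapping n-sample block by its mean value,
    mimicking quantization that discards every non-DC coefficient."""
    out = []
    for start in range(0, len(signal), n):
        block = signal[start:start + n]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out


# A smooth intensity ramp spanning two adjacent 8-sample blocks.
ramp = list(range(16))
decoded = dc_only(ramp)

# Differences between neighboring decoded samples.
steps = [b - a for a, b in zip(decoded, decoded[1:])]
print(decoded)
print(steps)
```

In the original ramp every neighboring pair differs by one level. After decoding, the gradient inside each block is lost and the entire change is concentrated in a single jump of eight levels at the block boundary, which is the visual discontinuity perceived as blockiness.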
Ringing (also referred to as mosquito noise) occurs at edges on flat backgrounds where high frequencies are poorly quantized. Accordingly, ringing is usually associated with sharp image boundaries, such as text against a uniform background or computer graphics. Block DCT systems with coarse quantization are typically ineffective when coding sharp visual edges, so decompressed images usually have distortion at these edges. Known de-ringing filters blur the true edges when they attempt to remove the ringing artifacts.
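Ringing near a sharp edge may likewise be illustrated in one dimension. In the sketch below, discarding the four highest-frequency coefficients of an 8-sample block stands in for poor quantization of high frequencies; this truncation, the orthonormal DCT scaling, and the names `dct_1d` and `idct_1d` are assumptions made for the example.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II by direct summation."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append((math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)) * s)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = math.sqrt(1 / n) * c[0]
        s += sum(math.sqrt(2 / n) * c[k]
                 * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                 for k in range(1, n))
        out.append(s)
    return out


# A sharp edge on a flat background within one 8-sample block.
edge = [0, 0, 0, 0, 100, 100, 100, 100]

coeffs = dct_1d(edge)
kept = coeffs[:4] + [0.0] * 4  # discard the four highest frequencies
decoded = idct_1d(kept)

print([round(v, 1) for v in decoded])
```

The decoded samples undershoot the dark level on one side of the edge and overshoot the bright level on the other, so the flat areas adjacent to the edge are no longer flat. This oscillation is the one-dimensional counterpart of the ringing described above.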
The majority of the image (JPEG) and video (MPEG) coding standards are based on the use of block DCT. One serious drawback of compressing images using these standards is the blockiness and ringing artifacts that occur in a decoded image.
Edge sharpening by strengthening the true visual edges in an image is a common post-processing method used to remove artifacts. U.S. Pat. No. 5,822,467 to Lopez et al., entitled “Sharpening Filter for Images with Automatic Adaptation to Image Type” is directed to a filtering method that uses Laplacian filter coefficients and normalization divisors. U.S. Pat. No. 5,818,972 to Girod et al., entitled “Method and Apparatus For Enhancing Images Using Helper Signals” is directed to a filtering method that uses helper signal architecture and nonlinear characteristic functions. U.S. Pat. No. 5,757,977 to Mancuzo et al., entitled “Fuzzy Logic Filter for Reducing Noise and Sharpening Edges of Digital Image Signals” is directed to a filter that reduces noise using a fuzzy logic comparison unit in two noise reduction circuits.
De-ringing is another common post-processing method used to remove artifacts. U.S. Pat. No. 5,819,035 to Devaney et al., entitled “Post-filter for Removing Ringing Artifacts of DCT Coding” is directed to a post-filter that performs anisotropic diffusion filtering on decoded data. U.S. Pat. No. 5,850,294 to Apostolopoulos et al., entitled “Method and Apparatus for Post-processing Images” is directed to reducing visual artifacts through separate detection, mapping, and smoothing functions. The Apostolopoulos method employs DCT-domain detection rather than edge detection in the pixel domain. U.S. Pat. No. 5,883,983 to Yee et al., entitled “Adaptive Post-processing System for Reducing Blocking Effects and Ringing Noise in Decompressed Image Signals” is directed to an adaptive filtering method that uses a binary edge map and filters using various weighting factors to generate the filtered pixel value. One of the drawbacks of known de-ringing filters is that they can result in blurring of the true edges while removing the ringing artifacts.