1. Field of the Invention
The present disclosure relates to systems and methods for processing digital image signals. More particularly, the invention relates to a system and method that improves image quality by reducing the harshness of distortions in compressed digital image signals.
2. Discussion of the Related Art
A digital image signal generally contains information associated with a plurality of picture elements, e.g., pixels. Digital images typically contain large amounts of information (e.g., color and brightness information related to each of the plurality of pixels) needed to reproduce the image. As a result, data compression is often implemented to reduce the amount of memory that images require for processing and storage. Data compression is important not just for long-term digital storage of an image but also for permitting reasonable data transfer rates between network-connected devices.
JPEG is a standardized image compression mechanism. JPEG stands for Joint Photographic Experts Group, the original name of the committee that wrote the standard. JPEG is designed for compressing either full-color or gray-scale digital images of “natural,” real-world scenes. JPEG compression does not work very well on non-realistic images, such as cartoons or line drawings. JPEG compression does not handle black-and-white (1-bit-per-pixel) images, nor does it handle motion picture compression. Related standards for compressing those types of images exist, and are called JBIG and MPEG respectively. Regular JPEG is “lossy,” meaning that the image you get out of decompression is not identical to what you originally put in. The algorithm achieves much of its compression by exploiting known limitations of the human eye, notably the fact that small color variations are not perceived as well as small variations in brightness.
The JPEG compression process is a multi-parameter compression process. By adjusting the parameters, you can trade off compressed image size against reconstructed image quality over a very wide range. In general, the baseline JPEG compression process performs the following steps:
    1. Transform the image into a suitable color space. This is a no-op for grayscale images. For color images, RGB information is transformed into a luminance/chrominance color space (e.g., YCbCr, YUV, etc.). The luminance component is grayscale and the other two axes are color information.
    2. (Optional) Down sample each component by averaging together groups of pixels. The luminance component is left at full resolution, while the chroma components are often reduced 2:1 horizontally and either 2:1 or 1:1 (no change) vertically. In JPEG, these alternatives are usually called 2h2v and 2h1v sampling, but you may also see the terms “411” and “422” sampling. This step immediately reduces the data volume by one-half or one-third. In numerical terms it is highly lossy, but for most images it has almost no impact on perceived quality, because of the eye's poorer resolution for chroma information. Note that down sampling is not applicable to grayscale data; this is one reason color images are more compressible than grayscale.
    3. Group the pixel values for each component into 8×8 blocks. Transform each 8×8 block through a discrete cosine transform (DCT). The DCT is a relative of the Fourier transform and likewise gives a frequency map, with 8×8 components. Thus you now have numbers representing the average value in each block and successively higher-frequency changes within the block. The motivation for doing this is that you can now throw away high-frequency information without affecting low-frequency information. (The DCT transform itself is reversible except for round-off error.)
    4. In each block, divide each of the 64 frequency components by a separate “quantization coefficient” and round the results to integers. This is the fundamental information-losing step. The larger the quantization coefficients, the more data is discarded. Note that even the minimum possible quantization coefficient, 1, loses some information, because the exact DCT outputs are typically not integers. Higher frequencies are always quantized less accurately (given larger coefficients) than lower, since they are less visible to the eye. Also, the luminance data is typically quantized more accurately than the chroma data, by using separate 64-element quantization tables.
    5. Encode the reduced coefficients using either Huffman or arithmetic coding.
    6. Tack on appropriate headers, etc., and output the result. In a normal “interchange” JPEG file, all of the compression parameters are included in the headers so that the decompressor can reverse the process. These parameters include the quantization tables and the Huffman coding tables.
(See generally pages 1–2, “Introduction to JPEG,” http://www.faq.org/faqs/compression-faq/part2/section-6.html)
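By way of illustration only, the transform and quantization steps described above (steps 3 and 4) may be sketched as follows. The sketch assumes an orthonormal DCT written out directly from its definition, and the function example_table builds a hypothetical quantization table whose entries simply grow with frequency; it is not a table taken from the JPEG standard. The sample block and all function names are likewise assumptions made for the example.

```python
import math

N = 8  # block size used by baseline JPEG


def _scale(k):
    # Normalization that makes the transform orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(n, k):
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT of an 8x8 block given as a list of 8 rows."""
    return [[_scale(u) * _scale(v)
             * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse 2-D DCT; reverses dct_2d up to floating-point round-off."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v]
                 * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, table):
    """Divide each frequency component by its table entry and round.

    This rounding is the information-losing step.
    """
    return [[round(coeffs[u][v] / table[u][v]) for v in range(N)]
            for u in range(N)]


def dequantize(levels, table):
    return [[levels[u][v] * table[u][v] for v in range(N)] for u in range(N)]


def example_table(step):
    # Hypothetical table (not from the standard): larger divisors at
    # higher frequencies, controlled by a single "step" parameter.
    return [[1 + step * (u + v) for v in range(N)] for u in range(N)]


# Illustrative block: a smooth two-directional gradient.
block = [[16 * x + 2 * y for y in range(N)] for x in range(N)]
coeffs = dct_2d(block)

for step in (1, 8):
    table = example_table(step)
    levels = quantize(coeffs, table)
    restored = idct_2d(dequantize(levels, table))
    kept = sum(1 for row in levels for q in row if q != 0)
    worst = max(abs(restored[x][y] - block[x][y])
                for x in range(N) for y in range(N))
    print(f"step={step}: nonzero levels={kept}, worst pixel error={worst:.2f}")
```

Larger table entries can only reduce the number of nonzero levels passed to the entropy coder of step 5, which is how the quantization tables trade image quality against compressed size.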
A series of digital image signals may be concatenated (i.e., strung together in series) to form a video or video sequence. Consider the case of a video sequence where nothing is moving in the scene. Each frame of the video should be exactly the same as the previous frame. In a digital system, it should be clear that a single frame and a repetition count could represent this video sequence.
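The single-frame-plus-repetition-count representation may be sketched, purely for illustration, as a run-length encoding over whole frames. The function names are hypothetical, and frames are assumed to be comparable values (here, tuples of pixel rows) that are bit-for-bit identical when nothing moves.

```python
def encode_static_runs(frames):
    """Collapse consecutive identical frames into [frame, count] pairs."""
    runs = []
    for frame in frames:
        if runs and runs[-1][0] == frame:
            runs[-1][1] += 1          # same picture again: bump the count
        else:
            runs.append([frame, 1])   # picture changed: start a new run
    return runs


def decode_static_runs(runs):
    """Expand [frame, count] pairs back into the original frame sequence."""
    return [frame for frame, count in runs for _ in range(count)]


# Illustrative 2x2 frames; a real frame would hold far more pixels.
still = ((10, 10), (10, 10))
other = ((10, 99), (10, 10))
video = [still] * 5 + [other] + [still] * 2

runs = encode_static_runs(video)
print([count for _, count in runs])
```

In practice, sensor noise usually prevents consecutive frames from matching exactly, which is one reason an exact-match scheme of this kind is only a starting point.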
Consider now a man walking across the same scene. If information regarding the motion of the man can be extracted from the static background, a great deal of storage space can be saved. This oversimplified case reveals two of the most difficult problems in motion compensation: 1) determining whether an image is stationary; and 2) determining how and what portion of an image to extract for the portion of the image that moves.
These problems are addressed in the Moving Picture Experts Group (MPEG) digital video and audio compression standard. In particular, the standard defines a compressed bit stream, which implicitly defines a decompressor. The most fundamental difference between MPEG and JPEG is MPEG's use of block-based motion-compensated prediction (MCP), a general method which uses a temporal differential pulse code modulation (DPCM) scheme.
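One common way to realize block-based MCP is an exhaustive block-matching search, sketched below for illustration only. The sketch assumes a sum-of-absolute-differences (SAD) cost, a small search window, and frames held as lists of pixel rows; the MPEG standard does not mandate any particular search, and all names and parameter values here are hypothetical. The motion vector (dx, dy) is taken to point from the block's position in the current frame to the matching position in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a shifted
    block of the reference frame."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def best_motion_vector(cur, ref, bx, by, size=4, search=2):
    """Full search over a (2*search+1)^2 window; returns (dx, dy, cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + size <= height
                    and 0 <= bx + dx and bx + dx + size <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def prediction_residual(cur, ref, bx, by, dx, dy, size=4):
    """Temporal DPCM error signal: current block minus its prediction."""
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(size)] for y in range(size)]


# Illustrative frames: the current frame is the reference frame shifted
# one pixel to the right (pure translational motion).
ref = [[x * x + 10 * y * y for x in range(12)] for y in range(12)]
cur = [[ref[y][max(x - 1, 0)] for x in range(12)] for y in range(12)]

dx, dy, cost = best_motion_vector(cur, ref, bx=4, by=4)
print(dx, dy, cost)
```

When the motion is purely translational, as in this example, the residual is zero and only the motion vector needs to be coded for the block.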
Usually, MCP and related block-based error coding techniques perform well when the image can be modeled locally as translational motion. However, when there is complex motion or new imagery, these error coding schemes may perform poorly, and the error signal may be harder to encode than the original signal. In such cases, it is sometimes better to suppress the error-coding scheme and code the original signal itself. It may be determined on a block-by-block basis whether to use an error-coding scheme and code the error signal, or to simply code the original signal. This type of coding is often referred to as inter/intra processing, because the encoder switches between inter-frame and intra-frame processing.
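The block-by-block inter/intra decision may be sketched as follows. The sketch assumes one simple, hypothetical criterion: the energy of the prediction error is compared with the energy of the original block about its mean, and the cheaper signal is coded. Practical encoders may use other cost measures; the function names and sample blocks are illustrative.

```python
def energy_about_mean(values):
    """Energy of a block after removing its mean (a rough proxy for the
    cost of coding the original signal)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def choose_mode(current_block, predicted_block):
    """Return "inter" to code the error signal, "intra" to code the original.

    Both blocks are flat lists of pixel values of equal length.
    """
    error = [c - p for c, p in zip(current_block, predicted_block)]
    inter_cost = sum(e * e for e in error)
    intra_cost = energy_about_mean(current_block)
    return "inter" if inter_cost < intra_cost else "intra"


# Good prediction (translational motion): the error signal is tiny.
print(choose_mode([10, 20, 30, 40], [10, 21, 29, 40]))

# New imagery: the prediction is useless, so the original is cheaper.
print(choose_mode([50, 50, 50, 51], [0, 0, 0, 0]))
```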
Block-based MCP and inter/intra decision-making are the basic temporal processing elements for many conventional video compression standards. Generally, these block-based temporal processing schemes perform well over a wide range of image scenes, enable simpler implementation than other approaches, and interface reasonably well with any block DCT processing of the error signal.
For complex scenes and/or low bit rates, a number of visual artifacts may appear as a result of signal distortion from a compression system. The primary visual artifacts affecting current image compression systems are blocking effects and mosquito noise, i.e., intermittent distortions that often appear near object boundaries. Other artifacts include ripple, contouring, and loss of resolution.
Blocking effects generally result from discontinuities in the reconstructed signal's characteristics across block boundaries for a block-based coding system, e.g., block DCT. Blocking effects are produced because adjacent blocks in an image are processed independently and the resulting independent distortion from block to block causes a lack of continuity between neighboring blocks. The lack of continuity may be in the form of abrupt changes in the signal intensity or signal gradient. In addition, block-type contouring, which is a special case of blocking effect, often arises where the intensity of an image changes slowly.
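Block-type contouring on a slowly changing intensity can be illustrated with the following sketch. It assumes an extreme, hypothetical coder that keeps only the average value of each 8-pixel block, and it compares the mean intensity step across block boundaries with the mean step inside blocks. The function names and the one-row test image are assumptions made for the example.

```python
def keep_block_means(row, block=8):
    """Replace every block of a row by its mean, as if each block were
    coded independently with only its average value retained."""
    out = []
    for start in range(0, len(row), block):
        chunk = row[start:start + block]
        out.extend([sum(chunk) / len(chunk)] * len(chunk))
    return out


def boundary_and_interior_steps(image, block=8):
    """Mean absolute horizontal step across block boundaries and inside
    blocks, returned as (boundary_mean, interior_mean)."""
    boundary, interior = [], []
    for row in image:
        for x in range(len(row) - 1):
            step = abs(row[x + 1] - row[x])
            if (x + 1) % block == 0:
                boundary.append(step)   # pixel pair straddles two blocks
            else:
                interior.append(step)   # pixel pair lies inside one block

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(boundary), mean(interior)


# A slowly changing ramp spanning two 8-pixel blocks.
ramp = [list(range(16))]
coded = [keep_block_means(ramp[0])]

print(boundary_and_interior_steps(ramp))
print(boundary_and_interior_steps(coded))
```

In the original ramp the steps at and away from the boundary are equal, whereas after independent block coding the entire intensity change is concentrated at the block boundary, which is the visible contour.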
Mosquito noise is typically seen when there is a sharp edge, e.g., an edge within a block separating two uniform but distinct regions. Block DCT applications are not effective at representing sharp edges. Accordingly, there is considerable distortion at sharp edges: the reconstructed edges are not as sharp as the original edges and the adjacent regions are not as uniform as they should be. Mosquito noise is especially evident in images containing text or computer graphics.
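The difficulty of representing a sharp edge with a block DCT can be illustrated in one dimension. The sketch below assumes, as a stand-in for coarse quantization, that the high-frequency coefficients of an 8-sample block are simply set to zero; the sample values, the number of coefficients kept, and the function names are all hypothetical.

```python
import math

N = 8  # samples per block


def _scale(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(samples):
    """Orthonormal 8-point DCT of one row of a block."""
    return [_scale(k) * sum(samples[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                            for n in range(N))
            for k in range(N)]


def idct_1d(coeffs):
    """Inverse of dct_1d, up to floating-point round-off."""
    return [sum(_scale(k) * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for k in range(N))
            for n in range(N)]


def reconstruct_with_low_frequencies(samples, keep):
    """Zero every coefficient with index >= keep, then invert the DCT."""
    coeffs = dct_1d(samples)
    truncated = [c if k < keep else 0.0 for k, c in enumerate(coeffs)]
    return idct_1d(truncated)


# A sharp edge separating two uniform regions inside one block.
edge = [0, 0, 0, 255, 255, 255, 255, 255]
approx = reconstruct_with_low_frequencies(edge, keep=4)

print([round(v) for v in approx])
print(max(abs(a - e) for a, e in zip(approx, edge)))
```

The reconstructed samples on either side of the edge are no longer constant, and the edge itself is spread over several samples, which corresponds to the distortion described above.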
Many of the image compression standards available today, e.g., H.261, JPEG, MPEG-1, MPEG-2, and High Definition Television (HDTV), are based on block DCT coding. Thus, most reproduced images may be adversely affected by blocking effects and edge distortion.
In addition to the image artifacts introduced by video signal compression and decompression, today's community antenna television (CATV), digital broadcast satellite (DBS), and digital television (DTV) broadcasters, as well as other deliverers of compressed digital images, are faced with a plethora of end-user consumer electronics solutions for displaying the images. For example, consumer electronics manufacturers are presently offering HDTV, DTV, and analog TV units. Also on the market is a wide range of personal computer (PC) based TV tuner cards that are capable of displaying full HDTV resolutions on appropriate multi-scan monitors. Indeed, multi-scan monitors with TV tuners are being made even larger to accommodate progressive scan signals on monitors that look like traditional TVs.
Digital TVs generally fall into three main categories: integrated high definition sets that include a digital receiver and display; digital set-top boxes designed to work with HD and standard definition (SD) digital displays (and, in some cases, with current analog sets); and DTV-capable displays that, with the addition of a digital set-top box, offer a complete DTV system.
Heretofore, DTV receivers designated for the home theater market generally include a large-screen “digital ready” display and, at extra cost, a separate set-top box that encodes analog TV signals and provides the signals to the DTV receiver. As a result, consumers can watch large, high-quality pictures generated from analog signals now and, later, when more digital programming becomes available, purchase a decoder box to view digitally generated programming at HDTV resolutions.
These decoder boxes will also prolong the life of current analog TVs, as consumers will be able to view digitally generated programming on their old TV sets (i.e., analog black-and-white and/or color TVs). Whether the set-top box is functioning as an encoder or a decoder, both analog TVs and DTVs are adversely affected by the image artifacts introduced by block DCT coding.