This invention generally relates to improving video and graphics quality.
MPEG compression is a widely used algorithm for digital video signal transmission and storage. MPEG encoded and decoded streams of video can be used in various applications including cable television, satellite television, and digital video discs (DVD).
The content of a video signal generally comprises a sequence of image frames for a progressive video sequence and image fields for an interlaced video sequence. Each frame/field consists of a rectangular spatial area of pixels. When video content is encoded using MPEG, generally an 8×8 window of pixels (64 pixels) in an image frame of the video signal is processed as follows. First, a Discrete Cosine Transform (DCT) is applied to the window to generate a 2-D spatial spectrum representation of the 8×8 window. This 2-D spatial spectrum is often referred to as a Fourier image, as it is a representation of the image in the Fourier domain. The Fourier image also has 64 pixels. The pixel values in the Fourier image represent a DC component and various frequencies or AC components. The DC component is generally situated in a top left corner pixel of the Fourier image. The other 63 pixels in the Fourier image represent the AC components. After generating the Fourier image, an MPEG encoder quantization is applied so that all the 64 pixels in the Fourier image are quantized.
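The transform and quantization steps described above can be illustrated with a minimal sketch. It assumes an orthonormal 2-D DCT-II and a single uniform quantizer step; the names `dct_2d` and `quantize` and the step value of 16 are hypothetical choices made for illustration, and an actual MPEG encoder uses quantization matrices and a quantizer scale rather than one step size.

```python
import math

N = 8  # window size used in the text (8x8 = 64 pixels)

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, step=16):
    """Uniform quantization with one assumed step size for all 64 coefficients."""
    return [[int(round(value / step)) for value in row] for row in coeffs]

# A perfectly flat window: all energy lands in the top left (DC) coefficient.
flat_block = [[128] * N for _ in range(N)]
levels = quantize(dct_2d(flat_block))
```

For the flat window, `levels[0][0]` holds the quantized DC component and the remaining 63 entries, the AC components, are zero.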
The MPEG-2 standard provides for interframe compression. In a video sequence, neighboring image frames are grouped into one or more Groups of Pictures (“GOP”). In a GOP, one image frame, namely the I-frame, is encoded spatially. For the other frames, differences are encoded. There are two types of frames where differences are encoded: the P-frame and the B-frame. For a P-frame, the difference between the current frame and an I-frame modified by motion vectors is spatially encoded. For a B-frame, the difference between the current frame and a weighted sum of an I-frame and a P-frame, or of two P-frames, each modified by motion vectors, is spatially encoded. “Modified by motion vectors” means that the currently encoded P- (or B-) frame is split into 16×16 pixel squares, and for each square the best matching square in the reference frame, located at some spatial offset, is searched for. The search is confined to a local area. The spatial offset (vertical and horizontal) of the best matching block is stored in the MPEG stream and is called the motion vector. Each 16×16 block of a P-frame has one motion vector, and each 16×16 block of a B-frame has two motion vectors. MPEG compression for an interlaced signal processes fields instead of frames.
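The block search described above can be sketched as follows. The text does not say how the "best match" is measured, so the sum of absolute differences (SAD) used here is an assumption, as are the helper names `sad` and `find_motion_vector`, the search range of ±4 pixels, and the synthetic test frames.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between a block of cur and a block of ref."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total

def find_motion_vector(cur, ref, top, left, size=16, search=4):
    """Exhaustive search in a local area; returns (dy, dx, cost) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = top + dy, left + dx
            # Skip candidate blocks that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(cur, ref, top, left, ry, rx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]

def pattern(y, x):
    """Synthetic texture used only to build the illustrative frames."""
    return (x * x + 3 * y * y + x * y) % 251

# The current frame shows the reference content displaced by 2 rows and -1 column.
ref = [[pattern(y, x) for x in range(32)] for y in range(32)]
cur = [[pattern(y + 2, x - 1) for x in range(32)] for y in range(32)]

motion = find_motion_vector(cur, ref, top=8, left=8)
```

Here `motion` holds the vertical offset, the horizontal offset, and the matching cost for the 16×16 block whose top left corner is at row 8, column 8 of the current frame. An exhaustive search is the simplest possible strategy; practical encoders typically use faster search patterns.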
When an image is decoded using the MPEG and MPEG-2 standards, the image frame is converted back from the Fourier domain to the spatial domain. The encoding and decoding of image frames using MPEG compression causes artifacts to appear in a processed image frame. The compression ratio, i.e., the bit rate used in the MPEG encoding and decoding, defines the level and behavior of the artifacts. That is, the visually perceived effect of the artifacts is a function of the bit rate.
There are various types of artifacts which can appear in image frames. When low bit rates are used, for example, less than 2 Megabits per second (Mbits/sec), the most perceivable are blocking artifacts, which appear in certain areas of the image frame, particularly textured areas and edge or line areas. These blocking artifacts are often visually perceived as an image or object that appears to be divided into blocks. Pixel values inside the image frame are affected, introducing artifacts which appear as abrupt transitions between neighboring windows or blocks of the image frame. These abrupt transitions are generally aligned vertically and horizontally in the image frame, which makes the artifact very perceivable. In edge (line) areas, blocking artifacts are caused by transitions between neighboring windows or blocks in the presence of natural edges and lines, for instance, tree branches, wires, or edges between objects. In these edge areas, additional jaggedness is visually perceived. Increasing the bit rate to an intermediate or high level, for instance, higher than 2 Mbits/sec, can effectively reduce the blocking artifacts occurring at low bit rates. At low bit rates, other artifacts, such as mosquito noise and the flat area blocking artifact, can also occur.
For bit rates greater than 2 Mbits/sec, the blocking artifact is less perceivable. The main artifact appearing at these and higher bit rates is mosquito noise. Mosquito noise is a high frequency pattern that appears inside a window or block of pixels, particularly in a more or less flat area that has a strong edge, or any other large transition between pixel values, in its neighborhood. The mosquito noise appears as a small checkerboard mixed with a delta-impulse pattern that is clearly visible in areas within the window. Mosquito noise becomes visible due to the uniform spatial distribution of quantization noise in blocks which contain generally smooth areas in the presence of strong edges. The mosquito noise is perceptually visible in the smooth areas. Pure vertical and horizontal intrablock ringing is one type of mosquito noise. Here, mosquito noise appears close to vertical and horizontal edges in the image frame. The mosquito noise caused by pure vertical or horizontal edges is less severe than that caused by diagonal structures, but is still visible as vertical and horizontal ringing of the edges.
Also, in flat or smooth areas of the image frame, a flat area (DC) blocking artifact is perceptually visible at intermediate and high bit rates (greater than about 2 Mbits/sec). The flat area blocking artifact is caused by the quantized block essentially containing only one DC component, i.e. values of the pixels of the decoded block are the same. Perceptually, the smooth flat area appears as tiled 8×8 squares having close but different values. Thus, there is a distinguishable blocking pattern with smooth areas inside the blocks and rectangular transitions between neighboring blocks. The transitions are clearly visible because the transitions are generally aligned vertically and horizontally.
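The tiling effect described above can be reproduced with a small numeric sketch. It assumes an orthonormal 8×8 DCT, for which the DC coefficient equals 8 times the block mean, and it assumes that every AC coefficient has been quantized to zero. The helper name `dc_only_decode`, the step size of 16, and the ramp values are hypothetical.

```python
N = 8  # block size used in the text

def dc_only_decode(block, step=16):
    """Decode a block whose AC coefficients are all assumed to be zero."""
    mean = sum(sum(row) for row in block) / (N * N)
    dc = N * mean                     # DC coefficient of the orthonormal 2-D DCT
    dc_hat = round(dc / step) * step  # quantize, then dequantize
    value = dc_hat / N                # inverse DCT of a DC-only block is constant
    return [[value] * N for _ in range(N)]

# A smooth horizontal ramp spanning two neighboring 8x8 blocks.
ramp = [[100 + 0.5 * x for x in range(2 * N)] for _ in range(N)]
left = dc_only_decode([row[:N] for row in ramp])
right = dc_only_decode([row[N:] for row in ramp])

# Jump in pixel value at the boundary between the two decoded blocks.
boundary_jump = right[0][0] - left[0][N - 1]
```

In the original ramp, neighboring pixels differ by 0.5. After decoding, each block is constant and the whole change is concentrated in `boundary_jump` at the block boundary, which is the vertically and horizontally aligned transition the text describes.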
In modern video processing applications, the source of a video signal is often unknown. The video signal may be digital or analog, and it could be transmitted from a DVD player, a cable television source, a satellite, or a montage of images from different sources. For instance, the video signal may be a combination from several analog and digital sources. Thus, any technique for artifact reduction needs to perform effectively independent of any knowledge about the source of the video signal, including any knowledge about window or block boundaries in an image frame or video signal. Such knowledge might include information about edges, texture information, and other information. If such knowledge were required, MPEG artifact reduction techniques would be unnecessarily complex, hardware-intensive, and time-consuming.
Video sequences can also be affected by additive Gaussian noise in the channel, independent of MPEG artifacts.
Therefore, what is needed is a technique for reducing artifacts occurring at intermediate and higher bit rates in the context of MPEG compression that is effective without knowledge about block boundaries or other information as to the content of the image frames in the video signal, with or without the presence of Gaussian noise.