A video sequence consists of a number of pictures, usually called frames. Successive frames are typically very similar and therefore contain a lot of redundancy from one frame to the next. Before being transmitted over a channel or stored in memory, video data is compressed to conserve both bandwidth and memory. The goal is to remove the redundancy to gain better compression ratios. A first video compression approach is to subtract a reference frame from a given frame to generate a relative difference. The relative difference contains less information than the original frame, so it can be encoded at a lower bit-rate with the same quality. The decoder reconstructs the original frame by adding the relative difference to the reference frame.
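The subtraction approach can be illustrated with a minimal sketch. The function names, the list-of-lists frame layout, and the sample pixel values are assumptions made for illustration only; a real codec would also transform, quantize, and entropy-code the difference.

```python
def encode_difference(frame, reference):
    """Return the per-pixel relative difference (frame minus reference)."""
    return [[p - r for p, r in zip(frame_row, ref_row)]
            for frame_row, ref_row in zip(frame, reference)]


def decode_difference(difference, reference):
    """Reconstruct the frame by adding the difference back to the reference."""
    return [[d + r for d, r in zip(diff_row, ref_row)]
            for diff_row, ref_row in zip(difference, reference)]


# Hypothetical 2x3 luminance frames that differ in only two pixels.
reference = [[10, 10, 10], [20, 20, 20]]
frame = [[10, 11, 10], [20, 20, 22]]

difference = encode_difference(frame, reference)
reconstructed = decode_difference(difference, reference)
```

Because the two frames are nearly identical, the difference is mostly zeros, which is what makes it cheaper to encode than the frame itself.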
A more sophisticated approach is to approximate the motion of the whole scene and of the objects in a video sequence. The motion is described by parameters that are encoded in the bit-stream. Pixels of the predicted frame are approximated by appropriately translated pixels of the reference frame. This approach provides better prediction than simple subtraction. However, the bit-rate occupied by the parameters of the motion model must not become too large.
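One common way to obtain translational motion parameters is block matching, sketched below under stated assumptions. The exhaustive search, the sum-of-absolute-differences (SAD) cost, the function names, and the tiny test frames are all illustrative choices and are not taken from the text above.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def estimate_motion(current, reference, top, left, size, search):
    """Find the (dy, dx) offset into the reference that best predicts a block."""
    height, width = len(reference), len(reference[0])
    target = get_block(current, top, left, size)
    best_vector = (0, 0)
    best_cost = sad(target, get_block(reference, top, left, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, get_block(reference, y, x, size))
            if cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


def make_frame(size, top, left):
    """Build a black frame containing one bright 2x2 patch (test data only)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 9
    return frame


reference = make_frame(6, 1, 1)   # patch at row 1, column 1
current = make_frame(6, 2, 3)     # same patch moved down 1 and right 2
vector, cost = estimate_motion(current, reference, top=2, left=3, size=2, search=2)
```

Here the only parameters that would need to be sent for the block are the two components of `vector`, since the translated reference block predicts the current block exactly.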
In general, video compression is performed according to one of many standards, including the standards for audio and video compression from the Moving Picture Experts Group (MPEG), such as MPEG-1, MPEG-2, and MPEG-4. Additional enhancements have been made as part of the MPEG-4 Part 10 standard, also referred to as H.264 or AVC (Advanced Video Coding). Under the MPEG standards, video data is first encoded (e.g., compressed) and then stored in an encoder buffer on an encoder side of a video system. Later, the encoded data is transmitted to a decoder side of the video system, where it is stored in a decoder buffer before being decoded so that the corresponding pictures can be viewed.
MPEG is used for the generic coding of moving pictures and associated audio and creates a compressed video bit-stream made up of a series of three types of encoded data frames. The three types of data frames are an intra frame (called an I-frame or I-picture), a bi-directional predicted frame (called a B-frame or B-picture), and a forward predicted frame (called a P-frame or P-picture). These three types of frames can be arranged in a specified order called the GOP (Group Of Pictures) structure. I-frames contain all the information needed to reconstruct a picture. The I-frame is encoded as a normal image without motion compensation. On the other hand, P-frames use information from previous frames, and B-frames use information from a previous frame, a subsequent frame, or both to reconstruct a picture. Specifically, P-frames are predicted from a preceding I-frame or the immediately preceding P-frame.
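The frame dependencies in a GOP can be sketched as follows. The parameter names `gop_length` and `anchor_spacing`, the display-order layout, and the choice of 12 and 3 are assumptions for illustration; actual encoders may choose other structures.

```python
def gop_pattern(gop_length, anchor_spacing):
    """Return frame types in display order, e.g. 'IBBPBBPBBPBB'."""
    types = []
    for index in range(gop_length):
        if index == 0:
            types.append("I")          # intra frame starts the GOP
        elif index % anchor_spacing == 0:
            types.append("P")          # forward predicted anchor frame
        else:
            types.append("B")          # bi-directionally predicted frame
    return "".join(types)


def reference_indices(pattern, index):
    """List the anchor frames (I or P) that the frame at `index` predicts from."""
    kind = pattern[index]
    if kind == "I":
        return []
    previous = max(i for i in range(index) if pattern[i] in "IP")
    if kind == "P":
        return [previous]
    # B-frame: previous anchor plus the next anchor, if one exists in this GOP.
    # Assumption: a trailing B-frame would normally use the next GOP's I-frame,
    # which this sketch does not model.
    following = [i for i in range(index + 1, len(pattern)) if pattern[i] in "IP"]
    return [previous] + following[:1]


pattern = gop_pattern(gop_length=12, anchor_spacing=3)
p_refs = reference_indices(pattern, 3)   # first P-frame
b_refs = reference_indices(pattern, 4)   # B-frame between two P-frames
```

In this sketch the first P-frame depends only on the I-frame, while the B-frame at index 4 depends on the anchors on either side of it.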
Besides the MPEG standards, JPEG is used for the generic coding of still pictures. Since the encoding of a still picture can be considered equivalent to the encoding of an I-frame in video, no introduction of JPEG will be provided here. There are some other proprietary methods for image/video compression. Most of them adopt technologies similar to those of MPEG and JPEG. Basically, each picture is separated into one luminance channel (Y) and two chrominance channels (also called color difference signals Cb and Cr). Blocks of the luminance and chrominance arrays are organized into “macroblocks,” which are the basic unit of coding within a frame. Block-based transformation and quantization of the transform coefficients are used to achieve high compression efficiency.
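A block transform followed by quantization can be sketched as below. The orthonormal 2-D DCT-II, the 4x4 block size, the single uniform step size, and all function names are assumptions chosen to keep the example short; the standards named above define their own transforms and quantization tables.

```python
import math


def _alpha(k, n):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def _basis(k, i, n):
    """Cosine basis function k evaluated at sample position i."""
    return math.cos((2 * i + 1) * k * math.pi / (2 * n))


def dct_2d(block):
    """Forward 2-D DCT of a square block given as a list of rows."""
    n = len(block)
    return [[_alpha(u, n) * _alpha(v, n) *
             sum(block[y][x] * _basis(u, y, n) * _basis(v, x, n)
                 for y in range(n) for x in range(n))
             for v in range(n)] for u in range(n)]


def idct_2d(coeffs):
    """Inverse 2-D DCT, returning the block in the pixel domain."""
    n = len(coeffs)
    return [[sum(_alpha(u, n) * _alpha(v, n) * coeffs[u][v] *
                 _basis(u, y, n) * _basis(v, x, n)
                 for u in range(n) for v in range(n))
             for x in range(n)] for y in range(n)]


def quantize(coeffs, step):
    """Map each coefficient to an integer level; this is the lossy step."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Scale integer levels back to approximate coefficient values."""
    return [[level * step for level in row] for row in levels]


flat_block = [[8] * 4 for _ in range(4)]      # hypothetical uniform 4x4 block
levels = quantize(dct_2d(flat_block), step=4)
restored = idct_2d(dequantize(levels, step=4))
```

For a uniform block, all of the energy lands in the single top-left (DC) coefficient, so only one nonzero level has to be coded. With textured blocks and a coarse step, many small coefficients round to zero, which saves bits but discards detail.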
Since quantization is a lossy process, the combination of block-based transform and quantization can generate perceptually annoying artifacts such as ringing artifacts and blocking artifacts. Since coding artifact reduction is fundamental to many image processing applications, it has been investigated for many years, and many post-processing methods have been proposed. In general, most methods focus on either blocking artifact reduction or ringing artifact reduction. Although some methods show good results in selected applications, the resulting quality is not high enough for new digital HDTVs. As a result, either the artifacts are still visible or the texture detail is blurred.
Recently, more and more families are replacing their old CRT televisions with large-screen, high-definition LCD or plasma televisions. While the new technologies provide a better experience with higher resolution and more detail, they also reveal artifacts and noise more clearly if the content is not of sufficiently high quality. For instance, displaying a YouTube® video clip on an HDTV will show very ugly coding artifacts.