Digital video capabilities may be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless communication devices, personal digital assistants (PDAs), laptop computers, desktop computers, digital cameras, digital recording devices, mobile or satellite radio telephones, and the like. Digital video and picture devices can provide significant improvements over conventional analog video and picture systems in creating, modifying, transmitting, storing, recording and playing full motion video sequences and pictures. Video sequences (also referred to as video clips) are composed of a sequence of frames. A picture can also be represented as a frame. Any frame or part of a frame from a video or a picture is often called an image.
Digital devices such as mobile phones and hand-held digital cameras can capture pictures, video, or both. The pictures and video sequences may be stored and transmitted to another device either wirelessly or through a cable. Prior to transmission, a frame may be sampled and digitized. Once digitized, the frame may be parsed into smaller blocks and encoded. Encoding is sometimes used synonymously with compression. Compression can reduce the overall amount of data (i.e., bits), much of which is usually redundant, needed to represent a frame. By compressing video and image data, many image and video encoding standards allow for improved transmission rates of video sequences and images. Typically, compressed video sequences and compressed images are referred to as an encoded bitstream, encoded packets, or a bitstream. Most image and video encoding standards utilize image/video compression techniques designed to facilitate video and image transmission with fewer transmitted bits than would be used without compression techniques.
In order to support compression, a digital video and/or picture device typically includes an encoder for compressing digital video sequences or compressing a picture, and a decoder for decompressing the digital video sequences. In many cases, the encoder and decoder form an integrated encoder/decoder (CODEC) that operates on blocks of pixels within frames that define the video sequence. In standards such as International Telecommunication Union (ITU) H.264, Moving Picture Experts Group (MPEG)-4, and Joint Photographic Experts Group (JPEG), for example, the encoder typically divides a video frame or image to be transmitted into video blocks referred to as “macroblocks.” A macroblock is typically 16 pixels high by 16 pixels wide, although various sizes of video blocks may be used. Those ordinarily skilled in the art of image and video processing recognize that the terms video block and image block may be used interchangeably. Sometimes, to make this interchangeability explicit, the term image/video block is used. The ITU H.264 standard supports processing 16 by 16 video blocks, 16 by 8 video blocks, 8 by 16 image blocks, 8 by 8 image blocks, 8 by 4 image blocks, 4 by 8 image blocks and 4 by 4 image blocks. Other standards may support differently sized image blocks. Those ordinarily skilled in the art sometimes use the terms video block and frame interchangeably when describing an encoding process, and may sometimes refer to a video block or frame as video matter. In general, video encoding standards support encoding and decoding a video unit, wherein a video unit may be a video block or a video frame.
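By way of illustration only, the division of a frame into fixed-size blocks described above may be sketched as follows. The function name `split_into_blocks`, the representation of a frame as a list of pixel rows, and the requirement that the frame dimensions be exact multiples of the block size are assumptions made for this sketch and are not taken from any particular standard.

```python
def split_into_blocks(frame, block_h=16, block_w=16):
    """Split a frame (list of pixel rows) into non-overlapping blocks.

    Returns a list of ((top, left), block) pairs in raster order.
    Assumption for this sketch: frame dimensions are exact multiples
    of the block size, so no padding is required.
    """
    height = len(frame)
    width = len(frame[0])
    if height % block_h or width % block_w:
        raise ValueError("frame dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, height, block_h):
        for left in range(0, width, block_w):
            block = [row[left:left + block_w] for row in frame[top:top + block_h]]
            blocks.append(((top, left), block))
    return blocks


# A synthetic 32-row by 48-column frame of 8-bit sample values.
frame = [[(y * 48 + x) % 256 for x in range(48)] for y in range(32)]

macroblocks = split_into_blocks(frame)        # 16 by 16 blocks
sub_blocks = split_into_blocks(frame, 8, 8)   # 8 by 8 blocks
print(len(macroblocks))  # prints 6
print(len(sub_blocks))   # prints 24
```

The same routine may be called with other block dimensions, such as 16 by 8 or 4 by 4, to obtain the differently sized blocks mentioned above.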
For each video block in a video frame, an encoder operates in a number of “prediction” modes. In one mode, the encoder searches similarly sized video blocks of one or more immediately preceding video frames (or subsequent frames) to identify the most similar video block, referred to as the “best prediction block.” The process of comparing a current video block to video blocks of other frames is generally referred to as block-level motion estimation (BME). BME produces a motion vector for the respective block. Once a “best prediction block” is identified for a current video block, the encoder can encode the differences between the current video block and the best prediction block. This process of using the differences between the current video block and the best prediction block includes a process referred to as motion compensation. In particular, motion compensation usually refers to the act of fetching the best prediction block using a motion vector, and then subtracting the best prediction block from an input video block to generate a difference block. After motion compensation, a series of additional encoding steps is typically performed to finish encoding the difference block. These additional encoding steps may depend on the encoding standard being used. In another mode, the encoder examines one or more neighboring video blocks within the same frame and uses information from those blocks to aid in the encoding process.
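By way of illustration only, block-level motion estimation followed by motion compensation may be sketched as follows. The exhaustive (full) search, the use of the sum of absolute differences (SAD) as the similarity measure, the (dy, dx) motion vector convention, and all function names are assumptions made for this sketch; practical encoders may use different search strategies and cost measures.

```python
def get_block(frame, top, left, size):
    """Return the size-by-size block whose top-left pixel is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_estimate(current, reference, top, left, size, search_range):
    """Full search for the best prediction block in the reference frame.

    Returns (motion_vector, cost), where motion_vector = (dy, dx) is the
    displacement from the current block position to the best match.
    """
    cur_block = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            r_top, r_left = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if r_top < 0 or r_left < 0 or r_top + size > height or r_left + size > width:
                continue
            cost = sad(cur_block, get_block(reference, r_top, r_left, size))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


def motion_compensate(current, reference, top, left, size, mv):
    """Fetch the prediction block via mv and subtract it from the current block."""
    dy, dx = mv
    prediction = get_block(reference, top + dy, left + dx, size)
    cur_block = get_block(current, top, left, size)
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(cur_block, prediction)]


# Synthetic reference frame, and a current frame whose content is the
# reference shifted down by 1 row and right by 2 columns.
reference = [[(31 * y + 17 * x) % 256 for x in range(16)] for y in range(16)]
current = [[reference[max(y - 1, 0)][max(x - 2, 0)] for x in range(16)]
           for y in range(16)]

mv, cost = motion_estimate(current, reference, top=4, left=4, size=4, search_range=3)
difference_block = motion_compensate(current, reference, 4, 4, 4, mv)
print(mv, cost)  # prints (-1, -2) 0
```

In this synthetic case the best prediction block matches exactly, so every entry of the difference block is zero. For real video content the difference block is generally nonzero and is passed on to the additional encoding steps.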
In general, as part of the encoding process, a transform of the video block (or difference video block) is taken. The transform converts the video block (or difference video block) from being represented by pixels to being represented by transform coefficients. A typical transform used in video encoding is the Discrete Cosine Transform (DCT). The DCT transforms the video block data from the pixel domain to a spatial frequency domain. In the spatial frequency domain, data is represented by DCT block coefficients. The DCT block coefficients represent the number and degree of the spatial frequencies detected in the video block. After a DCT is computed, the DCT block coefficients may be quantized, in a process known as “block quantization.” Quantization of the DCT block coefficients (coming from either the video block or the difference video block) removes part of the spatial redundancy from the block. During this “block quantization” process, further spatial redundancy may sometimes be removed by comparing the quantized DCT block coefficients to a threshold. If the magnitude of a quantized DCT block coefficient is less than the threshold, the coefficient is discarded or set to a zero value.
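By way of illustration only, the transform and block quantization steps described above may be sketched as follows. The orthonormal two-dimensional DCT-II computed directly from its definition, the single uniform quantization step size, and the names `dct_2d` and `quantize` are assumptions made for this sketch; encoding standards typically specify their own transforms, scaling, and quantization tables.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block, computed from the definition."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical spatial frequency index
        for v in range(n):      # horizontal spatial frequency index
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


def quantize(coeffs, step, threshold=0):
    """Uniformly quantize coefficients, then zero any quantized value
    whose magnitude is less than the threshold."""
    quantized = []
    for row in coeffs:
        q_row = []
        for c in row:
            q = int(round(c / step))
            if abs(q) < threshold:
                q = 0
            q_row.append(q)
        quantized.append(q_row)
    return quantized


# A flat 4 by 4 block has energy only in the lowest-frequency (DC) coefficient.
flat_block = [[100] * 4 for _ in range(4)]
flat_quantized = quantize(dct_2d(flat_block), step=10)
print(flat_quantized[0][0])  # prints 40

# Thresholding applied to hand-picked coefficients.
sample_coeffs = [[100.0, 14.0], [-14.0, 3.0]]
print(quantize(sample_coeffs, step=10))               # prints [[10, 1], [-1, 0]]
print(quantize(sample_coeffs, step=10, threshold=2))  # prints [[10, 0], [0, 0]]
```

A larger step size or threshold sets more coefficients to zero, which reduces the number of bits needed but discards more of the original block detail.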
However, block quantization at the encoder may often cause various artifacts to appear at the decoder when reconstructing the video frames or images that have been compressed at the encoder. One example of an artifact is the appearance of visible blocks in the reconstructed video image, which is known as “blockiness.” Some standards have tried to address this problem by including a de-blocking filter as part of the encoding process. In some cases, the de-blocking filter removes the blockiness but also has the effect of smearing or blurring the video frame or image, which is known as a blurriness artifact. Hence, image/video quality suffers either from “blockiness” or from blurriness introduced by de-blocking filters. A method and apparatus that could reduce the effect of coding artifacts on the perceived visual quality would be a significant benefit.