The present invention relates to processing of video data, and more specifically to detection and reduction of color-motion artifacts in processing of video data.
Successive frames in a typical video sequence are often very similar to each other. For example, a sequence of frames may have scenes in which an object moves across a stationary background, or a background moves behind a stationary object. Consequently, many scenes in one frame may also appear in a different position of a subsequent frame. Video systems take advantage of such redundancy within the frames by using predictive coding techniques, such as motion estimation and motion compensation, to reduce the volume of data required in compressing the frames.
In accordance with the well-known motion estimation technique, to conserve bit rate, data related to the differences between positions of similar objects in successive frames are captured by one or more motion vectors. The motion vectors are then used to identify the spatial coordinates of the shifted objects in a subsequent frame. The motion vectors therefore limit the bit rate that would otherwise be required to encode the data associated with the shifted objects.
Motion estimation and compensation are used in several international standards such as H.261, H.263, MPEG-1, MPEG-2, and MPEG-4. Partly because motion estimation is computationally intensive, a single motion vector is typically shared by all color components in the (Y,U,V) or (Y,Cr,Cb) coordinate systems. In the (Y,U,V) color coordinate system, Y is the luma component (also referred to below as the luminance, and related to the intensity), and U and V are the chroma components (also referred to below as the chrominance components, and related to hue and saturation) of a color. Similarly, in the (Y,Cr,Cb) color coordinate system, Y is the luma component, and Cb and Cr are the chroma components. Each motion vector is typically generated for a macroblock. Each macroblock typically includes a block of, e.g., 16×16 or 8×8 pixels. The MPEG-2 standard provides an interlaced mode that separates each 16×16 macroblock into two 16×8 sub-macroblocks each having an associated motion vector. In the following, the terms block, macroblock, and sub-macroblock may be used interchangeably.
To simplify computation, most commonly known video standards use only the luminance component to generate a motion vector for each macroblock. This motion vector is subsequently applied to both chroma components associated with that macroblock. The generation of a motion vector using only the luminance component may cause undesirable color-motion artifacts (alternatively referred to hereinbelow as color artifacts) such as color patches.
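The luminance-only search described above can be sketched as a full-search block match that minimizes the sum of absolute differences (SAD). This is a minimal illustration, not the procedure of any particular standard: the SAD cost, the 4×4 block size, the ±2 search range, and the helper names `sad`, `estimate_motion`, and `make_frame` are all assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced from it by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_motion(cur_luma, ref_luma, bx, by, n=4, search=2):
    """Full search over displacements in [-search, search].
    Returns (dx, dy, cost) for the lowest-cost candidate; ties keep the
    first candidate visited."""
    height, width = len(ref_luma), len(ref_luma[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates whose reference block falls outside the frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur_luma, ref_luma, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


def make_frame(spot_x, spot_y, size=8, background=10, spot=200):
    """Hypothetical test frame: flat background with a 2 x 2 bright spot."""
    frame = [[background] * size for _ in range(size)]
    for y in range(spot_y, spot_y + 2):
        for x in range(spot_x, spot_x + 2):
            frame[y][x] = spot
    return frame


ref_luma = make_frame(spot_x=2, spot_y=3)   # reference (previous) frame
cur_luma = make_frame(spot_x=3, spot_y=3)   # spot has moved one pixel right
mv = estimate_motion(cur_luma, ref_luma, bx=2, by=2)
print(mv)  # (-1, 0, 0): the matching block lies one pixel to the left in ref
```

In a luminance-only system, the `(dx, dy)` found this way would then be reused for the chroma blocks of the same macroblock without a separate search. When the luma plane is flat, every candidate has zero cost and the search simply returns the first one it visits, whatever the chroma planes contain.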
To further reduce the bit rate for encoding of video data, inter-frame and intra-frame encoding have been developed. In accordance with inter-frame encoding, the difference between the data contained in a previous frame and a current frame is used to encode the current frame. Inter-frame encoding may not improve coding efficiency, for example, if the current frame is the first frame in a new scene (i.e., when there is a scene change), in which case intra-frame encoding is used. In accordance with intra-frame encoding, only the information contained within the frame itself is used to encode the frame.
FIG. 1 is a simplified high-level block diagram of a conventional system 100 adapted to detect color-motion artifacts. System 100 receives a sequence of incoming video frames via frame reorder block 102. In response, frame reorder block 102 serially supplies the (Y,U,V) components of a current frame of the frame sequence to an adder/subtractor 104. Adder/subtractor 104 is also adapted to receive a signal from motion compensation block 106 via selector 108. If selector 108 is in the upper position, then intra-frame coding is used in which case a null signal (i.e., 0) is supplied to adder/subtractor 104. On the other hand, if selector 108 is in the lower position, then inter-frame coding is used. Adder/subtractor 104 generates a signal that corresponds to the difference between the video data supplied by frame reorder block 102 and that supplied by selector 108.
The signal generated by adder/subtractor 104 is supplied to a discrete cosine transform (DCT) block 110 whose output signal is quantized by a quantizer 112. The quantized signal generated by quantizer 112 is then encoded by variable-length coder (VLC) 114. The signal encoded by VLC 114 is then stored in buffer 116, which in turn, supplies the encoded video bit stream to a video decoder (not shown).
The signal generated by quantizer 112 is inversely quantized by an inverse quantizer 118 and is subsequently delivered to an inverse DCT (IDCT) 120. IDCT 120 performs an inverse DCT function on the signal it receives and supplies the resulting signal to adder 122. Adder 122 adds the signal it receives from selector 108 to the signal it receives from IDCT 120 and stores the sum in frame memory 124 for future retrieval. The signal stored in frame memory 124 only includes the luma component of a current frame and is adapted to serve as a reference frame for motion estimation and compensation of future frames.
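The forward path (adder/subtractor 104, DCT 110, quantizer 112) and the reconstruction path (inverse quantizer 118, IDCT 120, adder 122) can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a one-dimensional orthonormal DCT on an 8-sample block rather than the two-dimensional block transform of a real codec, a uniform quantizer with a single step size, and hypothetical names (`dct`, `idct`, `encode_block`) and sample values chosen only for the example.

```python
import math


def dct(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(
            math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n)))
    return out


def encode_block(current, prediction, step):
    """Returns (quantized levels, reconstructed block).
    `prediction` is all zeros for intra coding (selector in the upper
    position) or the motion-compensated block for inter coding."""
    residual = [c - p for c, p in zip(current, prediction)]      # subtractor
    levels = [round(c / step) for c in dct(residual)]            # DCT + quantizer
    dequantized = [lv * step for lv in levels]                   # inverse quantizer
    recon_residual = idct(dequantized)                           # IDCT
    reconstructed = [p + r for p, r in zip(prediction, recon_residual)]  # adder
    return levels, reconstructed


current = [52, 55, 61, 66, 70, 61, 64, 73]
step = 4

# Inter coding: a flat, hypothetical motion-compensated prediction.
levels_inter, recon_inter = encode_block(current, [60] * 8, step)
# Intra coding: null prediction.
levels_intra, recon_intra = encode_block(current, [0] * 8, step)
```

The reconstructed block, not the original, is what would be written to the frame memory, so that the encoder predicts from the same data a decoder will have. The difference between `current` and the reconstructed block is the quantization error.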
A motion estimator 128 receives the signal stored in frame memory 124 and the signal supplied by frame reorder block 102 to generate a motion vector signal that is supplied to motion compensation block 106. Only the luma components of the current frame—as supplied by frame reorder block 102—and the reference frame—as supplied by frame memory 124—are received and used by motion estimator 128 to generate a motion vector. The motion vector generated by motion estimator 128 is supplied to motion compensator 106. Motion compensator 106, in turn, compensates for the motion of the signal it receives from frame memory 124 using the motion vector signal that it receives from motion estimator 128. The output signal generated by motion compensator 106 is a motion-compensated signal of a current frame and serves as the reference frame for the next incoming frame when inter-frame encoding is used.
There may be occasions when a reference frame is not required. For example, no reference frame is required when a new video sequence is received by system 100. Similarly, there is no need for a reference frame when processing the first frame of a new scene. To accommodate situations where no reference frame is needed, selector 108 is provided with an upper position. When selector 108 is placed in the upper position, a null signal (i.e., 0) is transferred to adder/subtractor 104.
Conventional luminance-only based motion estimation and compensation systems, such as system 100, fail to reflect the true movement of an object in a video sequence. This failure results in noticeable color-motion artifacts. FIG. 2 is an exemplary diagram showing color-motion artifacts stemming from failure to detect the motion of a color object. In FIG. 2, a uniform gray area 200 provides a background to two synthetic color patches, namely a red color patch 210 and a green color patch 220. Both red and green color patches 210 and 220 produce the same luminance level as gray background 200. Both color patches 210 and 220 are also moving in front of gray background 200.
Assume that in FIG. 2, the color conversion recommended by the ITU-R standard BT.709 is used, as shown below:
Y = 0.715G + 0.072B + 0.213R    (1)
Assume further that gray background 200 has (R,G,B) color components of (40,40,40) resulting in a luma component (i.e., intensity level) of 40. Assume further that red color patch 210 and green color patch 220 have respective (R,G,B) color components of (188,0,0) and (0,56,0). Consequently, in accordance with equation (1) above, both red color patch 210 and green color patch 220 have the same luminance level (i.e., 40) as the gray background 200. Therefore, conventional luminance-only based motion estimation and motion compensation systems, such as system 100 of FIG. 1, fail to detect the movement of red and green color patches 210 and 220 relative to gray background 200. This failure results in noticeable differences in the quantization errors of the chroma components, particularly for the gray background, thereby creating color motion artifacts.
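The arithmetic behind equation (1) can be checked with a short sketch. The coefficients are the rounded values quoted in the text; the function name `luma_bt709` and the rounding of the result to an integer level are assumptions made for the illustration.

```python
def luma_bt709(r, g, b):
    """Luma per equation (1), with the rounded coefficients from the text."""
    return 0.715 * g + 0.072 * b + 0.213 * r


colors = {
    "gray background 200": (40, 40, 40),
    "red patch 210": (188, 0, 0),
    "green patch 220": (0, 56, 0),
}

# Round to the nearest integer level, as an 8-bit luma sample would be stored.
lumas = {name: round(luma_bt709(*rgb)) for name, rgb in colors.items()}

for name, y in lumas.items():
    print(f"{name}: Y = {y}")
```

All three colors land on the same integer luma level, so the luma plane of FIG. 2 is flat and a luma-only block match sees no difference between the patches and the background.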
One known method for overcoming the problems associated with luminance-only motion estimation and compensation systems is described in U.S. Pat. No. 5,544,263 and involves performing motion estimation for each of the color components (Y,U,V). This three-motion-vector method, while achieving a good match for each of the color components, substantially increases the computational intensity. The increase in computational intensity raises the cost of the system, and transmitting the three motion vectors raises the bandwidth requirement. This, in turn, limits the bit rate available to encode the motion-compensated inter-frame differences. Therefore, although the method described in U.S. Pat. No. 5,544,263 has improved color-motion artifact rejection, it is less immune to granular noise because it leaves less bandwidth to encode the inter-frame differences.
Another known method for overcoming the problems associated with luminance-only motion estimation and compensation systems is described in U.S. Pat. No. 5,544,263 and involves using all color components when matching blocks. In accordance with this method, block-matching for each macroblock is performed only once. However, this technique may require 50% more computation than luminance-only motion estimation and compensation systems do. This 50% increase in computation increases the system complexity and cost.
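The text does not state where the 50% figure comes from. One plausible accounting, offered here only as an assumption, is 4:2:0 chroma subsampling, in which each 16×16 luma macroblock is accompanied by two 8×8 chroma blocks. Counting the samples compared per block-match candidate under that assumption (the function name `samples_per_macroblock` is hypothetical):

```python
def samples_per_macroblock(luma_size=16, chroma_size=8):
    """Samples compared per candidate, assuming 4:2:0 subsampling:
    one luma block plus two chroma blocks at half resolution in each
    dimension."""
    luma = luma_size * luma_size
    chroma = 2 * chroma_size * chroma_size
    return luma, chroma


luma, chroma = samples_per_macroblock()
extra = chroma / luma  # additional work relative to luma-only matching
print(f"luma: {luma}, chroma: {chroma}, extra: {extra:.0%}")
```

Under this assumption the chroma blocks add 128 samples to the 256 luma samples per candidate, which matches the 50% figure; other subsampling formats would give a different ratio.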
A need continues to exist for improved color-motion artifact detection and reduction techniques.