This invention relates, generally, to the processing of video data and, more specifically, to the elimination of picture processing artifacts visible on images displayed from data with vertically sub-sampled chroma.
A thorough background article on chroma upsampling error entitled “DVD Benchmark—A Special Report—The Chroma Upsampling Error on DVD Players-April 2001 by Don Munsil and Stacey Spears”, was published by “hometheaterhifi.com” in April 2001, since updated, and is hereby incorporated by reference.
The method was invented to address a problem with the quality of displayed MPEG compressed images, a problem which can be found in nearly every DVD player currently being offered for sale. Many MPEG compression based satellite or cable television receivers also exhibit this difficulty. The effect is seen as jagged edges on brightly colored objects that appear in the display of the decoded image. The problem stems from the improper conversion of the “4:2:0” chroma format (where each of the two chroma components has half the number of samples per line and half the number of lines as the luma component), used for example by the more popular subsets of the MPEG video standards, to the “4:2:2” chroma format (where each of the two chroma components has half the number of samples per line and the same number of lines as the luma component) or “4:4:4” chroma format (where each of the two chroma components has the same number of samples per line and the same number of lines as the luma component) required in order to create an image suitable for viewing on standard interlaced or progressive scan television displays. Although methods are known in the prior art that, to some degree, alleviate the artifacts caused by this problem, a complete solution has not been available in the past.
In the following, “luma” or “luminance,” is called “Y”. Depending on the color system, the two “chroma” or “chrominance” signals may be labeled “U” and “V,” “I” and “Q,” “Pb” and “Pr,” or “Cb” and “Cr” (which will be used in the following). These are slightly different image signal encoding formats, but they are all essentially the same in concept. In the MPEG video compression process used in the DVD system, the Y component is compressed at full resolution. The Cb and Cr color signal components are compressed at a lower resolution, both horizontally and vertically. When a video image is first captured for eventual distribution on DVD, it is usually captured using the same luma and chroma resolution format, called “4:4:4”, which means that for every luma sample there is one sample of Cb and one sample of Cr. In other words, the color signal contains chroma information relating to each of the luma samples. The 4:4:4 full resolution chroma format is generally only used internally within a device to avoid degradation during processing.
When video program content is recorded to a master tape to be used as a DVD video source, it is usually reduced to the 4:2:2 format, a lower resolution chroma format. In the “4:2:2” format, every two horizontally adjacent luma samples are associated with one sample of Cb and one sample of Cr. A yet lower resolution chroma format, and the one standard for DVD, is “4:2:0”. For the “4:2:0” format, there are half as many samples of Cb and Cr on each scan line, and half as many scan lines of Cb and Cr, as compared to Y. In other words, the resolution for chroma is half that of luma in both the horizontal and vertical directions. For example, if a color image of 720×480 image samples (“pixels”) is encoded in the “4:2:0” format, the chroma information for this image would be represented by 360×240 Cb samples and 360×240 Cr samples. In order to display an image originating in the “4:2:0” format, the missing chroma samples need to be interpolated on each scan line from the chroma samples on either side of each missing chroma sample, and entire scan lines of chroma information need to be interpolated from the chroma scan lines above and below each missing chroma scan line. This process is called chroma upsampling. Care must be taken to properly upsample the “4:2:0” format in order to avoid chroma errors.
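By way of illustration only, the relationship between the luma resolution and the size of each chroma plane can be sketched as follows. The table name `CHROMA_DIVISORS` and the function `chroma_plane_size` are hypothetical names introduced for this sketch, and even luma dimensions are assumed.

```python
# Subsampling divisors (horizontal, vertical) for the chroma formats
# named in the text. The names used here are illustrative assumptions.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # one Cb and one Cr sample per luma sample
    "4:2:2": (2, 1),  # half the samples per line, same number of lines
    "4:2:0": (2, 2),  # half the samples per line, half the lines
}


def chroma_plane_size(luma_width, luma_height, chroma_format):
    """Return (width, height) of each chroma plane (Cb or Cr).

    Assumes the luma dimensions are even.
    """
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    return luma_width // h_div, luma_height // v_div


print(chroma_plane_size(720, 480, "4:2:0"))  # (360, 240)
```

For the 720×480 example given above, the sketch yields 360×240 samples for each of Cb and Cr.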
Movie images, such as those stored on DVD, are, for the most part, displayed in one of two ways, either in a progressive manner or an interlaced manner. If the images are displayed progressively, each line of image data drawn on the display screen is preceded and followed by an adjacent line of image data. If the image is displayed in an interlaced manner, half of the lines that comprise a full set of image data are displayed first, followed by the other half of the image data lines. This second half of the image data lines is drawn between the lines of the first half. If all the lines of an image are drawn during one pass from the top of the image to the bottom of the image, as is the case for a progressively scanned image, this single scanning pass is called a frame. If only half of the lines are displayed first, as is the case for an interlaced image, this single scanning pass is called a field. In this latter case, two fields would equal a frame. These two fields are known as the odd field and the even field, the top field and the bottom field, or field 1 and field 2.
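As a minimal sketch of the frame and field relationship described above, a frame can be modeled as a list of scan lines, with the top field holding lines 0, 2, 4 and so on, and the bottom field holding lines 1, 3, 5 and so on. The function names `split_fields` and `weave_fields` are assumptions made for illustration, and an even number of lines is assumed.

```python
def split_fields(frame_lines):
    """Split a frame (list of scan lines) into (top_field, bottom_field)."""
    return frame_lines[0::2], frame_lines[1::2]


def weave_fields(top_field, bottom_field):
    """Interleave two fields back into a single frame."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


frame = ["line0", "line1", "line2", "line3", "line4", "line5"]
top, bottom = split_fields(frame)
print(top)     # ['line0', 'line2', 'line4']
print(bottom)  # ['line1', 'line3', 'line5']
```

Weaving the two fields back together reproduces the original frame, which is the sense in which two fields equal a frame.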
The movie images, such as those stored on DVD, can be obtained from an interlaced video source, such as a television camera, or from motion picture film or a computer generated image, which serve as a progressive video source. In either case, DVD images may be stored in full frames, or in separate fields. If the original source was interlaced, the MPEG encoder that writes the DVD may take pairs of fields, weave them together into full frames, compress the frames and store those compressed frames on the DVD disc. More detail on the interlaced to progressive conversion is given in copending U.S. patent applications Ser. No. 10/033,219, filed Dec. 27, 2001, entitled “Techniques For Determining the Slope of a Field Pixel”, and Ser. No. 10/119,999, filed Apr. 9, 2002, entitled “2:2 and 3:2 Pull-Down Detection Techniques”, both of which are commonly assigned with the present application and are hereby incorporated by reference. If the original source was progressive, the original frames may be compressed and written to disc, or may be separated into two fields, to be compressed separately and stored. For compressed progressive frames, the MPEG encoder sets the “progressive_frame” flag to “True”. During DVD playback, if the original video was interlaced, then it is important that the chroma information for one field not leak into the other field. To do this, the MPEG decoder needs to split compressed frames into two fields, and then upsample the “4:2:0” format chroma data to the “4:2:2” format separately for each field. If, however, the frame was originally progressive, then the chroma information needs to be upsampled to “4:2:2” across the whole frame, then split again into fields, if it is to be displayed on an interlaced scanned television receiver.
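The two decoding paths described above can be sketched as follows. This is an illustrative simplification and not a decoder implementation: it uses plain line replication as the vertical filter, assumes an even number of chroma lines, and assumes that in a woven frame the even-numbered chroma lines belong to the top field and the odd-numbered chroma lines to the bottom field. The function names are hypothetical.

```python
def replicate_lines(chroma_lines):
    """Vertical 2x upsampling by line replication (simplest filter)."""
    out = []
    for line in chroma_lines:
        out.append(line)
        out.append(line)
    return out


def upsample_chroma_vertical(chroma_lines, progressive_frame):
    """Return one chroma line per luma line (4:2:0 to 4:2:2, vertically)."""
    if progressive_frame:
        # Progressive: interpolate across the whole frame.
        return replicate_lines(chroma_lines)
    # Interlaced: upsample each field separately so that chroma from
    # one field does not leak into the other, then weave the results.
    top = replicate_lines(chroma_lines[0::2])
    bottom = replicate_lines(chroma_lines[1::2])
    out = []
    for top_line, bottom_line in zip(top, bottom):
        out.append(top_line)
        out.append(bottom_line)
    return out


chroma = ["c0", "c1", "c2", "c3"]
print(upsample_chroma_vertical(chroma, True))
# ['c0', 'c0', 'c1', 'c1', 'c2', 'c2', 'c3', 'c3']
print(upsample_chroma_vertical(chroma, False))
# ['c0', 'c1', 'c0', 'c1', 'c2', 'c3', 'c2', 'c3']
```

The two outputs differ in the placement of the chroma lines, which is why the choice of path must match the way the picture was originally encoded.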
Note that (e.g. for MPEG) the chroma lines are visualized as being located between the luma lines. This is because the missing lines of chroma samples in the “4:2:0” encoding format are averaged from the chroma information carried on the original scan lines which are above and below the missing scan lines. For a progressively scanned 4:2:0 MPEG encoded image, the first line of chroma samples is created by averaging lines 0 and 1 from the original image, and the second line is created from image lines 2 and 3, and so forth.
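The progressive case described above, in which each 4:2:0 chroma line is the average of a pair of adjacent original lines, can be sketched as follows. Only the vertical direction is shown, each line is modeled as a list of sample values, and the function name is an assumption made for illustration.

```python
def downsample_chroma_progressive(chroma_lines):
    """Average original lines (0, 1), (2, 3), ... into one line each."""
    out = []
    for k in range(0, len(chroma_lines) - 1, 2):
        upper = chroma_lines[k]
        lower = chroma_lines[k + 1]
        out.append([(a + b) / 2 for a, b in zip(upper, lower)])
    return out


full_resolution = [[10, 10], [20, 20], [30, 30], [50, 50]]
print(downsample_chroma_progressive(full_resolution))
# [[15.0, 15.0], [40.0, 40.0]]
```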
When two consecutive interlaced 4:2:0 fields are compressed and stored together as a single MPEG encoded frame, the chroma values are still located between the scan lines, but there's an important difference. The first line of chroma values is averaged not from lines 0 and 1, but from lines 0 and 2, and assuming that the MPEG encoder is properly designed, these values should be averaged using 75% line 0 and 25% line 2. This is because the derived line is defined to be physically closer to line 0 than to line 2. Thus, the first “virtual line” of chroma values is still between image lines 0 and 1. But line 1 is not used for field 0; it is part of field 1. So the chroma information is averaged from lines 0 and 2. The “virtual line” of chroma is 0.5 lines away from line 0, and 1.5 lines away from line 2. So line 0, being 3 times closer, should affect the derived chroma sample three times as much as line 2. However, some MPEG encoders, in order to reduce complexity and cost, perform a simple 50/50 average of lines 0 and 2, or use line 0, and throw away line 2. Note, though, that picture line 1 is not involved in encoding the first chroma line at all. Line 1 is the first line of the next field, and color from field 1 must be prevented from bleeding into field 0. This is because there may be movement between the two fields, resulting in the display of very noticeable color artifacts. In the next field, field 1, the second chroma line affects picture lines 1 and 3, but again, the chroma sample is closer to line 3 than to line 1, so the sample should be calculated as 75% line 3's chroma and 25% line 1's chroma. More sophisticated algorithms than a simple 75%/25% average, using more than two data points, are sometimes employed by MPEG encoders. However, it needs to be pointed out that although chroma information should not be calculated from a simple 50%/50% average, many of the MPEG encoders used today follow this approach.
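The 75%/25% weighting described above can be sketched as follows for the vertical direction only. The sketch assumes a woven frame whose number of lines is a multiple of four, models each line as a list of sample values, and uses hypothetical function names; it shows only the simple two-line average and not the more sophisticated filters mentioned.

```python
def blend(line_a, weight_a, line_b, weight_b):
    """Weighted sum of two lines, sample by sample."""
    return [weight_a * a + weight_b * b for a, b in zip(line_a, line_b)]


def downsample_chroma_interlaced(chroma_lines):
    """Field-aware 4:2:0 chroma downsampling of a woven frame."""
    out = []
    for base in range(0, len(chroma_lines) - 3, 4):
        # Top field: the chroma line sits 0.5 lines below line `base`
        # and 1.5 lines above line `base + 2`.
        out.append(blend(chroma_lines[base], 0.75,
                         chroma_lines[base + 2], 0.25))
        # Bottom field: the chroma line sits closer to line `base + 3`
        # than to line `base + 1`.
        out.append(blend(chroma_lines[base + 1], 0.25,
                         chroma_lines[base + 3], 0.75))
    return out


frame_chroma = [[0.0], [100.0], [40.0], [200.0]]
print(downsample_chroma_interlaced(frame_chroma))
# [[10.0], [175.0]]
```

In this example the first chroma line depends only on lines 0 and 2, and the second only on lines 1 and 3, so no chroma from one field reaches the other.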
To view an image in 4:2:0 format on an output display device, it is necessary to convert it back to 4:2:2 or 4:4:4 format so that it is compatible with the video encoder which converts the digital signals to an analog format accepted by such a display device. It is not very important whether the output device converts the 4:2:0 information to 4:2:2 or 4:4:4; many video encoders can accept either format, but it must be one of these formats. In order for the video encoder to perform this conversion, it needs to simultaneously be provided with a line of chroma for every line of luma, because it is converting the digital video to analog video in real time. The display cannot “remember” what the chroma information was for the previous scan line, so it cannot interpolate missing lines of chroma information. Only 4:2:2 and 4:4:4 have a 1:1 ratio of chroma and luma scan lines, so the MPEG decoder needs to output one or the other.
The MPEG pictures are stored in the same format whether they represent a progressive frame or two interlaced fields woven together. A “progressive_frame” flag is set to tell the decoder whether the chroma information should be interpolated across the whole picture, or separated into fields and interpolated separately for each one. Unfortunately, most prior art MPEG decoders ignore the “progressive_frame” flag, and only perform one kind of interpolation in order to make an MPEG integrated circuit chip simpler and thereby reduce cost. Also, there are often errors with the MPEG flags. It is not uncommon for the “progressive_frame” flag to not be set when it should be set; and it is not uncommon for the “progressive_frame” flag to be set when it should not be set. MPEG decoders that use only one algorithm tend to use the interlaced algorithm, most likely because it's possible to implement it with less buffer memory, as only one field at a time must be upsampled. When the interlaced algorithm is applied to progressive images, chroma samples that were supposed to be used for scan lines 1 and 2 are instead interpolated to scan lines 1 and 3, and the chroma samples for scan lines 3 and 4 instead are affecting 2 and 4. In the simplest case, where the chroma lines are just copied to the adjacent image lines, the end result is that scan line 2 gets the chroma information for scan line 3, and vice versa, all the way down the screen. Effectively, adjacent pairs of chroma scan lines are switched, which produces the characteristic jagged/streaky visual effect known as the chroma upsampling error.
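The line swapping described above can be made concrete with a small sketch that records, for each output scan line, which stored chroma line it receives under simple line copying. Lines are numbered from 0 in the sketch, so the swapped pair referred to above as scan lines 2 and 3 appears at indices 1 and 2. The function names are assumptions, and the sketch assumes that even-numbered chroma lines of a woven frame belong to the top field.

```python
def chroma_source_progressive(num_luma_lines):
    """Chroma line each luma line should receive in a progressive frame."""
    return [line // 2 for line in range(num_luma_lines)]


def chroma_source_interlaced(num_luma_lines):
    """Chroma line each luma line receives when decoded as two fields."""
    sources = []
    for line in range(num_luma_lines):
        field = line % 2                 # 0 = top field, 1 = bottom field
        field_line = line // 2           # line index within its field
        field_chroma = field_line // 2   # chroma line index within field
        sources.append(2 * field_chroma + field)
    return sources


correct = chroma_source_progressive(8)
actual = chroma_source_interlaced(8)
print(correct)  # [0, 0, 1, 1, 2, 2, 3, 3]
print(actual)   # [0, 1, 0, 1, 2, 3, 2, 3]
print([i for i in range(8) if correct[i] != actual[i]])  # [1, 2, 5, 6]
```

The mismatched indices come in adjacent pairs whose chroma lines have been exchanged, which corresponds to the switched pairs of chroma scan lines described above.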
The chroma upsampling error does not look the same on all DVD players because the chroma sample interpolation scheme varies from player to player. As previously stated, in the simplest case, data from the 4:2:0 samples are copied twice to create the 4:2:2 data. A player using this approach displays chroma data that tends to look blocky with accentuated jaggedness. Other players employing more sophisticated interpolation schemes, such as bilinear filtering, or sin(x)/x filtering which smoothes the chroma channel to some degree, can make the upsampling error less visible, but not completely hidden, especially when there is a sharp edge in the image. Such a system is described in U.S. Pat. No. 5,650,824, “Method For MPEG-2 4:2:2 and 4:2:0 Chroma Format Conversion”, Si Jun Huang inventor, which is hereby incorporated by reference. The method described in this patent receives incoming video signals and separates them according to whether they are in progressive or interlaced format and then, for interlaced format, into even and odd fields. For each of these cases, a fixed but different filter is used based on these constructs, independent of both the characteristics of the particular frame or field or the frame to frame or field to field variations. In this case, the jaggedness and streaks will have a smoother appearance, but the interpolation may create additional “color bleeding” artifacts and be very visible to an experienced viewer.
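The difference between sample copying and a smoothing filter can be sketched in one dimension as follows. The linear interpolator shown is a deliberately simple stand-in for the bilinear or sin(x)/x filters mentioned above and is not taken from any particular player or from the cited patent; it ignores exact chroma sample siting, and the function names are hypothetical.

```python
def upsample_replicate(samples):
    """2x upsampling by copying every sample twice."""
    out = []
    for s in samples:
        out.extend([float(s), float(s)])
    return out


def upsample_linear(samples):
    """2x upsampling with each new sample midway between neighbors."""
    out = []
    for k, s in enumerate(samples):
        # At the last sample there is no right neighbor, so reuse it.
        nxt = samples[k + 1] if k + 1 < len(samples) else s
        out.append(float(s))
        out.append((s + nxt) / 2.0)
    return out


edge = [0, 0, 100, 100]  # a sharp chroma edge
print(upsample_replicate(edge))
# [0.0, 0.0, 0.0, 0.0, 100.0, 100.0, 100.0, 100.0]
print(upsample_linear(edge))
# [0.0, 0.0, 0.0, 50.0, 100.0, 100.0, 100.0, 100.0]
```

Replication keeps the edge abrupt, which corresponds to the blocky appearance noted above, while interpolation introduces an intermediate value at the edge, which softens it and is one way in which color can spread across a boundary.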
To help solve the color-bleeding problem, another approach is offered by U.S. Pat. No. 6,297,801, “Edge Adaptive Chroma Up-Conversion”, Hong Jiang inventor, which is hereby incorporated by reference. Unlike the other prior art methods of chroma upsampling previously discussed, which use replication or linear interpolation of chroma values to effect chroma up-sampling, this technique relies on intra-field luma information in the locations surrounding a chroma sample location to effect chroma up-sampling. This is essentially a horizontal upsampling technique that assumes a correlation between luma and chroma components and uses neighboring luma components to calculate missing chroma values. When a strong edge exists in the luma image at a particular location, the chroma components at this location are assumed to also have edges. Thus, instead of using linear interpolation with a weighting factor as a function of spatial distance, the weighting factor is adjusted according to the variation in the luma values. Although this approach may help prevent color bleeding at sharp image transitions when progressively scanned images using lower resolution chroma formats are upsampled to higher resolution chroma formats, it adds significant complexity when interlaced images are being processed. For interlaced video formats, the separation of adjacent rows in the video data increases, requiring much larger data storage if this technique is to be used. In addition, the methods of U.S. Pat. No. 6,297,801 require the use of a frame buffering format which stores vertically adjacent rows near each other, or a technique that allows rapid access of non-contiguous data, in order to be adapted to interlaced images. Therefore, from a practical standpoint, it does not help reduce the chroma upsampling error.
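The general idea of adjusting interpolation weights according to luma variation can be illustrated with the following sketch. It is not the method of U.S. Pat. No. 6,297,801 and is not derived from its formulas; the inverse-difference weighting, the `eps` parameter, the assumption that each chroma sample is co-sited with an even-numbered luma sample, and the function name are all assumptions made solely for illustration.

```python
def upsample_chroma_luma_guided(luma, chroma, eps=1.0):
    """Horizontal 2x chroma upsampling with luma-dependent weights.

    Assumes len(luma) == 2 * len(chroma) and that chroma[k] is
    co-sited with luma[2 * k]. Hypothetical weighting, for
    illustration only.
    """
    out = []
    for k, c_left in enumerate(chroma):
        out.append(float(c_left))
        x = 2 * k + 1  # luma position of the missing chroma sample
        if k + 1 == len(chroma):
            out.append(float(c_left))  # no right neighbor: replicate
            continue
        c_right = chroma[k + 1]
        # A neighbor whose luma resembles the luma at x gets more weight.
        w_left = 1.0 / (eps + abs(luma[x] - luma[x - 1]))
        w_right = 1.0 / (eps + abs(luma[x] - luma[x + 1]))
        out.append((w_left * c_left + w_right * c_right)
                   / (w_left + w_right))
    return out


# Flat luma: behaves like ordinary linear interpolation.
print(upsample_chroma_luma_guided([10, 10, 10, 10], [0, 100]))
# [0.0, 50.0, 100.0, 100.0]

# Luma edge between positions 1 and 2: the missing sample at
# position 1 stays close to its left neighbor instead of becoming 50.
print(upsample_chroma_luma_guided([10, 10, 200, 200], [0, 100]))
```

With flat luma the sketch reduces to a midpoint average, while at a strong luma edge the interpolated chroma follows the side of the edge that the luma sample belongs to, which is the behavior that limits color bleeding.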