This invention relates to a decoder which converts and formats an encoded high resolution video signal, e.g., an MPEG-2 encoded video signal, and more specifically to a method and apparatus for adaptively compensating for encoder/decoder mismatch.
In the United States a standard has been proposed for digitally encoded high definition television signals (HDTV). A portion of this standard is essentially the same as the MPEG-2 standard, proposed by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO). The standard is described in an International Standard (IS) publication entitled "Information Technology - Generic Coding of Moving Pictures and Associated Audio, Recommendation H.262", ISO/IEC 13818-2, IS, 11/94, which is available from the ISO and which is hereby incorporated by reference for its teaching on the MPEG-2 digital video coding standard.
The MPEG-2 standard is actually several different standards. In MPEG-2, several different profiles are defined, each corresponding to a different level of complexity of the encoded image. For each profile, different levels are defined, each level corresponding to a different image resolution. One of the MPEG-2 standards, known as Main Profile, Main Level is intended for coding video signals conforming to existing television standards (i.e., NTSC and PAL). Another standard, known as Main Profile, High Level, is intended for coding high-definition television images. Images encoded according to the Main Profile, High Level standard may have as many as 1,152 lines per image frame and 1,920 pixels per line.
The Main Profile, Main Level standard, on the other hand, defines a maximum picture size of 720 pixels per line and 576 lines per frame. At a frame rate of 30 frames per second, signals encoded according to this standard have a data rate of 720 × 576 × 30 or 12,441,600 pixels per second. By contrast, images encoded according to the Main Profile, High Level standard have a maximum data rate of 1,152 × 1,920 × 30 or 66,355,200 pixels per second. This data rate is more than five times the data rate of image data encoded according to the Main Profile, Main Level standard. The standard proposed for HDTV encoding in the United States is a subset of this standard, having as many as 1,080 lines per frame, 1,920 pixels per line and a maximum frame rate, for this frame size, of 30 frames per second. The maximum data rate for this proposed standard is still far greater than the maximum data rate for the Main Profile, Main Level standard.
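The pixel-rate figures above can be checked with a short calculation. The following Python sketch is illustrative only; the function and variable names are chosen for this example and do not come from the MPEG-2 standard.

```python
def pixel_rate(pixels_per_line, lines_per_frame, frames_per_second):
    """Return the raw pixel rate in pixels per second."""
    return pixels_per_line * lines_per_frame * frames_per_second

# Maximum picture sizes quoted in the text, all at 30 frames per second.
main_level = pixel_rate(720, 576, 30)        # Main Profile, Main Level
high_level = pixel_rate(1920, 1152, 30)      # Main Profile, High Level
proposed_hdtv = pixel_rate(1920, 1080, 30)   # proposed U.S. HDTV subset

print(main_level, high_level, proposed_hdtv)
print(high_level / main_level, proposed_hdtv / main_level)
```

Under these figures the proposed HDTV subset carries five times the pixel rate of Main Profile, Main Level, and the full High Level maximum carries slightly more than five times.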
The MPEG-2 standard defines a complex syntax which contains a mixture of data and control information. Some of this control information is used to enable signals having several different formats to be covered by the standard. These formats define images having differing numbers of picture elements (pixels) per line, differing numbers of lines per frame or field, and differing numbers of frames or fields per second. In addition, the basic syntax of the MPEG-2 Main Profile defines the compressed MPEG-2 bit stream representing a sequence of images in five layers: the sequence layer, the group of pictures layer, the picture layer, the slice layer and the macroblock layer. Each of these layers is introduced with control information. Finally, other control information, also known as side information (e.g., frame type, macroblock pattern, image motion vectors, coefficient zig-zag patterns and dequantization information), is interspersed throughout the encoded bit stream.
Implementation of this standard in television studios and in viewers' homes is expected to be incremental. At least until the television studios provide a large amount of programming in HDTV format, viewers are likely to retain their standard definition television (SDTV) receivers but may want to view HDTV programming in SDTV format. Thus, the operation of decoding the encoded bitstream may include the process of down conversion. Down conversion converts a high definition input picture into a lower resolution picture for display on a lower resolution monitor. Down conversion of high resolution Main Profile, High Level pictures to Main Profile, Main Level pictures, or other lower resolution picture formats, has gained increased importance for reducing implementation costs of HDTV. Down conversion allows replacement of expensive high definition monitors used with Main Profile, High Level encoded pictures with inexpensive existing monitors that have a lower picture resolution to support, for example, Main Profile, Main Level encoded pictures, such as NTSC or PAL.
Processing of video signals in the MPEG-2 standard includes converting the video signals between the spatial domain and the frequency domain using discrete cosine transforms (DCTs) and inverse discrete cosine transforms (IDCTs) during the respective encoding and decoding stages of the process. When the DCT used by an encoder and the IDCT used by a decoder have different implementations, a difference may occur in the reconstructed pixels between the encoder and the decoder. This difference may accumulate and become visible in the decoded picture. This distortion is called IDCT mismatch distortion because the visible distortion in the decoded picture is caused by different DCT/IDCT implementations in the encoder and decoder. IDCT mismatch is a serious problem for high quality coding schemes such as those conforming to the MPEG-1 and MPEG-2 standards. Thus, in order to achieve high coding quality, IDCT mismatch must be controlled.
IDCT mismatch occurs when the result of an IDCT is very close to a half integer (e.g., 1.5), because a slight difference between the encoder and decoder implementations can then produce two different rounded integer values. When the IDCT results are rounded to the nearest integer, one implementation may round up, because its resultant value is only slightly greater than the half integer, while the other implementation rounds down, because its resultant value is only slightly less than the half integer. Accordingly, if a decoder that rounds up processes a signal from an encoder that rounds down, or vice versa, IDCT mismatch errors may occur. When a decoded frame containing errors is used to decode a sequence of predicted frames, the error may become more visible with each predicted frame that is based on the erroneous frame. One approach to controlling IDCT mismatch is oddification, which typically involves setting specific coefficients to an odd value. In this approach, the reconstructed or dequantized DCT data is oddified at the decoder before the IDCT step.
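The half-integer problem and the effect of oddification can be illustrated with a small Python sketch. It rests on assumptions not stated in the text above: the oddification rule shown is the MPEG-2-style rule (if the sum of all coefficients in a block is even, toggle the least significant bit of the last coefficient), and the names `idct_8x8`, `oddify` and `distance_to_half_integer` are hypothetical helpers written for this example.

```python
import math

N = 8

def idct_8x8(F):
    """Floating-point 8x8 inverse DCT, left unrounded so that the
    distance of each pixel from a half integer can be inspected."""
    def c(k):
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            acc = 0.0
            for v in range(N):
                for u in range(N):
                    acc += (c(u) * c(v) * F[v][u]
                            * math.cos((2 * x + 1) * u * math.pi / 16)
                            * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = acc / 4.0
    return out

def oddify(F):
    """Assumed MPEG-2-style rule: if the coefficient sum is even,
    toggle the least significant bit of the last coefficient F[7][7]."""
    out = [row[:] for row in F]
    if sum(sum(row) for row in out) % 2 == 0:
        out[7][7] += -1 if out[7][7] % 2 else 1
    return out

def distance_to_half_integer(value):
    """Distance from value to the nearest number of the form k + 0.5."""
    shifted = value - 0.5
    return abs(shifted - round(shifted))

# A block holding only a DC coefficient of 4 decodes to 0.5 in every pixel,
# the worst case for rounding.
block = [[0] * N for _ in range(N)]
block[0][0] = 4

before = idct_8x8(block)
after = idct_8x8(oddify(block))
worst_before = max(distance_to_half_integer(p) for row in before for p in row)
worst_after = min(distance_to_half_integer(p) for row in after for p in row)
print(worst_before, worst_after)
```

Before oddification every pixel sits within floating-point error of 0.5, so two slightly different IDCT implementations could round it to 0 or to 1. After oddification every pixel is moved away from 0.5 by an amount on the order of 0.01, which is far larger than the typical difference between two IDCT implementations, so both round the same way.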
The present invention provides an apparatus for use in a video decoder which decodes digital video signals that have been encoded into frequency domain coefficient values. The apparatus comprises a mismatch control processor, a frequency domain filter having filter coefficients corresponding to frequency bands, and an inverse frequency domain transform processor. The mismatch control processor is coupled to receive the frequency domain coefficient values and to process the frequency domain coefficient values according to a mismatch control algorithm to produce processed frequency domain coefficient values. The frequency domain filter is coupled to receive the processed frequency domain coefficient values and to provide lowpass filtered frequency domain coefficient values. If down conversion is performed, the frequency domain filter coefficient corresponding to the highest frequency band is set to 1 at least for image blocks that have been modified by the mismatch control processor. The inverse frequency domain transform processor is coupled to the frequency domain filter for transforming the output coefficient values provided by the frequency domain filter into spatial domain picture elements.
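The first two stages of the apparatus described above, mismatch control followed by frequency domain filtering, might be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the text: blocks are 8 × 8, the mismatch control algorithm is the MPEG-2-style parity rule, the "highest frequency band" corresponds to coefficient position [7][7], and the lowpass filter is a hypothetical brick-wall mask that keeps the lowest 4 × 4 coefficients. The function names are invented for this example.

```python
N = 8

def mismatch_control(block):
    """Assumed parity rule: if the coefficient sum is even, toggle the
    least significant bit of block[7][7]. Returns (new_block, modified)."""
    out = [row[:] for row in block]
    modified = False
    if sum(sum(row) for row in out) % 2 == 0:
        out[7][7] += -1 if out[7][7] % 2 else 1
        modified = True
    return out, modified

def lowpass_weights(cutoff=4):
    """Hypothetical brick-wall lowpass mask: weight 1 below the cutoff in
    both directions, weight 0 elsewhere."""
    return [[1.0 if (u < cutoff and v < cutoff) else 0.0 for u in range(N)]
            for v in range(N)]

def filter_block(block, weights, modified):
    """Multiply each coefficient by its filter weight. For blocks changed
    by mismatch control, the highest-frequency weight is forced to 1 so
    that the correction survives the lowpass filter."""
    w = [row[:] for row in weights]
    if modified:
        w[7][7] = 1.0
    return [[block[v][u] * w[v][u] for u in range(N)] for v in range(N)]

# A block with an even coefficient sum is modified by mismatch control.
coeffs = [[0] * N for _ in range(N)]
coeffs[0][0] = 4
coeffs[6][6] = 2
processed, was_modified = mismatch_control(coeffs)
filtered = filter_block(processed, lowpass_weights(), was_modified)
print(was_modified, filtered[0][0], filtered[6][6], filtered[7][7])
```

In this sketch the filtered coefficients would then be passed to the inverse transform stage. Forcing the highest-frequency weight to 1 only for modified blocks leaves the lowpass behavior unchanged for all other blocks.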
It is to be understood that both the foregoing general description and the following detailed description are exemplary, but are not restrictive, of the invention.