The present invention relates to a decoder which converts and formats an encoded high resolution video signal, e.g. MPEG-2 encoded video signals, to a decoded lower resolution output video signal, and more specifically to an up-sampling filter for the decoder.
In the United States, a standard has been proposed for digitally encoded high definition television (HDTV) signals. A portion of this standard is essentially the same as the MPEG-2 standard, proposed by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO). The standard is described in an International Standard (IS) publication entitled "Information Technology - Generic Coding of Moving Pictures and Associated Audio, Recommendation H.262", ISO/IEC 13818-2, IS, 11/94, which is available from the ISO and which is hereby incorporated by reference for its teaching on the MPEG-2 digital video coding standard.
The MPEG-2 standard is actually several different standards. In MPEG-2, several different profiles are defined, each corresponding to a different level of complexity of the encoded image. For each profile, different levels are defined, each level corresponding to a different image resolution. One of the MPEG-2 standards, known as Main Profile, Main Level, is intended for coding video signals conforming to existing television standards (i.e., NTSC and PAL). Another standard, known as Main Profile, High Level, is intended for coding high-definition television images.
Images encoded according to the Main Profile, High Level standard may have as many as 1,152 active lines per image frame and 1,920 pixels per line.
The Main Profile, Main Level standard, on the other hand, defines a maximum picture size of 720 pixels per line and 576 lines per frame. At a frame rate of 30 frames per second, signals encoded according to this standard have a data rate of 720 * 576 * 30 or 12,441,600 pixels per second. By contrast, images encoded according to the Main Profile, High Level standard have a maximum data rate of 1,152 * 1,920 * 30 or 66,355,200 pixels per second. This data rate is more than five times the data rate of image data encoded according to the Main Profile, Main Level standard. The standard proposed for HDTV encoding in the United States is a subset of this standard, having as many as 1,080 lines per frame, 1,920 pixels per line and a maximum frame rate, for this frame size, of 30 frames per second. The maximum data rate for this proposed standard is still far greater than the maximum data rate for the Main Profile, Main Level standard.
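The data rate comparison above can be checked with a short calculation. The following Python sketch is illustrative only; the function name pixel_rate and the constant names are hypothetical, and the picture sizes and the 30 frames per second rate are the ones quoted in the preceding paragraphs.

```python
def pixel_rate(pixels_per_line, lines_per_frame, frames_per_second):
    """Return the pixel rate, in pixels per second, of one video format."""
    return pixels_per_line * lines_per_frame * frames_per_second


# Maximum picture sizes quoted in the text, all at 30 frames per second.
MP_ML_RATE = pixel_rate(720, 576, 30)      # Main Profile, Main Level
MP_HL_RATE = pixel_rate(1920, 1152, 30)    # Main Profile, High Level
US_HDTV_RATE = pixel_rate(1920, 1080, 30)  # proposed U.S. HDTV subset

if __name__ == "__main__":
    print("MP@ML  :", MP_ML_RATE)
    print("MP@HL  :", MP_HL_RATE, "ratio", MP_HL_RATE / MP_ML_RATE)
    print("US HDTV:", US_HDTV_RATE, "ratio", US_HDTV_RATE / MP_ML_RATE)
```

Under these figures, the 1,080 line by 1,920 pixel format works out to 62,208,000 pixels per second, which is five times the Main Profile, Main Level rate.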
The MPEG-2 standard defines a complex syntax which contains a mixture of data and control information. Some of this control information is used to enable signals having several different formats to be covered by the standard. These formats define images having differing numbers of picture elements (pixels) per line, differing numbers of lines per frame or field, and differing numbers of frames or fields per second. In addition, the basic syntax of the MPEG-2 Main Profile defines the compressed MPEG-2 bit stream representing a sequence of images in five layers: the sequence layer, the group of pictures layer, the picture layer, the slice layer and the macroblock layer. Each of these layers is introduced with control information. Finally, other control information, also known as side information (e.g. frame type, macroblock pattern, image motion vectors, coefficient zig-zag patterns and dequantization information), is interspersed throughout the coded bit stream.
A down conversion system converts a high definition input picture into a lower resolution picture for display on a lower resolution monitor. Down conversion of high resolution Main Profile, High Level pictures to Main Profile, Main Level pictures, or other lower resolution picture formats, has gained increased importance for reducing implementation costs of HDTV. Down conversion allows replacement of expensive high definition monitors used with Main Profile, High Level encoded pictures with inexpensive existing monitors which have a lower picture resolution to support, for example, Main Profile, Main Level encoded pictures, such as NTSC or 525 progressive monitors.
To effectively receive the digital images, a decoder should process the video signal information rapidly. To be optimally effective, the decoding system should be relatively inexpensive and yet have sufficient power to decode these digital signals in real time.
One prior art method of down conversion simply low pass filters and decimates the decoded high resolution, Main Profile, High Level picture to form an image suitable for display on a conventional television receiver. Consequently, using existing techniques, a decoder employing down conversion may be implemented using a single processor having a complex design, considerable memory, and operating on the spatial domain image at a high data rate to perform this function. The high resolution and high data rate, however, require very expensive circuitry, which would be contrary to the implementation of a decoder in a consumer television receiver in which cost is a major factor.
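The filter-and-decimate approach described above can be illustrated on a single line of pixels. The following Python sketch is a minimal example under stated assumptions: the text does not specify the low pass filter, so a simple moving average is used here as a stand-in, and the function name low_pass_and_decimate and its parameters are hypothetical.

```python
def low_pass_and_decimate(line, factor):
    """Smooth one line of pixels, then keep every factor-th sample.

    Assumption: a moving average spanning (2 * (factor // 2) + 1) samples
    stands in for the low pass filter, which the text does not specify.
    """
    half = factor // 2
    n = len(line)
    filtered = []
    for i in range(n):
        # Shrink the averaging window at the ends of the line.
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        window = line[lo:hi]
        filtered.append(sum(window) / len(window))
    # Decimation: retain samples 0, factor, 2 * factor, ...
    return filtered[::factor]


if __name__ == "__main__":
    # A single bright pixel is spread over its neighbours before decimation.
    print(low_pass_and_decimate([0, 0, 0, 0, 90, 0, 0, 0], 2))
```

Note that every pixel of the full resolution picture is filtered before any pixel is discarded, which is why this approach operates at the high data rate discussed above.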
The present invention is embodied in an apparatus for forming a set of low resolution down-sampled pixel values corresponding to a current frame of a video signal from a set of low resolution pixel values corresponding to a residual image of the current frame of the video signal and from a set of down-sampled low resolution pixel values corresponding to a reference frame of the video signal. The apparatus includes a memory means for storing the set of down-sampled low resolution pixel values corresponding to the reference frame of the video signal. An up-sampling means receives the set of down-sampled low resolution pixel values from the memory means and uses Lagrangian interpolation to convert it into a set of up-sampled low resolution pixel values corresponding to the reference frame of the video signal. A summing means adds the set of low resolution pixel values corresponding to the residual image of the current frame of the video signal to the set of up-sampled low resolution pixel values corresponding to the reference frame of the video signal to form a set of low resolution pixel values corresponding to the current frame of the video signal. A decimating means deletes selected ones of the set of low resolution pixel values corresponding to the current frame of the video signal to generate the set of low resolution down-sampled pixel values corresponding to the current frame of the video signal.
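The signal flow through the memory, up-sampling, summing and decimating means can be sketched in one dimension as follows. This Python example is illustrative only and is not the disclosed implementation: the four-tap (cubic) interpolator, the up-sampling factor, the sample alignment and all function names are assumptions, and any motion-compensated addressing of the memory is omitted.

```python
def lagrange_weights(xs, x):
    """Return the Lagrange basis weights of the nodes xs evaluated at x."""
    weights = []
    for j, xj in enumerate(xs):
        w = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                w *= (x - xm) / (xj - xm)
        weights.append(w)
    return weights


def upsample(samples, factor, taps=4):
    """Up-sample a line by Lagrangian interpolation (taps is an assumption)."""
    n = len(samples)
    if n < taps:
        raise ValueError("need at least as many samples as taps")
    out = []
    for i in range(n * factor):
        x = i / factor  # output position measured in input-sample units
        # Centre the window on x, clamped so it stays inside the line.
        # The last factor - 1 outputs lie past the final sample and are
        # extrapolated from the same polynomial.
        start = max(0, min(int(x) - (taps // 2 - 1), n - taps))
        xs = list(range(start, start + taps))
        w = lagrange_weights(xs, x)
        out.append(sum(wj * samples[k] for wj, k in zip(w, xs)))
    return out


def reconstruct_line(reference_store, residual, factor):
    """Up-sample the stored reference, add the residual, then decimate."""
    predicted = upsample(reference_store, factor)          # up-sampling means
    current = [p + r for p, r in zip(predicted, residual)]  # summing means
    return current[::factor]                               # decimating means


if __name__ == "__main__":
    stored = [10, 20, 30, 40, 50, 60]     # down-sampled reference line
    residual = [1.0] * (len(stored) * 2)  # residual at up-sampled resolution
    print(reconstruct_line(stored, residual, 2))
```

In this sketch the interpolated values at the original sample positions equal the stored values, so decimating by the same factor yields the stored reference plus the corresponding residual samples.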