In the United States, the Advanced Television Systems Committee (ATSC) standard defines digital encoding of high definition television (HDTV) signals. A portion of this standard is essentially the same as the MPEG-2 standard, proposed by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO). The standard is described in an International Standard (IS) publication entitled, "Information Technology--Generic Coding of Moving Pictures and Associated Audio, Recommendation H.262", ISO/IEC 13818-2, IS, 11/94, which is available from the ISO and which is hereby incorporated by reference for its teaching on the MPEG-2 digital video coding standard.
The MPEG-2 standard is actually several different standards. In MPEG-2, several different profiles are defined, each corresponding to a different level of complexity of the encoded image. For each profile, different levels are defined, each level corresponding to a different image resolution. One of the MPEG-2 standards, known as Main Profile, Main Level, is intended for coding video signals conforming to existing television standards (e.g., NTSC and PAL). Another standard, known as Main Profile, High Level, is intended for coding high-definition television images. Images encoded according to the Main Profile, High Level standard may have as many as 1,152 active lines per image frame and 1,920 pixels per line.
The Main Profile, Main Level standard, on the other hand, defines a maximum picture size of 720 pixels per line and 567 lines per frame. At a frame rate of 30 frames per second, signals encoded according to this standard have a data rate of 720*567*30 or 12,247,200 pixels per second. By contrast, images encoded according to the Main Profile, High Level standard have a maximum data rate of 1,152*1,920*30 or 66,355,200 pixels per second. This data rate is more than five times the data rate of image data encoded according to the Main Profile, Main Level standard. The standard for HDTV encoding in the United States is a subset of the Main Profile, High Level standard, having as many as 1,080 lines per frame, 1,920 pixels per line and a maximum frame rate, for this frame size, of 30 frames per second. The maximum data rate for this subset is still far greater than the maximum data rate for the Main Profile, Main Level standard.
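The data rate figures above follow directly from multiplying pixels per line, lines per frame and frames per second. The following Python sketch reproduces that arithmetic; the function and variable names are illustrative only and are not taken from any standard.

```python
def pixel_rate(pixels_per_line: int, lines_per_frame: int, frames_per_second: int) -> int:
    """Return the number of pixels per second for a given picture format."""
    return pixels_per_line * lines_per_frame * frames_per_second


# Picture sizes as stated in the text above, all at 30 frames per second.
main_level_rate = pixel_rate(720, 567, 30)
high_level_rate = pixel_rate(1920, 1152, 30)
us_hdtv_rate = pixel_rate(1920, 1080, 30)

# Ratio of the High Level rate to the Main Level rate.
high_to_main_ratio = high_level_rate / main_level_rate

print(main_level_rate, high_level_rate, us_hdtv_rate)
```

Under these figures, both the full Main Profile, High Level format and the 1,080-line subset exceed five times the Main Profile, Main Level pixel rate.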
The MPEG-2 standard defines a complex syntax which contains a mixture of data and control information. Some of this control information is used to enable signals having several different formats to be covered by the standard. These formats define images having differing numbers of picture elements (pixels) per line, differing numbers of lines per frame or field and differing numbers of frames or fields per second. In addition, the basic syntax of the MPEG-2 Main Profile defines the compressed MPEG-2 bit stream representing a sequence of images in five layers: the sequence layer, the group of pictures layer, the picture layer, the slice layer, and the macroblock layer. Each of these layers is introduced with control information. Finally, other control information, also known as side information (e.g., frame type, macroblock pattern, image motion vectors, coefficient zig-zag patterns and dequantization information), is interspersed throughout the coded bit stream.
Format conversion of encoded high resolution Main Profile, High Level pictures to lower resolution Main Profile, High Level pictures; Main Profile, Main Level pictures; or other lower resolution picture formats has gained increased importance for a) providing a single decoder for use with multiple existing video formats, b) providing an interface between Main Profile, High Level signals and personal computer monitors or existing consumer television receivers, and c) reducing implementation costs of HDTV. For example, conversion allows expensive high definition monitors used with Main Profile, High Level encoded pictures to be replaced with inexpensive existing monitors which have a lower picture resolution and support, for example, Main Profile, Main Level encoded pictures, such as NTSC or 525 progressive monitors. One aspect, down conversion, converts a high definition input picture into a lower resolution picture for display on the lower resolution monitor.
To effectively receive the digital images, a decoder should process the video signal information rapidly. To be optimally effective, the decoding systems should be relatively inexpensive and yet have sufficient power to decode these digital signals in real time. Consequently, a decoder which supports conversion into multiple low resolution formats must minimize processor memory.
The MPEG-2 Main Profile standard defines a sequence of images in five levels: the sequence level, the group of pictures level, the picture level, the slice level, and the macroblock level. Each of these levels may be considered to be a record in a data stream, with the later-listed levels occurring as nested sub-levels in the earlier listed levels. The records for each level include a header section which contains data that is used in decoding its sub-records.
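The nesting of records described above may be pictured as a tree of containers, each carrying a header used when decoding its sub-records. The following Python sketch is an illustration only: the class names, the `header` dictionaries and the helper `count_macroblocks` are hypothetical and do not correspond to syntax elements of the MPEG-2 standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Macroblock:
    header: Dict[str, int] = field(default_factory=dict)


@dataclass
class Slice:
    header: Dict[str, int] = field(default_factory=dict)
    macroblocks: List[Macroblock] = field(default_factory=list)


@dataclass
class Picture:
    header: Dict[str, int] = field(default_factory=dict)
    slices: List[Slice] = field(default_factory=list)


@dataclass
class GroupOfPictures:
    header: Dict[str, int] = field(default_factory=dict)
    pictures: List[Picture] = field(default_factory=list)


@dataclass
class SequenceRecord:
    header: Dict[str, int] = field(default_factory=dict)
    groups: List[GroupOfPictures] = field(default_factory=list)


def count_macroblocks(sequence: SequenceRecord) -> int:
    """Walk the four outer levels and count the macroblock records they contain."""
    return sum(
        len(slice_.macroblocks)
        for group in sequence.groups
        for picture in group.pictures
        for slice_ in picture.slices
    )
```

A decoder built on such a structure would read each header before descending into the records nested beneath it.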
Each macroblock of the encoded HDTV signal contains six blocks and each block contains data representing 64 respective coefficient values of a discrete cosine transform (DCT) representation of 64 picture elements (pixels) in the HDTV image.
In the encoding process, the pixel data may be subject to motion compensated differential coding prior to the discrete cosine transformation and the blocks of transformed coefficients are further encoded by applying run-length and variable length encoding techniques. A decoder which recovers the image sequence from the data stream reverses the encoding process. This decoder employs an entropy decoder (e.g. a variable length decoder), an inverse discrete cosine transform processor, a motion compensation processor, and an interpolation filter.
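As an illustration of the run-length stage that the decoder reverses, the following Python sketch expands (run, level) pairs into 64 coefficients and places them in an 8 by 8 block following a zig-zag scan. It is a simplified sketch under stated assumptions: the variable length code tables and end-of-block signalling are omitted, only the conventional zig-zag scan is shown, and all function names are hypothetical.

```python
from typing import List, Tuple


def zigzag_order(n: int = 8) -> List[Tuple[int, int]]:
    """Return (row, column) positions of an n x n block in zig-zag scan order."""
    order = []
    for diagonal in range(2 * n - 1):
        rows = range(max(0, diagonal - n + 1), min(diagonal, n - 1) + 1)
        # Even diagonals are walked from bottom-left to top-right.
        if diagonal % 2 == 0:
            rows = reversed(rows)
        for row in rows:
            order.append((row, diagonal - row))
    return order


def run_length_decode(pairs: List[Tuple[int, int]], size: int = 64) -> List[int]:
    """Expand (run of zeros, non-zero level) pairs into a flat coefficient list."""
    coefficients = [0] * size
    position = 0
    for run, level in pairs:
        position += run  # skip the zero-valued coefficients
        if position >= size:
            raise ValueError("run-length data overruns the block")
        coefficients[position] = level
        position += 1
    return coefficients


def to_block(coefficients: List[int], n: int = 8) -> List[List[int]]:
    """Place scan-ordered coefficients at their (row, column) positions."""
    block = [[0] * n for _ in range(n)]
    for value, (row, column) in zip(coefficients, zigzag_order(n)):
        block[row][column] = value
    return block


example_block = to_block(run_length_decode([(0, 12), (1, -3), (2, 5)]))
```

In this example the three non-zero levels land at scan positions 0, 2 and 5, and all remaining coefficients of the block stay zero.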
FIG. 1 is a high level block diagram of a typical video decoding system of the prior art which processes an MPEG-2 encoded picture. The general methods used to decode an MPEG-2 encoded picture, without subsequent processing, downconversion or format conversion, are specified by the MPEG-2 standard. The video decoding system includes an entropy decoder (ED) 110, which may include variable length decoder (VLD) 210 and run length decoder 212. The system also includes an inverse quantizer 214, and inverse discrete cosine transform (IDCT) processor 218. A controller 207 controls the various components of the decoding system responsive to the control information retrieved from the input bit stream by the ED 110. For processing of prediction images, the system further includes a memory 199 having reference frame memory 222, summing network 230, and motion compensation processor 206a which may have a motion vector processor 221 and half-pixel generator 228.
The ED 110 receives the encoded video image signal, and reverses the encoding process to produce macroblocks of quantized frequency-domain (DCT) coefficient values and control information including motion vectors describing the relative displacement of a matching macroblock in a previously decoded image which corresponds to a macroblock of the predicted picture that is currently being decoded. The inverse quantizer 214 receives the quantized DCT coefficients and reconstructs the DCT coefficients for a particular macroblock. The quantization matrix to be used for a particular block is received from the ED 110.
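The role of the inverse quantizer may be illustrated by the following Python sketch, in which each quantized level is scaled by the corresponding quantization matrix weight and a scale factor. This is a deliberately simplified sketch, not the normative arithmetic of the MPEG-2 standard: rounding rules, saturation and mismatch control are omitted, and the `divisor` parameter is an assumed normalisation constant chosen for illustration.

```python
from typing import List

Block = List[List[int]]


def inverse_quantize(levels: Block, quant_matrix: Block,
                     quantiser_scale: int, divisor: int = 16) -> Block:
    """Scale each quantized level by its matrix weight and the scale factor."""
    reconstructed = []
    for level_row, weight_row in zip(levels, quant_matrix):
        row = []
        for level, weight in zip(level_row, weight_row):
            product = level * weight * quantiser_scale
            # int() truncates toward zero, so positive and negative
            # levels are treated symmetrically.
            row.append(int(product / divisor))
        reconstructed.append(row)
    return reconstructed


# A 2 x 2 example keeps the numbers easy to follow; real blocks are 8 x 8.
example = inverse_quantize([[2, -1], [0, 3]], [[8, 16], [16, 24]], quantiser_scale=4)
```

Larger matrix weights at high frequencies reproduce the coarser quantization step that the encoder applied to those coefficients.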
The IDCT processor 218 transforms the reconstructed DCT coefficients to pixel values in the spatial domain (for each block of 8×8 matrix values representing luminance or chrominance components of the macroblock, and for each block of 8×8 matrix values representing the differential luminance or differential chrominance components of the predicted macroblock).
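The transformation performed by an IDCT processor may be illustrated by a direct evaluation of the two-dimensional 8 by 8 inverse DCT formula. The Python sketch below is written for clarity rather than speed; practical decoders use fast factorizations, and the function name `idct_8x8` is hypothetical.

```python
import math
from typing import List


def idct_8x8(coefficients: List[List[float]]) -> List[List[float]]:
    """Directly evaluate the 2-D inverse DCT of an 8 x 8 coefficient block."""
    n = 8

    def scale(k: int) -> float:
        # The zero-frequency basis function carries a factor of 1/sqrt(2).
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    pixels = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for u in range(n):
                for v in range(n):
                    total += (scale(u) * scale(v) * coefficients[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            pixels[x][y] = total / 4.0
    return pixels


# A block holding only a DC coefficient yields a flat block of pixels.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 8.0
flat_block = idct_8x8(dc_only)
```

With this normalisation, a DC coefficient of 8 produces a pixel value of 1 at every position, since the DC term contributes one eighth of its value to each pixel.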
If the current macroblock is not predictively encoded, then the output matrix values provided by the IDCT processor 218 are the pixel values of the corresponding macroblock of the current video image. If the macroblock is interframe encoded, the corresponding macroblock of the previous video picture frame is stored in memory 199 for use by the motion compensation processor 206. The motion compensation processor 206 receives a previously decoded macroblock from memory 199 responsive to the motion vector, and then adds the previous macroblock to the current IDCT macroblock (corresponding to a residual component of the present predictively encoded frame) in summing network 230 to produce the corresponding macroblock of pixels for the current video image, which is then stored into the reference frame memory 222 of memory 199.
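The summation of a motion compensated prediction and an IDCT residual may be sketched in Python as follows. The sketch assumes integer-pixel motion vectors only (half-pixel interpolation is omitted), 8-bit pixel values clamped to the range 0 to 255, and frames held as lists of rows; the function names and the (vertical, horizontal) ordering of the motion vector are illustrative assumptions.

```python
from typing import List, Tuple

Frame = List[List[int]]


def fetch_block(reference: Frame, top: int, left: int, size: int) -> Frame:
    """Copy a size x size block from the reference frame."""
    if top < 0 or left < 0 or top + size > len(reference) or left + size > len(reference[0]):
        raise ValueError("motion vector points outside the reference frame")
    return [row[left:left + size] for row in reference[top:top + size]]


def reconstruct_block(reference: Frame, residual: Frame, top: int, left: int,
                      motion_vector: Tuple[int, int]) -> Frame:
    """Add the residual to the block the motion vector selects in the reference."""
    dy, dx = motion_vector
    prediction = fetch_block(reference, top + dy, left + dx, len(residual))
    return [
        [max(0, min(255, predicted + difference))
         for predicted, difference in zip(prediction_row, residual_row)]
        for prediction_row, residual_row in zip(prediction, residual)
    ]


# A 4 x 4 reference frame and a 2 x 2 residual keep the example small.
reference_frame = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]
residual_block = [[1, -1], [300, -300]]
decoded_block = reconstruct_block(reference_frame, residual_block, 0, 0, (1, 2))
```

The reconstructed block would then be written back to the frame store so that it can serve as a reference for later predicted pictures.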