A standard for digital video and audio programs for broadcast and for recordings such as video compact disks (VCD) has been established by the Moving Picture Experts Group (MPEG), chartered by the International Organization for Standardization (ISO). The standard for digital video and two-channel stereo audio is known as MPEG-1 and, more formally, as ISO/IEC 11172. An enhanced standard, known colloquially as MPEG-2 and more formally as ISO/IEC 13818, has been established to provide for enhanced quality and to specify data formats for broadcast and other higher noise applications as well as for digital video disks (DVD) and other higher resolution recorded media.
The MPEG video standard specifies a bitstream syntax that typically provides transformation blocks of 8×8 luminance pels (pixels) and corresponding chrominance data using Discrete Cosine Transform (DCT) coding. The DCT coding is performed on the 8×8 pel blocks followed by quantization, zigzag scan, and variable length coding of runs of zero quantized indices and amplitudes of the indices. Motion compensated prediction is employed. For video, MPEG contemplates Intra (I) frames, Predictive (P) frames and Bidirectionally Predictive (B) frames. The I-frames are independently coded and are the least efficiently coded of the three frame types. P-frames are coded more efficiently than are I-frames and are coded relative to the previously coded I- or P-frame. B-frames are coded the most efficiently of the three frame types and are coded relative to both the previous and the next I- or P-frames. The coding order of the frames in an MPEG program is not necessarily the same as the presentation order of the frames. Headers in the bitstream provide information to be used by decoders to properly decode the time and sequence of the frames for the presentation of a moving picture.
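By way of illustration only, the zigzag scan and the grouping of zero runs with the following non-zero amplitudes may be sketched as follows. The function names `zigzag_order` and `run_level_pairs` and the sample coefficient values are hypothetical. The sketch starts from already quantized indices and omits the DCT, the quantizer, the separate handling of the DC coefficient and the actual variable length code tables.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_level_pairs(block):
    """Convert a block of quantized indices to (zero_run, amplitude) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, value))
            run = 0
    return pairs              # trailing zeros are left to an end-of-block code


# Hypothetical quantized block with three non-zero indices.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0] = 12
quantized[0][1] = -3
quantized[2][0] = 5
pairs = run_level_pairs(quantized)
print(pairs)
```

Because most high-frequency indices quantize to zero, the scan tends to place the non-zero values first and leaves one long run of zeros at the end, which is what makes the run and amplitude coding compact.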
The video bitstreams in MPEG systems include a Video Sequence Header containing picture size and aspect ratio data, bit rate limits and other global parameters. Various Sequence Extensions may also be included that contain other information applicable to all pictures of the sequence, including a Progressive Sequence bit which indicates that the sequence contains only Progressive Frame pictures, a Chrominance Format code, original video format (e.g., NTSC, PAL, other) and other variables. Following the Video Sequence Header are coded groups-of-pictures (GOPs). Each GOP usually includes only one I-picture and a variable number of P- and B-pictures. Each GOP also includes a GOP header that contains presentation delay requirements and other data relevant to the entire GOP. Each picture in the GOP includes a Picture Header that contains picture type and display order data and other information relevant to the picture within the picture group.
Each MPEG picture is divided into a plurality of Macroblocks (MBs), not all of which need be transmitted. Each MB is made up of 16×16 luminance pels, or a 2×2 array of four 8×8 transformed blocks of pels. MBs are coded in Slices of consecutive variable length strings of MBs, running left to right across a picture. Slices may begin and end at any intermediate MB position of the picture but must respectively begin or end whenever a left or right margin of the picture is encountered. Each Slice begins with a Slice Header that contains information on the vertical position of the Slice within the picture, information on the quantization scale of the Slice and other information such as that which can be used for fast-forward, fast reverse, resynchronization in the event of transmission error, or other picture presentation purposes.
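The macroblock geometry described above may be illustrated by the following minimal sketch. The names `macroblock_grid` and `split_macroblock` are hypothetical, and the 720×480 picture size is an assumed example rather than a value taken from this description.

```python
MB_SIZE = 16     # luminance pels per macroblock side
BLOCK_SIZE = 8   # pels per transformed block side


def macroblock_grid(width, height):
    """Return (columns, rows, total) of macroblocks covering a picture."""
    cols = (width + MB_SIZE - 1) // MB_SIZE    # round up to whole macroblocks
    rows = (height + MB_SIZE - 1) // MB_SIZE
    return cols, rows, cols * rows


def split_macroblock(mb):
    """Split a 16x16 luminance macroblock into its 2x2 array of 8x8 blocks.

    Blocks are returned in the order top-left, top-right, bottom-left,
    bottom-right.
    """
    blocks = []
    for top in (0, BLOCK_SIZE):
        for left in (0, BLOCK_SIZE):
            blocks.append([row[left:left + BLOCK_SIZE]
                           for row in mb[top:top + BLOCK_SIZE]])
    return blocks


grid = macroblock_grid(720, 480)   # assumed example picture size
print(grid)

# A macroblock whose pel value encodes its own position, for inspection.
mb = [[16 * r + c for c in range(MB_SIZE)] for r in range(MB_SIZE)]
blocks = split_macroblock(mb)
```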
The Macroblock is the basic unit used for MPEG motion compensation. Each MB contains an MB Header, which, for the first MB of a Slice, contains information on the MB's horizontal position relative to the left edge of the picture, and which, for subsequently transmitted MBs of a Slice, contains an address increment. Not all of the consecutive MBs of a Slice need be transmitted with the Slice.
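As a simplified illustration of the addressing just described, the horizontal positions of the transmitted MBs of a Slice may be recovered from the first MB's position and the subsequent address increments. The function names and the sample values below are hypothetical, and the sketch does not reproduce the actual bitstream coding of the increments.

```python
def transmitted_columns(first_column, increments):
    """Return the horizontal MB positions of the transmitted MBs of a Slice.

    first_column: position of the first MB relative to the left picture edge.
    increments:   address increment carried by each subsequent MB header.
    """
    columns = [first_column]
    for inc in increments:
        columns.append(columns[-1] + inc)
    return columns


def skipped_columns(columns):
    """Return the MB positions inside the Slice that were not transmitted."""
    sent = set(columns)
    return [c for c in range(columns[0], columns[-1] + 1) if c not in sent]


cols = transmitted_columns(3, [1, 1, 4, 1])   # hypothetical Slice
print(cols)
print(skipped_columns(cols))
```

An increment greater than one indicates that the intervening MBs were not transmitted.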
The presentation of MPEG video involves the display of video frames at a rate of, for example, twenty-five or thirty frames per second (depending on the national standard used, PAL or NTSC, for example). Thirty frames per second corresponds to presentation time intervals of approximately 33 milliseconds, and twenty-five frames per second to intervals of 40 milliseconds. The capacity of MPEG signals to carry information is achieved in part by exploiting the concept that there is typically a high degree of correlation between adjacent pictures and by exploiting temporal redundancies in the coding of the signals. Where two consecutive video frames of a program are nearly identical, for example, the communication of the consecutive frames may require only the transmission of one I-picture, or Reference Picture, along with the transmission of a P-picture containing only the information that differs from the I-picture, together with the information needed by the decoder at the receiver to reconstruct the P-picture from the previous I-picture. This means that the decoder must have provision for storage of the Reference Picture data.
Information contained in a P-picture transmission includes blocks of video data not contained in a Reference I- or P-picture, as well as information needed to copy data into the current picture from a previously transmitted I- or P-picture. The technique used in MPEG systems to accomplish P-picture construction from a Reference Picture is the technique of Forward Prediction, in which a Prediction in the form of a Prediction Motion Vector (MV) is transmitted in lieu of the video data of a given, or Target, MB. The MV tells the decoder where and how to extract a 16×16 block of pixel data from the I- or P-Reference Picture to be reproduced as the Target MB. If needed, a Prediction Error is transmitted in the form of an error block that contains pixel data needed to supplement the copied motion compensated data in order to complete the current picture.
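Forward Prediction as described above may be sketched, under simplifying assumptions, as a copy of a displaced block from the Reference Picture followed by the optional addition of the error block. The function name `predict_block`, the whole-pel motion vector, the 8-bit clamping range and the small sample picture are assumptions made for illustration; a 16×16 block would be used for an actual MB.

```python
def predict_block(reference, top, left, mv, size=16, error=None):
    """Reconstruct a Target block from a Reference Picture.

    reference: picture as a list of rows of 8-bit sample values.
    top, left: position of the Target block in the current picture.
    mv:        (dy, dx) displacement in whole pels (assumed for simplicity).
    error:     optional Prediction Error block of signed corrections.
    """
    dy, dx = mv
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            value = reference[top + dy + r][left + dx + c]   # copied sample
            if error is not None:
                value += error[r][c]                         # add correction
            row.append(max(0, min(255, value)))              # keep 8-bit range
        block.append(row)
    return block


# Hypothetical 8x8 reference picture; a 2x2 block keeps the example readable.
reference = [[10 * r + c for c in range(8)] for r in range(8)]
copied = predict_block(reference, top=2, left=2, mv=(1, -1), size=2)
corrected = predict_block(reference, top=2, left=2, mv=(1, -1), size=2,
                          error=[[1, 0], [0, -1]])
print(copied)
print(corrected)
```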
With B-pictures, the Bidirectional Temporal Prediction technique called Motion Compensated Interpolation is used. Motion Compensated Interpolation is accomplished by transmitting, in lieu of all of the video data for a Target MB, an MV that specifies which 16×16 block of pixels to copy either from the previous Reference Picture or from the next future Reference Picture, or from the average of one 16×16 block of pixels from each of the previous and next future Reference Pictures. By "previous" reference picture is meant a reference I- or P-picture that has already been displayed. By "future" reference picture is meant a reference P-picture that has yet to be displayed, but has been received before the current picture to permit the copying of data from it. With the motion vector, an Error Block of only the data, if any, that cannot be supplied by copying from the reference pictures is transmitted in pixel data form.
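The three prediction choices available for a B-picture Target MB may be illustrated by the following sketch, which assumes that the two candidate blocks have already been fetched from the previous and future Reference Pictures. The function name `interpolate_block`, the mode strings and the rounding of half values upward in the average are assumptions made for illustration.

```python
def interpolate_block(previous_block, future_block, mode):
    """Form a B-picture prediction from already fetched reference blocks.

    mode: "forward"  copies the block from the previous Reference Picture,
          "backward" copies the block from the future Reference Picture,
          "average"  averages the two blocks sample by sample.
    """
    if mode == "forward":
        return [list(row) for row in previous_block]
    if mode == "backward":
        return [list(row) for row in future_block]
    if mode == "average":
        # Integer average; the +1 rounds half values upward (assumed rounding).
        return [[(p + f + 1) // 2 for p, f in zip(prow, frow)]
                for prow, frow in zip(previous_block, future_block)]
    raise ValueError("unknown prediction mode: " + mode)


previous_block = [[10, 20], [30, 40]]   # hypothetical 2x2 sample blocks
future_block = [[20, 21], [30, 45]]
averaged = interpolate_block(previous_block, future_block, "average")
print(averaged)
```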
Motion compensation vectors in current MPEG P- and B-pictures specify relocation of pixel data to the nearest half pel. This requires that the MPEG decoders perform a half-pel interpolation of luminance and chrominance values from adjacent pixel data in a 16×16 sized block copied from the reference picture in order to arrive at the luminance and chrominance values for the pixels of the macroblock in the current picture. Typical MPEG video decoders carry out this half-pel interpolation upon the performance of the motion compensation as the current picture is being written to the output buffer. With standard resolution systems, the output macroblocks will have the same number of pixels as the reference macroblocks, so that after the half-pel interpolation, the original copied pixel values will be discarded. The resolution of the resulting current picture typically approaches that of the reference picture, although the result may be a slightly degraded reproduction of the original picture. The addition of half-pel interpolation to motion compensation of video programs enhances the quality of the output when presented in the original resolution.
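A minimal sketch of half-pel interpolation follows, assuming that motion vectors are expressed in half-pel units and that half values are rounded upward. The function names and the sample picture are hypothetical, and the caller is assumed to keep every referenced sample inside the reference picture.

```python
def half_pel_sample(reference, y2, x2):
    """Sample the reference at (y2 / 2, x2 / 2), coordinates in half pels."""
    y, x = y2 // 2, x2 // 2      # whole-pel part of the position
    hy, hx = y2 % 2, x2 % 2      # 1 when the position falls on a half pel
    a = reference[y][x]
    b = reference[y][x + hx]
    c = reference[y + hy][x]
    d = reference[y + hy][x + hx]
    # With hy == hx == 0 this returns a; with one flag set it reduces to the
    # rounded average of two neighbors; with both set, of four neighbors.
    return (a + b + c + d + 2) // 4


def half_pel_block(reference, top, left, mv_half, size=16):
    """Fetch a size x size prediction block displaced by a half-pel vector."""
    dy2, dx2 = mv_half
    return [[half_pel_sample(reference, 2 * (top + r) + dy2,
                             2 * (left + c) + dx2)
             for c in range(size)]
            for r in range(size)]


reference = [[10, 20, 30],
             [30, 40, 50],
             [50, 60, 70]]       # hypothetical 3x3 reference picture
block = half_pel_block(reference, top=0, left=0, mv_half=(1, 1), size=2)
print(block)
```

The output block has the same number of samples as a block copied at whole-pel positions; the interpolated values replace the copied values rather than being added to them.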
Many programs, broadcast and recorded, are or will be of standard DVD resolution. As HDTV systems are developed and deployed, there will be a substantial period of time during which such HDTV systems will be used to present programs transmitted or recorded in DVD resolution. Such presentations must multiply, typically double, the number of lines of the output pictures and multiply the number of pixels per line in order to fill the high resolution display. For example, the increasing of the resolution may involve the duplication of pixel data of the video program to enlarge the 8×8 pixel video blocks to 16×16 pixel blocks that are sent to the display, sending four copies of each pixel in a 2×2 block to the display. With such a technique, the output picture would be presented with, for example, four times as many pixels as the original data, but its resolution would not generally be improved.
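The pixel duplication described above may be sketched as follows. The function name `replicate_2x` and the sample values are hypothetical.

```python
def replicate_2x(picture):
    """Double the lines and the pixels per line by duplicating each pixel."""
    enlarged = []
    for row in picture:
        wide = [pixel for pixel in row for _ in range(2)]   # double the width
        enlarged.append(wide)
        enlarged.append(list(wide))                         # double the lines
    return enlarged


small = [[1, 2],
         [3, 4]]                 # hypothetical 2x2 picture
large = replicate_2x(small)
print(large)
```

Each source pixel appears as a 2×2 group of identical values, so the enlarged picture contains four times as many pixels but no picture detail beyond that of the source.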
There is a need in cases where video presentation systems have resolution capabilities greater than the resolution of incoming programs, including cases where half-pel interpolation is employed, to improve the resolution of the presented program to greater than that in which the program was encoded.