The present invention relates generally to image decompression using 3:2 pull-down, and more particularly to decoding and display of B-pictures using 3:2 pull-down.
Multimedia applications of digital processing, storage, and communication systems have been advancing at a rapid pace, as can be seen from the evolution of the standards for representing compressed video.
The MPEG-1 (Moving Picture Experts Group 1) International Standard (ISO/IEC 11172), approved November 1992, was developed mainly for continuous transfer rates of about 1.5 Mbit/s (megabits per second), although it has a large degree of flexibility. As such, it offers good results at spatial resolutions of about 350 pels (picture elements, i.e. samples of image data) horizontally by about 250 pels vertically when the picture rate is about 24 to 30 progressive (non-interlaced) pictures per second (pictures/s).
The MPEG-2 Draft International Standard (ISO/IEC 13818-2), adopted March 1994, provides for enhanced capabilities, and its main profile at main level (MP@ML) syntax subset with constrained parameters is specified for digital TV at International Radio Consultative Committee (CCIR) 601 resolution: interlaced at 720×480 @ 30 frames/s for NTSC (National Television System Committee) and 720×576 @ 25 frames/s for PAL/SECAM (phase alternation line / sequential color with memory). More details of the digital television parameters can be found in CCIR Recommendation 601-2, "Encoding Parameters of Digital Television for Studios" (1982, 1986, 1990).
Images displayed on a TV screen are a sequence of frames. Each frame is divided into two fields: the top field consists of all odd lines and the bottom field consists of all even lines. The two fields are displayed interlaced at intervals equal to one-half of a frame period. For NTSC the frame rate is approximately 30 Hz and for PAL/SECAM it is 25 Hz.
Source material, such as a movie, that is progressive with 30 pictures/s may be converted to NTSC by separating each picture (frame) into its two fields, which are then displayed interlaced.
Many movies are progressive with 24 pictures/s. Such movies are converted to NTSC using 3:2 pull-down as follows. Each picture (frame) is separated into its two fields. Half of the pictures are displayed for two fields and the other half (every other picture) are displayed for three fields, with top and bottom fields always alternating. For example, if a movie consists of pictures P0, P1, P2 and P3, each picture would be separated into its top and bottom fields: T0, B0, T1, B1, T2, B2, T3 and B3. Pictures P0 and P2 would be displayed for two fields, and pictures P1 and P3 would be displayed for three fields, as follows: T0, B0, T1, B1, T1, B2, T2, B3, T3, B3. Note that the bottom field is displayed before the top field in picture P2 (and likewise in picture P3). Repeating this procedure, 24 pictures are displayed every second: 12 pictures for two fields each and 12 for three fields each, for a total of 60 fields per second (fields/s), which is the NTSC field rate.
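The field ordering described above can be illustrated with a minimal sketch. The function name `pulldown_32` and the string labels are illustrative assumptions only. The sketch applies the stated rule: even-numbered pictures occupy two fields, odd-numbered pictures occupy three, and field parity alternates throughout.

```python
def pulldown_32(num_pictures):
    """Return the displayed field sequence for classic 3:2 pull-down.

    Even-numbered pictures are shown for two fields and odd-numbered
    pictures for three, while top (T) and bottom (B) fields alternate.
    """
    fields = []
    parity = "T"  # parity of the next field to be displayed
    for n in range(num_pictures):
        count = 2 if n % 2 == 0 else 3
        for _ in range(count):
            fields.append(f"{parity}{n}")
            parity = "B" if parity == "T" else "T"
    return fields


four_pictures = pulldown_32(4)
print(", ".join(four_pictures))
# T0, B0, T1, B1, T1, B2, T2, B3, T3, B3

one_second = pulldown_32(24)
print(len(one_second))
# 60
```

Because parity is tracked across pictures rather than reset for each one, a picture that follows a three-field picture begins with the opposite parity, which is why P2 starts with its bottom field.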
Using a generalized pull-down, for which any picture can be displayed for two or three fields, the ratio between the display frame rate and the picture rate can be any number between 1 and 1.5. Using this generalized pull-down, progressive movies with 20-30 pictures/s can be displayed using NTSC and progressive movies with 16.67-25 pictures/s can be displayed using PAL/SECAM.
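The rate ranges quoted above follow from dividing the display frame rate by 1.5. The sketch below checks that arithmetic and shows one possible way of assigning two or three fields to each picture. The scheduling rule in `fields_per_picture` is an assumption made for illustration; the text does not prescribe how the two-field and three-field pictures are distributed.

```python
import math
from fractions import Fraction


def picture_rate_range(frame_rate):
    """Lowest and highest picture rates displayable at the given frame rate."""
    return frame_rate / 1.5, frame_rate


def fields_per_picture(picture_rate, field_rate, num_pictures):
    """Hypothetical scheduler: spread the display fields evenly over pictures."""
    ratio = Fraction(field_rate) / Fraction(picture_rate)  # fields per picture
    if not 2 <= ratio <= 3:
        raise ValueError("picture rate outside the generalized pull-down range")
    # Picture n covers display fields floor(n*ratio) .. floor((n+1)*ratio) - 1.
    return [math.floor((n + 1) * ratio) - math.floor(n * ratio)
            for n in range(num_pictures)]


print(picture_rate_range(30))          # NTSC: (20.0, 30)
print(fields_per_picture(24, 60, 4))   # [2, 3, 2, 3]
print(fields_per_picture(25, 60, 5))   # [2, 2, 3, 2, 3]
```

With 24 pictures/s on a 60 fields/s display the rule reduces to the ordinary 3:2 pattern; with 25 pictures/s it yields 12 fields for every 5 pictures.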
This pull-down feature is also used in MPEG decoders. MPEG-1 sequences and progressive sequences in MPEG-2 use it to display, on NTSC and PAL/SECAM displays, movies that have been progressively encoded (converted into a bitstream). The MPEG standards define a bitstream syntax and specify how a compliant decoder should process this bitstream to extract the audio and video information. The display of video data is beyond the scope of the MPEG standards, but any practical MPEG application that displays on a television set or on an interlaced monitor needs a display unit that implements the pull-down feature.
The MPEG-1 bitstream represents progressively scanned images and does not specify interlace or 3:2 pull-down. The pull-down feature must be carried out autonomously by the decoder and display unit. The MPEG-2 bitstream may represent both progressive and interlaced sequences. The progressive frames include two flags, top_field_first and repeat_first_field, to direct the way in which 3:2 pull-down is carried out.
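As a minimal sketch of how a display unit might interpret the two flags, the function below assumes the commonly described semantics: top_field_first selects which field is output first, and repeat_first_field causes that first field to be output again after the second field. The function name and the per-frame flag values are hypothetical; they are chosen so that the result reproduces the four-picture 3:2 pull-down example given earlier.

```python
def fields_for_frame(index, top_field_first, repeat_first_field):
    """Return the fields displayed for one progressive frame."""
    first, second = ("T", "B") if top_field_first else ("B", "T")
    fields = [f"{first}{index}", f"{second}{index}"]
    if repeat_first_field:
        fields.append(f"{first}{index}")  # first field shown a second time
    return fields


# Hypothetical (top_field_first, repeat_first_field) values for P0..P3.
flags = [(True, False), (True, True), (False, False), (False, True)]

sequence = []
for i, (tff, rff) in enumerate(flags):
    sequence.extend(fields_for_frame(i, tff, rff))

print(", ".join(sequence))
# T0, B0, T1, B1, T1, B2, T2, B3, T3, B3
```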
Both MPEG-1 and MPEG-2 divide each picture into 16×16 square blocks of pels, called macroblocks, which are encoded one-by-one using a combination of frequency transformation, quantization and entropy coding. Motion compensation and predictive coding are also frequently used to achieve the high compression necessary. Motion compensation may be done using both a past picture and a future picture encoded before its appropriate position in the display sequence. Thus the decoder must be able to store two full pictures. The pictures whose decoding requires bidirectional prediction (i.e. both forward and backward prediction) are called B-pictures. B-pictures are not used to predict other pictures.
In order to provide some immunity to data corruption, the macroblocks of each picture are grouped into slices, which are sequences of successive macroblocks. Each slice may be decoded independently of the information in the previous slices. MPEG-1 does not restrict the distribution of macroblocks into slices. MPEG-2 requires that a new slice begin at the beginning of each row of macroblocks of a picture.
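The consequence of encoding a future picture ahead of its display position can be sketched as a reordering step. The picture sequence below is a hypothetical example, and the function is an illustration rather than a description of any particular decoder: a B-picture is displayed as soon as it is decoded, while each reference (I- or P-) picture is held back until the next reference picture arrives.

```python
def display_order(coded):
    """Reorder pictures from bitstream order into display order.

    coded: list of (picture_type, display_number) tuples in bitstream order.
    """
    out = []
    held = None  # most recently decoded reference picture, not yet displayed
    for pic in coded:
        if pic[0] == "B":
            out.append(pic)       # B-pictures are displayed immediately
        else:
            if held is not None:
                out.append(held)  # the older reference is now due for display
            held = pic
    if held is not None:
        out.append(held)          # flush the last reference picture
    return out


# Hypothetical bitstream order: each P-picture precedes the B-pictures
# that are predicted from it.
coded = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]
shown = display_order(coded)
print(" ".join(f"{t}{n}" for t, n in shown))
# I0 B1 B2 P3 B4 B5 P6
```

While B1 and B2 are being decoded, both I0 (the past reference) and P3 (the future reference) must be available, which is the origin of the two full picture stores.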
The above-mentioned MPEG-2 Main Profile at Main Level subset was defined in such a way that a 16 megabit (Mbit) memory is sufficient to decode the bitstream. Until recently, the organization of the memory was viewed as follows:
720×576×8×1.5 = 4860 Kbits for the forward prediction picture
720×576×8×1.5 = 4860 Kbits for the backward prediction picture
720×576×8×1.5 = 4860 Kbits for the frame being decoded and displayed
1792 Kbits for the video buffer
12 Kbits for the system buffer
4 Kbits for the audio buffer
All these numbers add up to exactly 16 Mbits. No margin was provided.
It has been realized that additional buffering of at least 124 Kbits is needed for the system transport buffers, for demultiplexing and for packet overhead. If the 3:2 pull-down feature is not supported, it has been known heretofore how to decode and display a B-picture without storing it completely in the memory, so the additional 124 Kbits of buffers can be accommodated. For NTSC there are only 480 lines in a frame, so there is available memory for the 124 Kbits while supporting 3:2 pull-down by completely storing B-pictures as they are decoded and displayed. The size of the memory constitutes a problem for PAL/SECAM if the 3:2 pull-down feature is supported. This important feature would be very costly to implement for PAL/SECAM while storing the entire B-picture as has been known heretofore. The cost is due to the fact that standard memories are available only in certain sizes, and increasing the memory beyond 16 Mbits would also add to the complexity and power consumption of the memory controller.
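The difference between the NTSC and PAL/SECAM cases can be checked with a short calculation. The sketch below assumes that 1 Kbit is 1024 bits and that the factor 1.5 in the picture store figures accounts for the chrominance samples; the function and variable names are illustrative.

```python
KBIT = 1024  # bits per Kbit (assumed binary convention)


def frame_store_kbits(width, height, bits_per_sample=8, chroma_factor=1.5):
    """Size of one full picture store in Kbits."""
    return width * height * bits_per_sample * chroma_factor / KBIT


pal = frame_store_kbits(720, 576)   # PAL/SECAM picture store
ntsc = frame_store_kbits(720, 480)  # NTSC picture store
saved = 3 * (pal - ntsc)            # memory freed across three picture stores

print(pal)    # 4860.0
print(ntsc)   # 4050.0
print(saved)  # 2430.0
print(saved >= 124)  # True: NTSC leaves room for the extra 124 Kbits
```

For PAL/SECAM no such slack exists, since the three full 576-line picture stores alone occupy 14580 Kbits of the 16384 Kbits available.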
Accordingly, it is an object of the present invention to reduce the size of the frame buffer required for decoding and displaying a picture using 3:2 pull-down below the size required to hold image information for one full frame.
Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the claims.