A movie, or motion picture, is a sequence of individual, still pictures (images). A plurality of images (pictures) are presented, one after the other, each differing slightly from the one before, to create the perception of motion. In both film (celluloid) and video (electronic signals), individual images are often referred to as “frames”. In the context of film, there are typically 24 frames per second (fps). In the context of video, there are typically 30 frames per second (fps). The sequence of video frames may be “progressive” or “interlaced”.
A progressive sequence has a single field in each frame. An interlaced sequence comprises two fields per frame—an odd field (“o”) and an even field (“e”)—thus, there are 60 fields (or half-resolution images) per second. (The two fields of an interlaced scan are also sometimes referred to as “top” and “bottom”.) In the main, hereinafter, interlaced video sequences are discussed.
It would be a relatively simple matter to convert a film to video, if the frame rates (24 fps and 30 fps, respectively) were the same. Every film frame (image) would generate one video frame, each having two interlaced fields. However, because the frame rates of film and video are not the same, a system was developed, called “3-2 Pulldown” (or “3:2 pulldown”, or “telecine”) to convert film to video.
The 3-2 pulldown process basically converts every four film frames (“A”, “B”, “C”, “D”) to five video frames (“1”, “2”, “3”, “4”, “5”), each video frame comprising two fields (odd/o, even/e). The resulting five video frames (ten video fields) are related to the original four film frames, as follows:
TABLE 1: The 3:2 pull-down process

4 Film Frames | 10 Fields | 5 Video Frames
A             | Ao        | 1
              | Ae        |
B             | Bo        | 2
              | Be        |
              | Bo′       | 3
C             | Ce        |
              | Co        | 4
D             | De        |
              | Do        | 5
              | De′       |
Two video fields (Ao, Ae) are created for the first film frame (A). Three video fields (Bo, Be, Bo′) are created for the second film frame (B), the second odd field (Bo′—primed) being a copy of the first odd field (Bo—not primed). Two video fields (Ce, Co) are created for the third film frame (C). Three video fields (De, Do, De′) are created for the fourth film frame (D). Thus, the sequence of fields is Ao, Ae, Bo, Be, Bo′, Ce, Co, De, Do, De′, where “o” is odd, and “e” is even. The primed fields (second Bo, second De) are inserted/repeat fields. Note that the fields (odd/o, even/e) always alternate, and that the A-frame is the only frame in the sequence where a film frame (A) is completely reproduced on one and only one complete video frame (1).
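By way of illustration only, the field sequence described above can be reproduced with a short sketch. The function names pulldown_32 and pair_into_video_frames are hypothetical and do not appear in any standard; fields are represented by labels rather than image data, and an ASCII apostrophe stands in for the prime mark.

```python
from typing import List, Tuple


def pulldown_32(film_frames: List[str]) -> List[str]:
    """Expand film frame labels into a 2-3-2-3 sequence of field labels."""
    fields: List[str] = []
    parity = "o"  # the sequence starts on an odd field, as in Table 1
    for index, frame in enumerate(film_frames):
        count = 2 if index % 2 == 0 else 3  # two fields, then three, repeating
        for n in range(count):
            # The third field of a three-field frame repeats the first one.
            prime = "'" if n == 2 else ""
            fields.append(f"{frame}{parity}{prime}")
            parity = "e" if parity == "o" else "o"  # odd and even always alternate
    return fields


def pair_into_video_frames(fields: List[str]) -> List[Tuple[str, str]]:
    """Group consecutive fields two at a time into video frames."""
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


fields = pulldown_32(["A", "B", "C", "D"])
video_frames = pair_into_video_frames(fields)
```

In this sketch, four film frames yield ten fields and five video frames, and video frames 3 and 4 each mix fields from two different film frames, which is consistent with Table 1.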
The 3-2 pulldown process originally started with three fields (3-2-3-2). It was soon discovered that starting with two fields (as in Table 1 above) produced better results, but the name was never changed (e.g., to “2-3 pulldown”). Hence, what is called a 3-2-3-2 field configuration is, in practice, normally a 2-3-2-3 field configuration.
MPEG-2 and “Film Mode”
One of the best known and most widely used video compression standards for encoding moving picture images (video) and associated audio is the MPEG-2 standard, provided by the Moving Picture Experts Group (MPEG). The MPEG-2 standard allows for the encoding of video over a wide range of resolutions, including higher resolutions commonly known as HDTV (high definition TV).
Prior to bitstream coding, a good encoder will eliminate redundant frames/fields from a 30 fps video signal which encapsulates an inherently 24 fps source. The MPEG decoder or display device will then repeat the frames or fields to recreate or synthesize the 30 frames per second display rate. MPEG-2 provides specific picture header variables, called repeat_first_field and top_field_first, which explicitly signal (designate) which frames or fields are to be repeated, and how many times.
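The following is a minimal sketch of how a decoder might expand coded frames back into display fields using these two variables. It is a simplified model and makes several assumptions: each flag is treated as a boolean, the sequence is interlaced, a set repeat_first_field causes the first field to be shown once more after the second field, and the odd field of Table 1 is taken to be the top field. The names CodedFrame and display_fields are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class CodedFrame(NamedTuple):
    label: str                 # film frame carried by this coded frame
    top_field_first: bool      # which field is displayed first
    repeat_first_field: bool   # whether the first field is displayed again


def display_fields(frames: List[CodedFrame]) -> List[Tuple[str, str]]:
    """Expand coded frames into the ordered list of fields to display."""
    out: List[Tuple[str, str]] = []
    for frame in frames:
        if frame.top_field_first:
            first, second = "top", "bottom"
        else:
            first, second = "bottom", "top"
        out.append((frame.label, first))
        out.append((frame.label, second))
        if frame.repeat_first_field:
            out.append((frame.label, first))  # the synthesized repeat field
    return out


# Flag settings assumed here to reproduce the 2-3-2-3 cadence of Table 1.
coded = [
    CodedFrame("A", top_field_first=True, repeat_first_field=False),
    CodedFrame("B", top_field_first=True, repeat_first_field=True),
    CodedFrame("C", top_field_first=False, repeat_first_field=False),
    CodedFrame("D", top_field_first=False, repeat_first_field=True),
]
shown = display_fields(coded)
```

Under these assumptions, four coded frames expand to ten displayed fields whose parity alternates throughout, so only the four film frames need to be compressed.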
The film-mode process, often referred to as “3:2 pull-down reversal” or “inverse telecine,” automatically detects and eliminates any duplicate (repeated) fields. This method can yield up to a 20 percent improvement in compression efficiency when compared to video-originated material, and is one of the most widely applied pre-processing techniques today. Detecting “bad edits” that may have been created during the post-production process reveals where the 3:2 sequence has been interrupted. This allows the pre-processing activity to drop out of film mode when necessary and re-enter it when a consistent 3:2 sequence is again detected.
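A minimal sketch of repeat-field detection is given below, assuming that a repeated field can be recognized by comparing each field with the previous field of the same parity (two positions earlier) using a sum of absolute differences. The function names and the threshold parameter are hypothetical, and a practical detector would need to cope with noise and static scenes, which this sketch does not attempt.

```python
from typing import List, Sequence


def field_difference(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute pixel differences between two fields."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_repeat_fields(fields: List[Sequence[int]], threshold: int = 0) -> List[int]:
    """Indices of fields that match the previous field of the same parity."""
    return [
        i
        for i in range(2, len(fields))
        if field_difference(fields[i], fields[i - 2]) <= threshold
    ]


def drop_repeat_fields(fields: List[Sequence[int]], threshold: int = 0) -> List[Sequence[int]]:
    """Return the field sequence with detected repeat fields removed."""
    repeats = set(find_repeat_fields(fields, threshold))
    return [f for i, f in enumerate(fields) if i not in repeats]


def is_consistent_cadence(repeat_indices: List[int]) -> bool:
    """True if successive repeats are exactly five fields apart.

    Returns True trivially when fewer than two repeats are supplied.
    """
    return all(b - a == 5 for a, b in zip(repeat_indices, repeat_indices[1:]))
```

With an uninterrupted 3:2 sequence, the repeats fall every five fields and eight of every ten fields survive. A gap other than five between repeats is one possible indication that the sequence has been interrupted.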
If film material is edited after the 3:2 pull-down operation, a cut should only occur at the 5-frame boundaries in order to preserve film mode. If a cut does not occur at a 5-frame boundary, an encoder will drop out of film mode, and search for another film mode sequence. This search may take as many as 100 frames. During this time, the encoder will process the sequence in video mode, using more bandwidth or requiring more quantization than if the sequence were processed in film mode. If the encoder did not drop out of film mode immediately, there would be a danger that fields would be displayed out of temporal sequence.
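The effect of a cut on the cadence can be sketched as follows, assuming each video frame is tagged with its position (0 through 4) in the five-frame pattern of Table 1. The names cadence_phases, splice and first_cadence_break are hypothetical. Note that this sketch treats any splice that keeps the phase counting in order as clean, which is somewhat broader than the 5-frame boundary rule stated above.

```python
from typing import List, Optional

CADENCE_LENGTH = 5  # video frames per 3:2 pull-down pattern


def cadence_phases(num_frames: int, start_phase: int = 0) -> List[int]:
    """Cadence position (0..4) of each video frame in telecined material."""
    return [(start_phase + i) % CADENCE_LENGTH for i in range(num_frames)]


def splice(clip_a: List[int], out_point: int, clip_b: List[int], in_point: int) -> List[int]:
    """Join clip_a up to out_point with clip_b from in_point onward."""
    return clip_a[:out_point] + clip_b[in_point:]


def first_cadence_break(phases: List[int]) -> Optional[int]:
    """Index of the first frame that does not continue the cadence, if any."""
    for i in range(1, len(phases)):
        if phases[i] != (phases[i - 1] + 1) % CADENCE_LENGTH:
            return i
    return None


clip_a = cadence_phases(20)
clip_b = cadence_phases(20)
clean_edit = splice(clip_a, 10, clip_b, 5)  # both points on 5-frame boundaries
bad_edit = splice(clip_a, 7, clip_b, 5)     # out point is mid-pattern
```

Here first_cadence_break(clean_edit) finds no interruption, whereas first_cadence_break(bad_edit) reports the frame at which the pattern is broken.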
FIG. 1 is a functional block diagram of a typical MPEG-2 encoder of the prior art, the major functional blocks of which are: motion compensation (MC), motion estimation (ME), discrete cosine transform (DCT), inverse discrete cosine transform (IDCT), quantization (QNT), inverse quantization (IQNT), macroblock-type processing (MBT), rate control (RC), and variable-length coding (VLC), all connected as shown. The encoder receives a video input and outputs a bitstream. As is known, the encoder performs discrete cosine transformation and quantization on the video data, and compresses the data with variable-length coding.
U.S. Pat. No. 5,929,902
U.S. Pat. No. 5,929,902 (“Kwok”) discloses a method and apparatus for inverse telecine processing by fitting 3:2 pull-down patterns. FIG. 2, corresponding to FIG. 1 therein, is a block diagram of an exemplary video encoding system of the prior art.
FIG. 2 shows an exemplary video encoding system 212 in which a sequence of frames is supplied from a video source 214. The video source 214 may be any digital video signal source such as a video camera or a telecine machine. The video encoding system 212 further includes a video capture buffer 216 for capturing the input video sequence and an inverse telecine circuit 218. The inverse telecine circuit 218 detects repeat fields in the input video sequence and causes these fields to be dropped so as not to waste valuable encoder resources on the compressing of repeat fields. The video encoding system 212 further includes an encoder 220 which may be an MPEG-1 or MPEG-2 compliant encoder. The encoder 220 includes a preprocessor buffer 222, a preprocessor 224, a video compression circuit 226, a rate buffer 228 and a controller 230.
The video compression circuit 226 receives a video signal from the preprocessor 224 in the form of a sequence of frames or fields and outputs a compressed digital video bit stream. The compressed digital video bit stream output by the video compression circuit 226 may comply with the syntax specified in video compression standards such as MPEG-1 or MPEG-2. Compression circuits which generate an MPEG-1 or MPEG-2 compliant bit stream are well known. The video bit stream generated by the video compression circuit 226 is stored in the rate buffer 228. The bit stream is then transmitted via a transmission channel 232 to one or more decoders which decode the received bit stream. Alternatively, the bit stream may be transmitted to an electronic or magnetic memory, a recordable optical disk or another suitable storage device.
The controller 230 controls the number of bits allocated by the video compression circuit 226 to the frames to be encoded. The controller 230 allocates bits to the frames to be encoded so as not to exceed the bandwidth in the channel 232 assigned to the encoding system 212 and so as to maintain certain limits on the occupancy of the rate buffer 228. This in turn prevents overflow and underflow conditions when the bit stream is received in a decoder buffer from the transmission channel 232 or from a storage device in which the bit stream has been previously stored.
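The kind of constraint the controller 230 enforces can be illustrated with a highly simplified sketch. It assumes a constant-rate channel that drains a fixed number of bits per frame period and a rate buffer of fixed size. The function allocate_bits and its parameters are hypothetical and are not taken from Kwok or from any standard.

```python
from typing import List, Tuple


def allocate_bits(
    requested_bits: List[int],
    channel_bits_per_frame: int,
    buffer_size: int,
    occupancy: int = 0,
) -> Tuple[List[int], int]:
    """Clamp each frame's bit budget so the rate buffer stays within limits.

    Returns the granted bits per frame and the final buffer occupancy.
    """
    granted_bits: List[int] = []
    for requested in requested_bits:
        # Largest budget that leaves occupancy <= buffer_size after the drain.
        max_bits = buffer_size - occupancy + channel_bits_per_frame
        # Smallest budget that keeps the buffer from running empty.
        min_bits = max(0, channel_bits_per_frame - occupancy)
        granted = min(max(requested, min_bits), max_bits)
        # The frame's bits enter the buffer; the channel removes its share.
        occupancy = occupancy + granted - channel_bits_per_frame
        granted_bits.append(granted)
    return granted_bits, occupancy


grants, final_occupancy = allocate_bits(
    [1000, 50, 400], channel_bits_per_frame=300, buffer_size=500
)
```

In this example the first request is reduced because it would overflow the buffer, while the later requests are granted as asked.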
The preprocessor 224 processes the video signal so that it may be compressed by the video compression circuit 226. For example, the preprocessor 224 may change the format of each frame, including the number of horizontal or vertical pixels, to meet parameters specified by the video compression circuit 226. In addition, the preprocessor 224 can detect scene changes or other changes which increase compression difficulty.
Using the techniques of the present invention, described hereinbelow, in a system such as described with respect to FIG. 2, the preprocessor 224 can detect interruptions to the 3:2 pulldown sequence which indicate bad edits and, via the controller 230, make corrections to the bad edits to preserve film mode.