1. Field of the Invention
The present disclosure relates generally to methods of detecting the film mode of a video sequence and, in particular, to discrimination between progressive and non-progressive (or video) film modes in a video sequence.
2. Description of the Related Art
Multimedia processing systems, such as video encoders, may encode multimedia data using encoding methods based on international standards such as the MPEG-x and H.26x standards. Such encoding methods are generally directed toward compressing the multimedia data for transmission and/or storage and may combine both progressive and interlaced (non-progressive) sequences. Compression is broadly the process of removing redundancy from the data. In addition, video display systems may transcode or transform multimedia data for various purposes such as, for example, to ensure compatibility with display standards such as NTSC, HDTV, or PAL, to increase frame rate in order to reduce perceived motion blur, and to achieve smooth motion portrayal of content with a frame rate that differs from that of the display device. These transcoding methods may perform functions similar to those of the encoding methods when performing frame rate conversion, de-interlacing, etc.
A video signal may be described in terms of a sequence of pictures, which include frames (an entire picture) or fields (e.g., an interlaced video stream comprises fields of alternating odd or even lines of a picture). The term "frame" may be used generally to refer to a picture, a frame, or a field. Multimedia processors, such as video encoders, may encode a frame by partitioning it into blocks or "macroblocks" of, for example, 16×16 pixels. The encoder may further partition each macroblock into subblocks. Each subblock may further comprise additional subblocks. For example, subblocks of a macroblock may include 16×8 and 8×16 subblocks. Subblocks of the 8×16 subblocks may include 8×8 subblocks, and so forth. Depending on context, a block may refer to either a macroblock or a subblock, or even a single pixel.
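The block hierarchy described above can be illustrated with a minimal sketch. The function name and the halving strategy below are illustrative assumptions, not part of this disclosure; the sketch simply enumerates the block sizes reachable by repeatedly halving one dimension of a 16×16 macroblock, as in the 16×16 → 16×8/8×16 → 8×8 progression described above.

```python
# Illustrative sketch (not from this disclosure): enumerate the subblock
# sizes reachable from a macroblock by halving one dimension at a time.

def partition_sizes(width, height, min_size=4):
    """Return all (w, h) block sizes reachable by repeated halving."""
    seen = set()
    stack = [(width, height)]
    while stack:
        w, h = stack.pop()
        if (w, h) in seen:
            continue
        seen.add((w, h))
        if w > min_size:
            stack.append((w // 2, h))   # split vertically: w/2 x h subblocks
        if h > min_size:
            stack.append((w, h // 2))   # split horizontally: w x h/2 subblocks
    return sorted(seen, reverse=True)

# A 16x16 macroblock with 8x8 as the smallest partition yields
# the 16x16, 16x8, 8x16, and 8x8 block sizes named above.
print(partition_sizes(16, 16, min_size=8))
```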
Video sequences may be received by a receiving device in a compressed format and subsequently decompressed by a decoder in the receiving device. Video sequences may also be received in an uncompressed state. In either case, the video sequence is characterized at least by a frame rate, and a horizontal and vertical pixel resolution. Many times, a display device associated with the receiving device may require a different frame rate, and/or pixel resolution, and video reconstruction of one or more video frames may be performed. Reconstruction of video frames may comprise estimating a video frame between two or more already received (or received and decompressed) video frames. Furthermore, decoder devices may create new video data based on already reconstructed video data.
Frame rate conversion by pulldown is one example of new video data creation. Pulldown comprises repeating source frames in a known pattern to generate an output video signal that possesses more frames than the original. For example, when film is transferred to video, 24 frames per second of film must be converted to 60 fields per second of video by "stretching" four frames of film to fill five frames of video. Because each NTSC frame actually comprises two complete fields, each displayed frame contributes two fields, resulting in ten fields for every four film frames. In 3:2 pulldown, for example, one film frame is placed across three fields, the next across two, the next across three, and so on. The cycle repeats itself completely after four film frames have been exposed. In interlaced (non-progressive) pulldown, the two fields comprise the even and odd interlaced fields, while in progressive pulldown, the two fields comprise the complete frame.
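The 3:2 cadence above can be sketched in a few lines. This is a minimal illustration, not a method from this disclosure: frames are alternately assigned three and two fields, and field parity (top/bottom) alternates across the output, so four film frames yield the ten video fields described above. The function name and field labels are illustrative assumptions.

```python
# Illustrative sketch of 3:2 pulldown: each film frame is repeated across
# alternately three and two video fields, with field parity alternating.

def three_two_pulldown(film_frames):
    """Map film frames to (frame, field) pairs per the 3:2 cadence.

    "t" and "b" label the top and bottom fields of the output video.
    A frame assigned three fields repeats one of its fields once.
    """
    fields = []
    parity = 0  # 0 -> top field next, 1 -> bottom field next
    for i, frame in enumerate(film_frames):
        count = 3 if i % 2 == 0 else 2   # 3, 2, 3, 2, ...
        for _ in range(count):
            fields.append((frame, "t" if parity == 0 else "b"))
            parity ^= 1                  # field parity alternates
    return fields

# Four film frames A..D fill ten video fields (one complete cycle).
print(three_two_pulldown(["A", "B", "C", "D"]))
```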
In order to display a high quality progressive image from a video stream, the decoder must determine which fields in the video stream belong together to make each film frame. However, the video stream may contain frames generated by both progressive and interlaced pulldown. As a result, the decoder may combine two fields that were not meant to go together, creating distortion in the displayed image.
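One common way to recover the field grouping, sketched below under stated assumptions, is to exploit the repeated field that 3:2 pulldown produces: in a 3:2 cadence, every fifth field is a near-duplicate of the same-parity field two positions earlier, so a stable period-5 pattern of near-zero field differences suggests pulldown (film) content rather than true interlaced video. This is a generic cadence-detection sketch, not the method of this disclosure; the function names, the pixel-difference metric, and the threshold are illustrative assumptions.

```python
# Illustrative sketch (not this disclosure's method): flag repeated
# same-parity fields, whose period-5 spacing indicates a 3:2 cadence.

def field_difference(a, b):
    """Sum of absolute pixel differences between two fields (rows of ints)."""
    return sum(abs(x - y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))

def detect_32_cadence(fields, threshold=0):
    """Return indices of fields matching the same-parity field two back.

    Fields of the same parity sit two positions apart; in a 3:2 cadence
    the matches recur with period 5, identifying pulldown content.
    """
    return [i for i in range(2, len(fields))
            if field_difference(fields[i], fields[i - 2]) <= threshold]
```

For example, feeding in the ten-field pattern A A A B B C C C D D produced by 3:2 pulldown of four frames yields repeat positions spaced five fields apart, whereas true video content would show no such periodic matches.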
From the foregoing, then, there is a need for systems and methods of detecting the pulldown pattern in a video stream.