When a film or other audiovisual recording is transferred from its original format to a compressed format, it is often converted from one frame rate to another. For example, a motion picture is typically recorded at 24 frames per second (fps) in progressive format, but may be converted to 30 fps for distribution on DVD or for television broadcast, typically using interlaced displays. An original recording may also be made at other frame rates; home video recordings, for example, are typically made at 30 fps in interlaced format. Prior to encoding, an original recording may also be preprocessed, for example to perform noise reduction or frame rate conversion, and edited, for example to insert scene changes.
To compensate for the disparity between the original recording's frame rate and the rate at which it may later be displayed, various techniques of repeating and/or dropping portions of frames are used. The most common technique, used to convert from 24 fps progressive to 30 fps interlaced, is the “3-2 pulldown.” Each original progressive frame is first converted to a set of two fields. For every other group of two fields, one field is repeated, resulting in a group of three fields followed by a group of two fields, i.e., a 3-2 pattern. Four original frames thus yield ten fields, or five interlaced frames, which accounts for the 24-to-30 ratio. The resulting video sequence can then be displayed at 30 fps on an interlaced display device without introducing visual artifacts. Various other conversion techniques may be used.
As a specific example, FIG. 1 shows the standard 3-2 pulldown method as it is used in the art. Although the 3-2 pulldown is shown as an example, embodiments of the present invention also may be used with other conversion methods. In FIG. 1, an original audiovisual recording (100) is made of a series of frames 110, 120, 130, . . . 190. To perform a 3-2 pulldown, each frame is first split into a pair of fields (101). For example, frame 110 is split into an even field 110A and an odd field 110B. The fields may be formed, for example, by splitting each frame into many horizontal rows. The even field 110A is then formed of the even-numbered rows; the odd field 110B is similarly formed of only the odd-numbered rows. Each frame is similarly split into a pair of fields: frame 120 is split into 120A and 120B, 130 into 130A and 130B, 140 into 140A and 140B, and so on. Displaying an even “A” field and an odd “B” field in rapid succession or simultaneously causes a complete frame to be displayed. An interlaced display displays the fields in rapid succession; a progressive display displays the fields simultaneously. Each field may be referred to as having an even or odd “parity.” Two even fields or two odd fields may be described as having the same parity, while an even field and an odd field may be described as having opposite parity.
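By way of non-limiting illustration, the splitting of a frame into fields of opposite parity may be sketched as follows. The sketch assumes that a frame is represented as a list of rows, that rows are numbered from zero, and that the frame has an even number of rows; the function names `split_into_fields` and `weave` are hypothetical and are not taken from FIG. 1.

```python
def split_into_fields(frame):
    """Split a frame (a list of rows) into an even field and an odd field."""
    even_field = frame[0::2]  # rows 0, 2, 4, ... (the "A" field)
    odd_field = frame[1::2]   # rows 1, 3, 5, ... (the "B" field)
    return even_field, odd_field


def weave(even_field, odd_field):
    """Interleave two fields of opposite parity back into a complete frame."""
    frame = []
    for even_row, odd_row in zip(even_field, odd_field):
        frame.append(even_row)
        frame.append(odd_row)
    return frame


# A toy 4-row frame; each row is a list of pixel values.
frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
field_a, field_b = split_into_fields(frame)
# field_a == [[0, 0], [2, 2]] and field_b == [[1, 1], [3, 3]]
# weave(field_a, field_b) reproduces the original frame.
```

Weaving the two fields corresponds to the simultaneous display performed by a progressive display, whereas an interlaced display would present `field_a` and `field_b` in rapid succession.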
To form the video stream using the 3-2 pulldown, the fields are arranged in the order shown at 102. One field from every other frame is repeated, such that the non-repeated field is preceded and followed by a copy of the repeated field. Field 111A is a copy of field 110A; field 131B is a copy of field 130B. In the field order 102 shown in FIG. 1, repeated fields are indicated by bold outlines.
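The field ordering described above may be illustrated by the following sketch. It is an assumption-laden example rather than a reproduction of the order shown at 102: it assumes the conventional ordering in which field parity alternates throughout the stream and the third field of every other frame is the repeated field. The function name `pulldown_3_2` and the tuple layout are hypothetical.

```python
def pulldown_3_2(num_frames):
    """Return the field order of a 3-2 pulldown as a list of
    (frame_index, parity, is_repeat) tuples.

    Parity "A" denotes an even field and "B" an odd field.
    """
    fields = []
    parity = "A"  # parity of the next field to emit
    for frame_index in range(num_frames):
        # Every other frame contributes three fields; the rest contribute two.
        count = 3 if frame_index % 2 == 0 else 2
        for k in range(count):
            is_repeat = (k == 2)  # the third field copies the first
            fields.append((frame_index, parity, is_repeat))
            parity = "B" if parity == "A" else "A"  # parity always alternates
    return fields


labels = [f"{i}{p}{'*' if r else ''}" for i, p, r in pulldown_3_2(4)]
# labels == ['0A', '0B', '0A*', '1B', '1A', '2B', '2A', '2B*', '3A', '3B']
```

In this sketch, four frames produce ten fields, and each repeated field (marked `*`) has the same parity as the first field of its frame, so the non-repeated field sits between the repeated field and its copy.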
A given conversion technique will result in a “cadence” in the video stream that can be detected during a pre-processing stage prior to encoding, or during a post-processing stage after decoding. When a video stream is encoded, it may be desirable for the encoder to identify repeated fields reliably and consistently, both to avoid encoding multiple copies of the same field and to indicate repeated fields accurately in the encoded stream. Pulldown correction may be done prior to encoding to avoid encoding the repeated fields and to optimize processing time and bit rate utilization, resulting in overall higher encoding quality. In such a situation, the repeated fields are marked as such in the compressed stream, instead of being encoded.
However, cadence detection and pulldown correction may be inaccurate due to noise in the original video sequence, which can lead to incorrect processing. In addition, an encoder may be instructed to encode every field, regardless of whether it is a repeated field. In some cases, an encoder may insert flags into the video stream to indicate when a field is repeated, allowing a decoder to avoid decoding the same field twice. Such methods may be error-prone if the encoder incorrectly identifies repeated fields or does not mark fields consistently.
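The sensitivity of repeated-field detection to noise may be illustrated with a simple sketch. The approach shown, which compares each field against the preceding field of the same parity using a mean absolute difference and a fixed threshold, is an assumption chosen for illustration and is not asserted to be the method of any particular encoder. The names `mean_abs_diff` and `find_repeated_fields` and the default threshold value are hypothetical, and each field is assumed to be a non-empty list of rows of numeric pixel values.

```python
def mean_abs_diff(field_a, field_b):
    """Mean absolute pixel difference between two equally sized fields."""
    total = 0
    count = 0
    for row_a, row_b in zip(field_a, field_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count


def find_repeated_fields(fields, threshold=2.0):
    """Return indices of fields that nearly match the field two positions
    earlier, i.e., the previous field of the same parity."""
    repeated = []
    for i in range(2, len(fields)):
        if mean_abs_diff(fields[i], fields[i - 2]) < threshold:
            repeated.append(i)
    return repeated


# Toy one-row fields; the third field is a noisy copy of the first.
stream = [[[10, 10]], [[20, 20]], [[11, 10]], [[60, 60]], [[50, 50]]]
# find_repeated_fields(stream) == [2]
# find_repeated_fields(stream, threshold=0.25) == []  (noise hides the repeat)
```

The second call shows how a threshold set too tightly for the noise level causes a repeated field to be missed. Conversely, a threshold set too loosely may mark distinct fields of a nearly static scene as repeated.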
When decoding a stream generated by an encoder that did not perform pulldown correction or performed pulldown correction incorrectly, it may be desirable for a decoder to reliably identify repeated fields regardless of the presence or absence of repeated-field indicators in the stream. A decoder capable of performing pulldown correction may recover the original progressive content by properly identifying and dropping the appropriate fields, thus reducing the visual artifacts that result from improper matching of fields when the video is displayed on a progressive device.
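Recovery of progressive content by dropping repeated fields may be sketched as follows. The sketch assumes that the indices of the repeated fields are already known, that the stream begins at a frame boundary, and that each field is a hypothetical `(parity, rows)` pair in which parity "A" denotes an even field and "B" an odd field. The function name `inverse_pulldown` is likewise hypothetical.

```python
def inverse_pulldown(fields, repeated_indices):
    """Drop repeated fields and weave the remaining pairs into frames.

    fields: list of (parity, rows) tuples in stream order.
    repeated_indices: positions in `fields` identified as repeats.
    """
    drop = set(repeated_indices)
    kept = [f for i, f in enumerate(fields) if i not in drop]

    frames = []
    # After dropping repeats, consecutive fields pair up into frames.
    for first, second in zip(kept[0::2], kept[1::2]):
        # Either parity may arrive first, so order the pair explicitly.
        even, odd = (first, second) if first[0] == "A" else (second, first)
        frame = []
        for even_row, odd_row in zip(even[1], odd[1]):
            frame.extend([even_row, odd_row])
        frames.append(frame)
    return frames


# Ten fields from four 2-row frames; fields 2 and 7 are repeats.
stream = [
    ("A", [[0]]), ("B", [[1]]), ("A", [[0]]),
    ("B", [[11]]), ("A", [[10]]),
    ("B", [[21]]), ("A", [[20]]), ("B", [[21]]),
    ("A", [[30]]), ("B", [[31]]),
]
frames = inverse_pulldown(stream, [2, 7])
# frames == [[[0], [1]], [[10], [11]], [[20], [21]], [[30], [31]]]
```

If a repeated field is misidentified, the remaining fields pair across frame boundaries, which is one way the improper matching of fields noted above can arise.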
There are thus several applications where it would be useful to have improved detection of repeated fields and cadences in a video stream.