1. Field of the Invention
The present invention relates generally to the processing of video images, and more particularly to an adaptive and accurate method and apparatus for identifying the source type and quality level of a video sequence.
2. Description of the Related Art
The world's major television standards use a raster scanning technique known as “interlacing.” Interlaced scanning draws horizontal scan lines from the top of the screen to the bottom of the screen in two separate passes, one for even numbered scan lines and the other for odd numbered scan lines. Each of these passes is known as a field.
Many of the sources of moving images which are used to produce interlaced video images are actually progressively scanned in nature. A single image or frame of a progressively scanned video source is scanned from the top of the screen to the bottom in a single pass. Common examples of such progressive sources are film and computer graphics.
In order for such a progressively scanned image to be utilized by an interlaced video system, it must be converted from a progressive scan format to an interlaced format. There are a number of standard techniques for performing this conversion. FIG. 1 illustrates a technique known as “3:2 pulldown”, wherein a 24 frame/second progressive film source 2 is converted to a 60 field/second interlaced video sequence generally designated 4. In this technique, a pair of film frames 6 and 10 is converted to five video fields 8 and 12. A first frame 6 is converted to three fields 8 and a second frame 10 is converted to two fields 12; hence the term “3:2 pulldown”. A similar technique is used when the ratio of interlaced video fields to progressive source frames is two to one. This is termed “2:2 pulldown” and is illustrated in FIG. 2, wherein a progressive source 14 is converted to a sequence of interlaced fields generally designated 16. Each source frame 18 is converted to two fields 20. The 2:2 pulldown technique is typically used in the transfer of 30 frame/second video or computer graphics to video standards which use a 60 field/second rate (e.g., NTSC), or in the transfer of film to video standards which use a 50 field/second rate (e.g., PAL and SECAM).
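By way of illustration only, the two pulldown cadences described above can be sketched in Python. The function names and the representation of a field as a (frame, parity) pair are assumptions made for this sketch and do not appear in the figures; parity 0 is taken to denote the even numbered scan lines and parity 1 the odd numbered scan lines.

```python
def pulldown_32(frames):
    """Map progressive frames to fields using a 3, 2, 3, 2, ... cadence.

    Each field is a (frame, parity) pair. Parity alternates on every
    field, as it does in an interlaced sequence.
    """
    fields = []
    parity = 0  # assumption: 0 = even numbered lines, 1 = odd numbered lines
    for index, frame in enumerate(frames):
        repeat = 3 if index % 2 == 0 else 2
        for _ in range(repeat):
            fields.append((frame, parity))
            parity ^= 1
    return fields


def pulldown_22(frames):
    """Map each progressive frame to exactly two fields, one per parity."""
    fields = []
    for frame in frames:
        fields.append((frame, 0))
        fields.append((frame, 1))
    return fields
```

Under these assumptions, 24 source frames yield 60 fields with the 3:2 cadence, and 30 source frames yield 60 fields with the 2:2 cadence, which matches the frame and field rates stated above.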
It is often necessary and/or desirable to convert an interlaced video signal to a progressive signal. Such conversion is advantageous, for example, when using a progressive-only display (e.g., a computer monitor), to eliminate the deficiencies of interlaced video formats (e.g., flicker, line twitter, or visible scan line structure), or to improve the compression ratio of a digital format (e.g., digital broadcast or satellite transmission). Unfortunately, many techniques for deinterlacing the video signal add a number of objectionable artifacts to the resultant progressive image. However, if an interlaced video signal can be identified as having originally come from a progressive source, then it is possible to reconstruct the original image frames without such deinterlacing artifacts. Techniques for identifying progressive sources analyze the video stream to detect patterns indicative of progressive sources. Both 3:2 pulldown and 2:2 pulldown have distinctive signature patterns which can be detected.
FIG. 3 illustrates a technique for identifying 3:2 pulldown sequences. A computation 22 is performed to yield an overall numeric value 24 indicative of the difference between pairs of fields 26 spaced two field periods apart. One numeric value (or field difference value) is produced during each field period. This field difference value is compared to a threshold value to determine if the fields being compared are similar or different. As shown in FIG. 3, in a 3:2 pulldown sequence there is a duplicated field 28 which occurs every five field periods. The presence of this duplicate field produces a pattern 30 which repeats over five field periods in which one field comparison yields a small field difference value (similar fields) while the remaining four field comparisons yield large field difference values (different fields). This pattern of low/high/high/high/high etc. is distinctive of 3:2 pulldown sequences.
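The field differencing approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation 22 of FIG. 3: the sum of absolute differences is an assumed difference measure, a field is modeled as a flat list of pixel values, and the function names are hypothetical.

```python
def field_difference(field_a, field_b):
    """Assumed measure: sum of absolute pixel differences between two fields."""
    return sum(abs(a - b) for a, b in zip(field_a, field_b))


def classify_differences(fields, threshold):
    """Label each comparison of field n with field n-2 as 'low' or 'high'."""
    labels = []
    for n in range(2, len(fields)):
        value = field_difference(fields[n], fields[n - 2])
        labels.append("low" if value < threshold else "high")
    return labels


def looks_like_32(labels):
    """True when 'low' occurs exactly once in every run of five comparisons."""
    if len(labels) < 5:
        return False  # fewer than five comparisons cannot show the pattern
    lows = [i for i, label in enumerate(labels) if label == "low"]
    if not lows:
        return False
    phase = lows[0] % 5
    expected = ["low" if i % 5 == phase else "high" for i in range(len(labels))]
    return labels == expected
```

In this sketch a sequence with no low values (for example, 2:2 material) or with low values at irregular positions is rejected, since only a strict low/high/high/high/high repetition is accepted.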
FIG. 4 illustrates a technique for identifying 2:2 pulldown sequences. A computation 32 is performed on adjacent field pairs 34 which indicates the presence of a specific vertical high frequency component present in interlace motion artifacts. A relatively large frequency detection value 36 is produced for fields 38 which come from different original source frames 40 (since there can be interlace motion artifacts present between these frames), while a relatively low frequency detection value 42 is produced for field pairs 44 which come from the same original source frame 46. A threshold value is used to determine if the computed frequency detection value is ‘high’ or ‘low’. A 2:2 pulldown sequence produces an alternating high/low/high/low pattern 48.
The methods described above suffer from a number of problems when used with imperfect real-world sources. For example, the field differencing technique for 3:2 pulldown identification works well with high quality sources since a comparison of identical fields yields a zero or very low comparison value. A sequence of field difference values generally designated 50 for such a source is shown in FIG. 5. In this example, the field difference values generally designated 52 for ‘identical’ fields are consistently very low while the field difference values generally designated 54 for other field comparisons are relatively high. This allows a simple, fixed comparison threshold 56 to easily distinguish between similar and different fields. However, and with reference to FIG. 6, for noisy sources a comparison of ‘identical’ fields no longer yields such a low value: noise raises the ‘low’ value 60 above the threshold 62, so that the fixed threshold can no longer separate similar fields from different fields.
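The failure of a fixed threshold on a noisy source can be illustrated with a short sketch. All numeric values below, including the threshold, are hypothetical and are not taken from FIG. 5 or FIG. 6; they are chosen only to show the effect.

```python
def classify(values, threshold):
    """Label each field difference value against a fixed threshold."""
    return ["low" if value < threshold else "high" for value in values]


FIXED_THRESHOLD = 100  # hypothetical fixed comparison threshold

# Hypothetical field difference values over two 3:2 pulldown cycles.
clean_values = [3, 950, 870, 910, 990, 5, 930, 880, 940, 900]
noisy_values = [240, 1150, 1080, 1120, 1190, 260, 1140, 1090, 1160, 1110]

clean_labels = classify(clean_values, FIXED_THRESHOLD)
noisy_labels = classify(noisy_values, FIXED_THRESHOLD)
```

With these assumed values, the clean sequence is labeled low at the first and sixth positions, while every value in the noisy sequence is labeled high, so the repeating pattern is missed even though the duplicated-field comparisons remain much smaller than their neighbors.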
A number of other problems are present in real-world video sources. There are many possibilities for injection of noise into a signal, including poor Y/C separation, imperfect storage media, transmission noise, and digital compression artifacts. Other factors which can make identification of the source type difficult include video edits made without regard to a progressive source sequence, transitions between different source types, frame rate conversion of a video sequence, and mixing of multiple or unsynchronized source types within a single video frame.
Inaccurate identification of the source type due to these factors can have a number of negative consequences. These include the presence of interlace motion artifacts when the source is incorrectly determined to be progressive or when a source-type transition is missed, and unnecessary deinterlace processing artifacts (such as reduced vertical resolution) when the source is incorrectly determined to be interlaced. What is needed is a robust detection method and apparatus which not only correctly identifies the source type, but which also determines the overall quality level of a source so that deinterlace processing can be correctly tailored to each source.