The present invention relates to digital image processing and more particularly to a method and apparatus for converting interlaced video fields into progressively scanned frames.
Worldwide video standards such as NTSC, PAL, and SECAM use interlaced video formats to maximize the vertical refresh rate while minimizing the required transmission bandwidth. In an interlaced video format, a picture frame is divided into fields, as shown in FIGS. 1A and 1B, which depict a picture frame 11 divided into three (3) exemplary video fields 10. In each video field, e.g., 10a, a plurality of pixels 14 are arranged in scan lines 12a. The pixels 14 in one half of the scan lines 12a of a picture frame are displayed on the screen during the first vertical scan period (i.e., the odd field 10a), while the pixels 14 in the other half of the scan lines 12b, positioned halfway between those displayed during the first period, are displayed during the second vertical scan period (i.e., the even field 10b). While an interlaced format provides the benefits mentioned above, it can also produce undesirable visual artifacts, such as line flicker and line crawling.
The visual artifacts can be minimized and the appearance of an interlaced image can be improved by converting it to a non-interlaced (progressive) format and displaying it as such. In fact, many newer display technologies, such as Liquid Crystal Displays (LCDs) and Plasma Display Panels (PDPs), are designed to display progressively scanned, i.e., non-interlaced, video images.
A conventional progressive video signal display system, e.g., a television (TV) or a projector, is illustrated in FIG. 1C. As shown, the display system 20 includes a signal receiving unit 22 that is coupled to a tuner box 24, and a video decoder 28. Signals, such as television signals, are captured by the signal receiving unit 22 and transmitted to the tuner box 24. The tuner box 24 includes a converter 25 and a demodulation unit 26 that transforms the incoming signal into an analog signal 27. The analog signal 27 is received by the video decoder 28, which outputs an interlaced video signal 29. As stated above, if the interlaced video signal 29 is displayed directly, undesirable visual artifacts, such as line flicker and line crawling, can appear. Accordingly, a de-interlacer 30 is used to convert, i.e., de-interlace, the interlaced video signal 29 to generate a progressive video output signal 32. The progressive video output signal 32 is then displayed via an LCD or PDP 34.
Numerous methods have been proposed for de-interlacing an interlaced video signal to generate a progressive video signal. For instance, some methods perform a simple spatial-temporal de-interlacing technique, such as line repetition or field insertion. These methods, however, do not necessarily take into consideration motion between or within fields. For instance, it is well known that while line repetition is adequate for image regions having motion, it is not suitable for stationary (still) image regions. By the same token, field insertion is a satisfactory de-interlacing method for stationary image regions, but inadequate for moving image regions. Therefore, utilizing either method alone presents a tradeoff between vertical spatial resolution and motion artifacts.
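By way of illustration only, the two simple techniques named above can be sketched as follows. This is a minimal sketch, assuming that a field is represented as a list of scan lines and that each scan line is a list of pixel values; the function names line_repetition and field_insertion and the parameter names are hypothetical and are not taken from any particular implementation.

```python
def line_repetition(field):
    """Build a frame from a single field by emitting each scan line twice.

    Purely spatial: no motion artifacts, but vertical resolution is halved.
    """
    frame = []
    for line in field:
        frame.append(list(line))
        frame.append(list(line))  # the missing line is a copy of the line above
    return frame


def field_insertion(first_field, second_field):
    """Build a frame by interleaving the scan lines of two consecutive fields.

    Purely temporal: full vertical resolution for still content, but moving
    content shows feathering because the two fields were sampled at
    different times.
    """
    frame = []
    for first_line, second_line in zip(first_field, second_field):
        frame.append(list(first_line))
        frame.append(list(second_line))
    return frame
```

For example, with a first field of [[1, 1], [3, 3]] and a second field of [[2, 2], [4, 4]], line repetition of the first field yields four lines in which each original line appears twice, whereas field insertion yields the four lines in the interleaved order 1, 2, 3, 4.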
To address this issue, some de-interlacing methods are motion adaptive, i.e., they take into consideration the motion from field to field and/or from pixel to pixel in a field. Motion adaptive de-interlacing methods can dynamically switch or fade between different de-interlacing methods, such as between line repetition and field insertion. Per-field motion adaptive de-interlacing methods select a de-interlacing technique on a field-by-field basis. Thus, per-field de-interlacing methods do not maintain the overall quality throughout an image that contains both stationary and moving regions. By contrast, per-pixel de-interlacing methods select a de-interlacing technique on a pixel-by-pixel basis, thus providing a much better overall quality throughout an image.
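A per-pixel fade between line repetition and field insertion can be sketched as follows. This is a minimal sketch under stated assumptions: motion at a missing pixel is estimated as the absolute difference between the co-located pixels of the previous and next fields, and the fade is linear up to a threshold. The function name motion_adaptive_pixel, its parameters, and the default threshold of 16 are hypothetical choices made for illustration.

```python
def motion_adaptive_pixel(above, prev_pixel, next_pixel, threshold=16):
    """Estimate one missing pixel by fading between two simple methods.

    above       -- pixel directly above, in the current field (line repetition)
    prev_pixel  -- co-located pixel in the previous field (field insertion)
    next_pixel  -- co-located pixel in the next field
    threshold   -- assumed motion level at which the output is fully spatial
    """
    # Assumed motion measure: change at this position between the
    # previous and next fields, which have the same parity.
    motion = abs(next_pixel - prev_pixel)

    # alpha = 0 for a still pixel, 1 for a pixel at or above the threshold.
    alpha = min(motion / threshold, 1.0)

    spatial = above        # line repetition result
    temporal = prev_pixel  # field insertion result
    return alpha * spatial + (1.0 - alpha) * temporal
```

With these assumptions, a still pixel (previous and next values equal) takes the field insertion value, a strongly moving pixel takes the line repetition value, and intermediate motion yields a weighted mix of the two.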
Still other de-interlacing methods are based on identifying the type of the source material from which the interlaced video signal was generated. For example, motion picture film or computer graphics (CG) signals are inherently progressive, i.e., non-interlaced. When the signals are transmitted for broadcasting, the signals are converted into interlaced video signals according to standards such as NTSC and PAL. Well known techniques such as 3:2 pull-down or 2:2 pull-down are used to break the original progressive frames into interlaced video fields while maintaining the correct frame rate. Progressively scanned video sources, such as those shot by progressively scanned electronic cameras, are inherently progressive in nature but are transmitted in interlaced formats according to standards such as NTSC and PAL, or via progressive segmented frame (PsF) transport in the ITU-R BT.709-4 standard. De-interlacing signals originating from such non-interlaced (progressive) sources can be achieved with high quality if the original progressive frame sequences can be identified and reconstructed correctly. Thus, by recognizing that a video sequence originates from a progressive source, the original progressive frames can be reconstructed exactly by merging the appropriate video fields.
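The field sequence produced by 3:2 pull-down can be sketched as follows. This is an illustrative sketch only: each progressive frame is represented by a label, each output field is represented by a (frame label, parity) pair, and the first frame is assumed to contribute three fields. The function name pulldown_3_2 and the parity names "top" and "bottom" are hypothetical. In an actual transmission the frame labels are not available, which is why the source frames must be inferred from the field content.

```python
def pulldown_3_2(frames):
    """Map progressive frames to interlaced fields using a 3:2 pattern.

    Frames alternately contribute three fields and two fields, while field
    parity alternates strictly, so four frames become ten fields.
    """
    fields = []
    parity = 0  # 0 = top field, 1 = bottom field
    for index, frame in enumerate(frames):
        count = 3 if index % 2 == 0 else 2
        for _ in range(count):
            fields.append((frame, "top" if parity == 0 else "bottom"))
            parity ^= 1  # the next field always has the opposite parity
    return fields
```

For frames A, B, C, and D this produces the ten fields A-top, A-bottom, A-top, B-bottom, B-top, C-bottom, C-top, C-bottom, D-top, D-bottom. The third field repeats the first, and the eighth repeats the sixth; merging a top field and a bottom field that carry the same label recovers the original frame exactly.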
Unfortunately, most video transmission formats do not include explicit information about the type of source material being carried, such as whether the material was derived from a progressive source. Thus, in order for a video processing device to exploit the progressive nature of film, CG, or PsF sources, it is first necessary to determine whether the material originates from a progressive source. If it is determined that the material originates from such a source, it is furthermore necessary to determine precisely which video fields originate from which source frames.
Typically, the progressive nature of the source of the interlaced video signal can be determined by examining the motion between fields of an input video sequence. It is well known that a 3:2 pull-down conversion produces a characteristic motion pattern or cadence between same-parity fields, and that a 2:2 pull-down conversion produces another characteristic motion pattern between opposite-parity fields. Accordingly, when such a pattern or cadence is detected, the de-interlacer can enter a “progressive source mode”. Nevertheless, comparing motion between same-parity fields only (or opposite-parity fields only) can be unreliable and can result in false detections or missed detections. In both cases, the resulting progressive video output can exhibit undesirable visual artifacts because of inappropriate field merging (false detection) or non-optimal per-pixel interpolation (missed detection). This results in degraded video quality, e.g., feathering and loss of vertical resolution, for the progressive video output. Moreover, if the interlaced input video signal is derived from a conversion process other than a 3:2 or 2:2 pull-down, or if the source signal is more complex than a pure progressive frame, e.g., a film/video overlap, cross-fade, or split screen, the cadence-based detection method cannot reliably detect the nature of the progressive source and the quality of the resultant progressive video output will suffer.
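A conventional same-parity cadence check of the kind described above can be sketched as follows. This is a simplified sketch, assuming that a difference value between each field and the field two positions earlier has already been computed, and that a 3:2 cadence shows up as exactly one low difference in every five. The function name detect_3_2_cadence, the threshold of 4.0, and the ten-sample minimum are hypothetical choices made for illustration.

```python
def detect_3_2_cadence(same_parity_diffs, threshold=4.0):
    """Return the cadence phase (0 to 4) if a 3:2 pattern is present, else None.

    same_parity_diffs -- one difference value per field, comparing the field
                         with the same-parity field two positions earlier
    threshold         -- assumed level below which two fields count as repeats
    """
    low = [diff < threshold for diff in same_parity_diffs]

    # Require at least two full cadence periods before deciding.
    if len(low) < 10:
        return None

    for phase in range(5):
        # A match needs a low difference at every fifth position starting
        # at this phase, and a high difference everywhere else.
        if all(is_low == (i % 5 == phase) for i, is_low in enumerate(low)):
            return phase
    return None
```

The sketch also exhibits the weaknesses noted above. A still scene makes every difference low, so no cadence is found (a missed detection), and noise or periodic motion that crosses the fixed threshold at the wrong positions can either break a true cadence or imitate a false one.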
Accordingly, there exists a need for an improved process and apparatus for converting an interlaced video signal originating from a progressive source into a progressively scanned video signal. The method and apparatus should be able to minimize visual artifacts resulting from motion and should be relatively easy to implement.