This invention relates in general to digital video signal processing and more particularly to a method and apparatus for de-interlacing video fields to progressive scan video frames using motion adaptive techniques.
The NTSC and PAL video standards are in widespread use throughout the world today. Both of these standards make use of interlacing in order to maximize the vertical refresh rate thereby reducing wide area flicker, while minimizing the bandwidth required for transmission. With an interlaced video format, half of the lines that make up a picture are displayed during one vertical period (e.g. the even field), while the other half are displayed during the next vertical period (e.g. the odd field) and are positioned halfway between the lines displayed during the first period. While this technique has the benefits described above, the use of interlacing can also lead to the appearance of artifacts such as line flicker and visible line structure.
It is well known in the prior art that the appearance of an interlaced image can be improved by converting it to non-interlaced (progressive) format and displaying it as such. Moreover, many newer display technologies, for example Liquid Crystal Displays (LCDs), are non-interlaced by nature; conversion from interlaced to progressive format is therefore necessary before an image can be displayed at all.
Numerous methods have been proposed for converting an interlaced video signal to progressive format. For example, linear methods have been used whereby missing pixels in the progressive output sequence are generated as a linear combination of spatially and/or temporally neighbouring pixels from the interlaced input sequence, such as described in U.S. Pat. No. 6,266,092 (Wang). Although this approach may produce acceptable results under certain conditions, the performance generally represents a trade off between vertical spatial resolution and motion artifacts. Rather than accept this compromise, it is possible to achieve enhanced performance by employing a method that is capable of adapting to the type of source material. For instance, it is well known that conversion from interlaced to progressive format can be accomplished with high quality for sources that originate from motion picture film or from computer graphics (CG). Such sources are inherently progressive in nature, but are transmitted in interlaced format in accordance with existing video standards. For example, motion picture film created at 24 frames per second is converted to interlaced video at 60 fields per second using a process known as 3:2 pull down, where 3 fields are derived from one frame and 2 are derived from the next, so as to provide the correct conversion ratio. Similarly, a computer graphics sequence created at 30 frames per second is converted to interlaced video at 60 fields per second using a pull down ratio of 2:2, where 2 fields are derived from each CG frame. By recognizing that a video sequence originates from a progressive source, it is possible for a format converter to reconstruct the sequence in progressive format exactly as it was before the conversion to interlaced format.
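The pull down cadences described above can be sketched in a few lines of Python. This is only an illustration of the field-to-frame bookkeeping; the function name `pulldown_fields` and the "top"/"bottom" parity labels are assumptions introduced here, not terms from the specification.

```python
def pulldown_fields(num_frames, cadence=(3, 2)):
    """Return one (source_frame_index, parity) pair per output field.

    cadence gives how many fields are derived from successive frames:
    (3, 2) models 3:2 pull down, (2, 2) models 2:2 pull down.
    Field parity alternates on every field, independent of the source frame.
    """
    fields = []
    for frame in range(num_frames):
        repeat = cadence[frame % len(cadence)]
        for _ in range(repeat):
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields


# 24 film frames -> 60 fields; 30 CG frames -> 60 fields.
film_fields = pulldown_fields(24, (3, 2))
cg_fields = pulldown_fields(30, (2, 2))
```

Because every field carries the index of the frame it was derived from, a format converter that has identified the cadence can pair fields from the same source frame and rebuild that frame without interpolation.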
For video that is not derived from a progressive source, there are other alternatives to linear processing. For instance, U.S. Pat. No. 4,989,090 (Campbell) describes one approach to a technique generally referred to as motion adaptive de-interlacing. In this method, missing pixels are generated in one of two different ways depending on whether motion is detected in the vicinity of the missing pixel. If little or no motion is detected, then the missing pixel is derived primarily from its temporal neighbours, thereby giving the best vertical resolution for static portions of the image. If a higher amount of motion is detected, then the missing pixel is derived primarily from its vertical neighbours, thereby avoiding motion artifacts, albeit at the expense of vertical resolution. Depending on the degree of motion detected, the missing pixel may be derived using a greater or lesser contribution from its temporal neighbours and vertical neighbours. This technique is used today in numerous consumer electronic systems. It should be noted that, in this specification, the term “degree of motion” includes the absence of motion.
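A minimal sketch of the blending just described is given below, assuming that each estimate is a simple two-tap average and that the degree of motion has already been mapped to a weight `alpha` between 0 and 1. The function name, the parameter names and the choice of averaging are assumptions made for illustration, not the method of the cited patent.

```python
def motion_adaptive_pixel(above, below, prev_field, next_field, alpha):
    """Estimate one missing pixel by mixing temporal and vertical neighbours.

    above, below:           vertical neighbours in the current field
    prev_field, next_field: pixels at the same position in adjacent fields
    alpha:                  0.0 = no motion (temporal), 1.0 = full motion (vertical)
    """
    temporal = (prev_field + next_field) / 2.0   # static case estimate
    vertical = (above + below) / 2.0             # motion case estimate
    return (1.0 - alpha) * temporal + alpha * vertical


# Static area: the temporal neighbours carry the true value of the missing line.
static_value = motion_adaptive_pixel(10, 30, 100, 100, alpha=0.0)   # 100.0
# Moving area: fall back to the vertical neighbours of the current field.
moving_value = motion_adaptive_pixel(10, 30, 100, 100, alpha=1.0)   # 20.0
```

In this example the two estimates differ widely (100 against 20), which shows how strongly the output can depend on the weight chosen for a given pixel.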
In order to achieve adequate performance in the above system, it is necessary for the system to derive the missing pixel from its vertical neighbours even when only very small amounts of motion are detected. This is necessary to avoid motion artifacts commonly referred to as “feathering” which result when pixels from different fields are erroneously combined in the presence of motion. Since the result obtained when interpolating between vertical neighbours, as in the motion case, may be quite different from that obtained when using the temporal neighbours, as in the static case, certain artifacts may be produced as a result of a transition between the two cases. The artifacts are a result of the property that a small change in an otherwise static portion of the image may produce a much larger change at the output. Consequently, noise may be amplified and vertical detail may tend to scintillate in the presence of subtle motion. These artifacts are inherent in such systems and may be reduced but not completely eliminated.
The preceding problem may be partially alleviated by transitioning smoothly between the motion and static cases with varying degrees depending on the level of motion detected. In order to avoid the feathering artifacts described above, experimental observation has shown that the initial transition towards the motion case must begin for small amounts of motion (in the range of 2-3% of full scale value) and that full transition to the motion case must occur shortly thereafter (in the range of 5-10% of full scale value). Therefore, the function that relates the weightings of the static and motion cases to the measured motion value will have high gain in the transition region. Hence, the system will, to a large degree, still possess the property that a small change at the input may produce a much larger change at the output. This is a property which, as described earlier, can lead to noise and scintillation artifacts. It is an objective of the present invention to provide a method to alleviate the problems associated with the critical transition region of motion adaptive de-interlacers.
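The high gain of the transition region can be made concrete with a short Python sketch. It assumes a linear ramp between two breakpoints taken from the ranges quoted above (2.5% and 7.5% of full scale); the ramp shape, the breakpoint values and the function names are assumptions chosen for illustration only.

```python
def motion_weight(motion, start=0.025, full=0.075):
    """Map a motion measure (fraction of full scale) to a motion-case weight.

    Below `start` the static case is used exclusively; above `full` the
    motion case is used exclusively; in between the weight ramps linearly.
    """
    if motion <= start:
        return 0.0
    if motion >= full:
        return 1.0
    return (motion - start) / (full - start)


def transition_gain(start=0.025, full=0.075):
    """Slope of the ramp: change in weight per unit change in motion."""
    return 1.0 / (full - start)


gain = transition_gain()   # approximately 20
```

With these assumed breakpoints the slope is about 20, so a change in the measured motion of 1% of full scale shifts the weighting between the two cases by about 20%.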
The following patents are relevant as prior art relative to the present invention:
According to the present invention, a method and apparatus are provided for motion adaptive de-interlacing of interlaced signals with greater immunity to noise and scintillation artifacts than is commonly associated with prior art solutions. In the present invention, vertical interpolation, which the prior art employs in the presence of motion, is replaced by a two-dimensional, non-separable, vertical-temporal interpolation filter with specific frequency characteristics. The vertical-temporal filter is designed such that for static image portions, the contribution from the current field (the field for which the missing pixel is being derived) is enhanced by a contribution from one or more adjacent fields so as to provide an estimate for the missing pixel which is a better approximation to that which would have been calculated using temporal interpolation as normally employed in the absence of motion. The fact that the estimate for the missing pixel will be similar for static portions of the image regardless of whether vertical-temporal or temporal interpolation is used, reduces the artifacts associated with the transition between the two processing modes. Furthermore, the vertical-temporal filter is designed such that feathering artifacts in moving portions of the image are avoided.
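As a rough illustration of the kind of filter described above, the following Python sketch applies a small vertical-temporal kernel to a single missing pixel. The tap weights and all names (`CURRENT_TAPS`, `ADJACENT_TAPS`, `vt_estimate`, `vertical_estimate`) are hypothetical and are not the coefficients of the invention. The assumed design point is that the current-field taps sum to one while the adjacent-field taps sum to zero, so that the adjacent field contributes vertical detail only.

```python
# Hypothetical tap weights, keyed by line offset from the missing pixel.
CURRENT_TAPS = {-1: 0.5, +1: 0.5}                # current field, sums to 1
ADJACENT_TAPS = {-2: -0.25, 0: 0.5, +2: -0.25}   # adjacent field, sums to 0


def vt_estimate(current, adjacent, y):
    """Vertical-temporal estimate of the missing pixel on line y.

    current, adjacent: dicts mapping frame line number -> pixel value for
    the lines actually present in each field.
    """
    est = sum(w * current[y + dy] for dy, w in CURRENT_TAPS.items())
    est += sum(w * adjacent[y + dy] for dy, w in ADJACENT_TAPS.items())
    return est


def vertical_estimate(current, y):
    """Plain vertical interpolation from the current field only."""
    return 0.5 * (current[y - 1] + current[y + 1])


# Static image with a one-line detail of value 100 on the missing line 2.
cur = {1: 0, 3: 0}
adj = {0: 0, 2: 100, 4: 0}
vt_static = vt_estimate(cur, adj, 2)          # 50.0
vert_static = vertical_estimate(cur, 2)       # 0.0

# Moving flat region: the adjacent field shows a different, uniform level.
vt_moving = vt_estimate({1: 100, 3: 100}, {0: 0, 2: 0, 4: 0}, 2)   # 100.0
```

In the static example the vertical-temporal estimate (50) lies closer to the true value (100) than vertical interpolation alone (0). In the moving example the uniform adjacent field contributes nothing, so the output follows the current field and the two fields are not mixed.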
Although vertical-temporal interpolation outperforms vertical interpolation for static portions of an image, vertical interpolation is generally better than vertical-temporal interpolation for areas of high motion, since the adjacent field component associated with the latter comes from an area of the image that may be uncorrelated with the current field component. Thus it may not be obvious to a person of ordinary skill in the art that vertical-temporal interpolation would be preferred over vertical interpolation in the presence of motion, as provided by the method of the present invention. However, the benefit of this approach derives from the fact that there is a finite range of velocities for which vertical-temporal interpolation offers an advantage over vertical interpolation. Since the transition between the static and motion modes needs to occur even for small amounts of motion, vertical-temporal interpolation is utilized in cases where it offers an advantage. The lesser performance of vertical-temporal interpolation at higher levels of motion is generally acceptable to the human eye.