This invention relates in general to digital video signal processing and more particularly to a method and apparatus whereby motion between odd and even video fields may be reliably measured despite the presence of high vertical spatial frequencies.
The NTSC and PAL video standards are in widespread use throughout the world today. Both of these standards make use of interlacing in order to maximize the vertical refresh rate thereby reducing wide area flicker, while minimizing the bandwidth required for transmission. With an interlaced video format, half of the lines that make up a picture are displayed during one vertical period (i.e. the even field), while the other half are displayed during the next vertical period (i.e. the odd field) and are positioned halfway between the lines displayed during the first period. While this technique has the benefits described above, the use of interlacing can also lead to the appearance of artifacts such as line flicker and visible line structure.
It is well known in the prior art that the appearance of an interlaced image can be improved by converting it to non-interlaced (progressive) format and displaying it as such. Moreover, many newer display technologies, for example Liquid Crystal Displays (LCDs), are non-interlaced by nature; conversion from interlaced to progressive format is therefore necessary before an image can be displayed at all.
Numerous methods have been proposed for converting an interlaced video signal to progressive format. For example, linear methods have been used, where pixels in the progressive output image are generated as a linear combination of spatially and/or temporally neighbouring pixels from the interlaced input sequence.
Although this approach may produce acceptable results under certain conditions, the performance generally represents a trade off between vertical spatial resolution and motion artifacts. Instead of accepting a compromise, it is possible to optimize performance by employing a method that is capable of adapting to the type of source material. For instance, it is well known that conversion from interlaced to progressive format can be accomplished with high quality for sources that originate from motion picture film or from computer graphics (CG). Such sources are inherently progressive in nature, but are transmitted in interlaced format in accordance with existing video standards. For example, motion picture film created at 24 frames per second is converted to interlaced video at 60 fields per second using a process known as 3:2 pull down, where 3 fields are derived from one frame and 2 are derived from the next, so as to provide the correct conversion ratio. Similarly, a computer graphics sequence created at 30 frames per second is converted to interlaced video at 60 fields per second using a pull down ratio of 2:2, where 2 fields are derived from each CG frame. By recognizing that a video sequence originates from a progressive source, it is possible for a format converter to reconstruct the sequence in progressive format exactly as it was before its conversion to interlaced format.
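The pull down ratios described above can be illustrated with a minimal sketch. The function name `pulldown_fields` and the (frame index, parity) representation are assumptions made for illustration only; they do not come from any video standard, and the choice of which parity comes first is arbitrary.

```python
def pulldown_fields(num_frames, cadence):
    """Return one (frame_index, parity) tuple per interlaced output field.

    cadence is the repeating pattern of fields derived from each source
    frame, e.g. (3, 2) for film or (2, 2) for 30 frame-per-second CG.
    Field parity alternates strictly, as it does in interlaced video.
    """
    fields = []
    for frame in range(num_frames):
        repeat = cadence[frame % len(cadence)]
        for _ in range(repeat):
            parity = "even" if len(fields) % 2 == 0 else "odd"
            fields.append((frame, parity))
    return fields


# Four film frames become ten fields (24 frames/s * 2.5 = 60 fields/s).
film = pulldown_fields(4, (3, 2))
# Four CG frames become eight fields (30 frames/s * 2 = 60 fields/s).
cg = pulldown_fields(4, (2, 2))

print([frame for frame, _ in film])  # source frame of each field, 3:2
print([frame for frame, _ in cg])    # source frame of each field, 2:2
```

A format converter that recovers this mapping can pair each field with the opposite-parity field derived from the same source frame and so rebuild the original progressive frame.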
Unfortunately, video transmission formats do not include explicit information about the type of source material being carried, such as whether the material was derived from a progressive source. Thus, in order for a video-processing device to exploit the progressive nature of film or CG sources, it is first necessary to determine whether the material originates from a progressive source. If it is determined that the material originates from such a source, it is furthermore necessary to determine precisely which video fields originate from which source frames. Such determination can be made by measuring the motion between successive fields of an input video sequence.
It is common to measure at least two different modes of motion in determining the presence of a film source. Firstly, it is common to measure the motion between a given video field and that which preceded it by two fields. In this case, motion can be measured as the absolute difference between two pixels at the same spatial position in the two fields. A measure of the total difference between the two fields can be generated by summing the absolute differences at the pixel level over the entire field. The quality of the motion signal developed in this way will be fairly high, since the two fields being compared have the same parity (both odd or both even) and therefore corresponding samples from each field have the same position within the image. Thus any difference that is measured between two pixels will largely be the result of motion. Although the quality of measurement made in this way is high, unfortunately it is of limited value. For an input sequence derived from film in accordance with a 3:2 pull down ratio, only one out of five successive measurements made in this way will differ significantly from the rest. The measure of motion between the first and third fields of the three fields that are derived from the same motion picture frame will be substantially lower than the measurements obtained during the other four fields, since the two fields being compared are essentially the same and differ only in their noise content. This does not provide sufficient information to avoid artifacts under certain conditions when a film sequence is interrupted. Also, in the case of an input sequence derived from film or CG in accordance with a 2:2 pull down ratio, no useful information is provided whatsoever.
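The same-parity measurement described above can be sketched as follows. The helper names and the synthetic test fields are hypothetical: each field is a small list of rows, and every pixel in source frame f is offset by 10*f, so any two fields derived from different frames differ while fields derived from the same frame are identical.

```python
def field_difference(field_a, field_b):
    """Sum of absolute pixel differences between two same-parity fields."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(field_a, field_b)
        for a, b in zip(row_a, row_b)
    )


def same_parity_motion(fields):
    """Compare each field with the field that preceded it by two fields."""
    return [
        field_difference(fields[i], fields[i - 2])
        for i in range(2, len(fields))
    ]


def make_field(frame_index, parity, rows=4, cols=2):
    """Synthetic field: lines of the given parity taken from a toy frame."""
    return [
        [10 * frame_index + r for _ in range(cols)]
        for r in range(parity, rows, 2)
    ]


def sequence(frame_indices):
    """Build alternating-parity fields from a list of source frame indices."""
    return [make_field(f, i % 2) for i, f in enumerate(frame_indices)]


# 3:2 cadence: only one measurement in five is low.
film_32 = same_parity_motion(sequence([0, 0, 0, 1, 1, 2, 2, 2, 3, 3]))
# 2:2 cadence: every measurement sees a frame change.
cg_22 = same_parity_motion(sequence([0, 0, 1, 1, 2, 2, 3, 3]))

print(film_32)
print(cg_22)
```

In this toy sequence the 3:2 measurements drop to zero once every five fields, whereas the 2:2 measurements never vary, which is consistent with the observation that this mode yields no useful information for 2:2 material.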
A second mode of motion that can be measured is the motion between successive fields which are of opposite parity (one odd and one even). Although this mode of measurement overcomes the limitations of the first mode, it is inherently a more difficult measurement to make, since a spatial offset exists between fields that are of opposite parity. Thus, even if there is no actual motion, a finite difference between the fields may exist owing to the spatial offset. This tends to increase the measured difference when there is no motion, making it more difficult to reliably discriminate between the presence and absence of motion. This is particularly true in the presence of noise and/or limited motion. A number of methods have been proposed in the prior art for the measurement of motion between fields of opposite parity. It is an objective of the present invention to provide a method for the measurement of motion between fields of opposite parity with greater ability to discriminate between the presence of motion or lack thereof than those of the prior art.
Various techniques, besides the linear methods described above, have also been proposed for converting video material not derived from film from interlaced to progressive format. For example, if it can be determined whether specific parts of an image are in motion, then each part can be processed accordingly to achieve better results. This requires the measurement of motion locally and is akin to the problem of measuring motion globally as required to determine the presence of film sources. The same elemental operations may be used to measure differences at a pixel level, only in the latter case the differences are summed over an entire field to produce a global measurement, whereas in the former case the difference may be used as a measure of local motion without further summation. As with the global case, the local case may involve various modes of measurement. One of the modes that can be used to advantage is the local measurement of motion between successive fields of opposite parity. It is a further objective of the present invention to provide such a method.
The following patents are relevant as prior art relative to the present invention:
According to the present invention, a method and apparatus are provided whereby the motion between two fields of opposite parity may be measured with greater ability to discriminate between the presence of motion and lack thereof than with those techniques of the prior art. The level of motion between the two fields at a specific position is determined by comparing the values of four vertically adjacent pixels, each having the same horizontal position. The first and third pixels are taken from vertically adjacent lines in one field, and the second and fourth pixels are taken from vertically adjacent lines in the other field, such that the vertical position of the second pixel is halfway between the first and third pixels and the vertical position of the third pixel is halfway between the second and fourth pixels. If the value of the second pixel lies between the values of the first and third pixels, or if the value of the third pixel lies between the values of the second and fourth pixels, then the local motion is taken as zero. Otherwise, the local motion is taken as the minimum of the absolute differences between the first and second pixels, between the second and third pixels, and between the third and fourth pixels.
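The four-pixel rule just described can be written out as a short sketch. The function names are hypothetical, and treating "lies between" as inclusive of the end values is an assumption, since the text does not say how equal values are handled.

```python
def between(x, a, b):
    """True if x lies between a and b (inclusive bounds are an assumption)."""
    return min(a, b) <= x <= max(a, b)


def local_motion(p1, p2, p3, p4):
    """Local motion from four vertically adjacent pixels in one column.

    p1 and p3 come from adjacent lines of one field; p2 and p4 come from
    adjacent lines of the opposite-parity field, so that the four values
    are ordered top to bottom in the interleaved frame.
    """
    if between(p2, p1, p3) or between(p3, p2, p4):
        return 0
    return min(abs(p1 - p2), abs(p2 - p3), abs(p3 - p4))


# A smooth vertical ramp is consistent with a static image: no motion.
print(local_motion(10, 12, 14, 16))
# Values that alternate from line to line suggest the fields disagree.
print(local_motion(100, 20, 100, 20))
# The smallest of the three adjacent differences is reported.
print(local_motion(10, 50, 10, 40))
```

Taking the minimum rather than the maximum of the three differences keeps the measure conservative: a single large step between neighbouring lines does not by itself produce a large motion value.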
This technique has the benefit that false detection of motion arising from the presence of high vertical spatial frequencies is minimized, while actual motion is still readily detected. Using this technique, false detection is completely avoided for vertical spatial frequencies less than one half of the vertical frame Nyquist frequency. Utilizing more than four pixels extends the range of vertical spatial frequencies for which false detection is completely avoided irrespective of the vertical frame Nyquist frequency. In general, if the method of the present invention is scaled to utilize n pixels where n is greater than or equal to four, then false detection of motion is avoided for frequencies up to and including (n − 3)/(n − 2) of the vertical frame Nyquist frequency. In any case, the resulting local measurement of motion can either be used directly or summed over an entire field in order to provide a global motion signal that is useful for determining whether an input sequence derives from a film source.
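The behaviour on a static image can be checked numerically with a small sketch. Everything here is illustrative: a single column of a motionless frame carries a vertical sinusoid, the four-pixel rule is restated compactly (with inclusive comparisons as an assumption) so that the sketch is self-contained, and the sum over one column stands in for the sum over an entire field. A plain adjacent-line difference is included only as a hypothetical baseline for comparison.

```python
import math


def local_motion(p1, p2, p3, p4):
    """Four-pixel local motion; zero when p2 or p3 lies between its neighbours."""
    if min(p1, p3) <= p2 <= max(p1, p3) or min(p2, p4) <= p3 <= max(p2, p4):
        return 0
    return min(abs(p1 - p2), abs(p2 - p3), abs(p3 - p4))


def static_column(fraction_of_nyquist, lines=64, phase=0.3):
    """One column of a static frame with a vertical sinusoid.

    The frame Nyquist frequency is 0.5 cycles per line, so the phase
    advances by pi * fraction_of_nyquist radians from line to line.
    """
    step = math.pi * fraction_of_nyquist
    return [100.0 * math.sin(step * y + phase) for y in range(lines)]


def summed_motion(column):
    """Sum the four-pixel measure over every vertical position."""
    return sum(local_motion(*column[y:y + 4]) for y in range(len(column) - 3))


def naive_motion(column):
    """Baseline: sum of absolute differences between adjacent lines."""
    return sum(abs(column[y] - column[y + 1]) for y in range(len(column) - 1))


for fraction in (0.1, 0.25, 0.45, 1.0):
    column = static_column(fraction)
    print(fraction, summed_motion(column), naive_motion(column))
```

For the three frequencies below one half of the frame Nyquist frequency the summed four-pixel measure is zero even though nothing but spatial detail separates the fields, while the baseline reports a nonzero difference. At the frame Nyquist frequency itself the four-pixel measure also responds, which matches the limit stated for n equal to four.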
According to a further aspect of the present invention, the contributing pixels are chosen such that their spatial positions remain constant regardless of whether the most recent of the two fields is even or odd. In this way, any motion that is falsely detected in a static image remains constant from one field to the next, thereby improving the ability to distinguish between falsely detected motion and actual motion that arises as a result of a sequence that was generated in accordance with a 2:2 pull down ratio.