The present invention relates generally to a video format conversion system, and in particular to a video format conversion system for converting interlaced video to progressive video and interlaced video to interlaced video of a different size.
Interlaced signals are generated from a line-by-line scan of an image scene. The signals are generated by scanning and transmitting every other line of the image scene. In this way, even lines of the image scene are scanned and transmitted and then odd lines are scanned and transmitted. The even and odd lines in the image scene are referred to as the even field and the odd field, respectively. A time delay between the image scene capture of the even and odd fields is approximately one sixtieth of a second. A combination of the even field and the odd field is often referred to as a frame of image data. The frame comprises information required to represent the entire image scene.
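The relationship between a frame and its two fields can be illustrated with a minimal Python sketch. The function name `split_fields` and the list-of-rows frame representation are assumptions made for illustration only. In a real interlaced signal the two fields are captured at different instants, whereas this sketch splits a single static frame.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into its even and odd fields."""
    even_field = frame[0::2]  # lines 0, 2, 4, ...
    odd_field = frame[1::2]   # lines 1, 3, 5, ...
    return even_field, odd_field


# A four-line frame in which each row holds its own line number.
frame = [[0, 0], [1, 1], [2, 2], [3, 3]]
even_field, odd_field = split_fields(frame)
```

Each field carries half of the vertical information of the frame, which is the source of the information shortfall discussed below.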
At an interlaced monitor, the fields are displayed by alternately rendering the information in the even field to the even lines on the monitor and the information in the odd field to the odd lines on the monitor. The overall effect, as perceived by a viewer, is the complete reconstruction of the image scene. That is, the image scene appears to contain all vertical information. Despite the apparent reconstruction, displaying interlaced video signals has several drawbacks, which include introducing artifacts associated with interlaced signals.
Alternately, the interlaced signal may be deinterlaced for display on a progressive monitor. The progressive monitor displays each line of the image scene progressively. That is, each line of data is displayed in order, starting from the top row of the display and progressing row by row to the bottom. Furthermore, progressively scanned display formats present all lines in the image scene at sixty frames per second. However, interlaced video signals only transmit half of the image scene every one sixtieth of a second. Since there is no delay between the presentation of the even and odd rows of image scene information, the number of scan lines of data visible at a given instant in time is twice that which is visible in a corresponding interlaced system. Thus, there is an information shortfall during format conversion. The format conversion challenge is that of reconstructing the entire image scene at an instant in time, even though only half of it is available at that time.
Format conversion from an interlaced signal to a progressive signal may be accomplished in a number of ways. One of the simplest methods is field meshing. Field meshing is a process whereby the even field information is copied to the even lines and the odd field information is copied to the odd lines. There is no regard for the temporal delay between the even and odd fields. An obvious problem with this approach is an abhorrent visual quality that results when motion is present in the image scene. Specifically, a well-known artifact referred to as "feathering" results.
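Field meshing, and the way it produces feathering, can be sketched in a few lines of Python. The function name `mesh_fields` and the pixel values are hypothetical and chosen only to make the artifact visible. The example assumes a vertical bar that moves one pixel to the right between the capture of the even field and the capture of the odd field.

```python
def mesh_fields(even_field, odd_field):
    """Weave two fields into one frame, ignoring the time delay between them."""
    frame = []
    for even_row, odd_row in zip(even_field, odd_field):
        frame.append(list(even_row))  # goes to an even output line
        frame.append(list(odd_row))   # goes to the following odd output line
    return frame


# The bright bar (value 9) sits in column 1 in the even field and in
# column 2 in the odd field because it moved between the two captures.
even_field = [[0, 9, 0, 0], [0, 9, 0, 0]]
odd_field = [[0, 0, 9, 0], [0, 0, 9, 0]]
frame = mesh_fields(even_field, odd_field)
```

In the meshed frame the bar alternates between column 1 and column 2 on successive lines, giving the comb-like edge that characterizes feathering.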
Line doubling is an alternate manner of interlaced-to-progressive conversion. Line doubling involves the interpolation of missing lines based only on lines in the field available at that time. The line doubling method has several drawbacks, one of which is the complete discounting of relevant image scene information in a previous field. Line doubling can lead to flicker and to a loss of vertical detail. Flicker is usually most noticeable around hard edges such as lines and text within the image scene, and generally in areas containing fine detail.
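A minimal sketch of line doubling follows. It assumes, for illustration only, that each missing line is the average of the field lines directly above and below it, and that the last field line is repeated at the bottom of the image. Other interpolators are possible, and the name `line_double` is hypothetical.

```python
def line_double(field):
    """Build a full-height image using only the lines of a single field."""
    out = []
    for i, row in enumerate(field):
        out.append(list(row))
        # Use the next field line if there is one, otherwise repeat this line.
        below = field[i + 1] if i + 1 < len(field) else row
        out.append([(a + b) / 2 for a, b in zip(row, below)])
    return out


field = [[0, 0], [10, 10]]
doubled = line_double(field)
```

No pixel from the previous field contributes to the result, which is why vertical detail carried only by that field is lost.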
Flicker is the result of horizontal, or near horizontal, structures in the image scene having a spatial frequency content that cannot be adequately represented by an interlaced signal because an interlaced image is sub-sampled vertically. A stationary horizontal edge, for example, lies on an even row or an odd row, at any given instant in time, in an interlaced image. Therefore, the edge frequently alternates back and forth between the even and odd field, causing a viewer to perceive the edge as flickering. Flicker reduction is important because many menus such as On Screen Displays, DVD Menus, and other forms of information containing hard (high contrast) and persistent edges need to be kept stable.
Other known artifacts of line doubling include stair-stepping. Stair-stepping occurs because an interlaced image is vertically sub-sampled. That is, not all the row information in the image scene is available at the same instant in time. The consequence is a stair-stepping artifact.
Format conversion algorithms aim to minimize the artifacts that result during conversion. Conversion can take on the form of interlaced-to-interlaced conversion, or interlaced-to-progressive conversion.
Yet an alternate form of deinterlacing employs what is known as motion compensated interpolation. Motion compensated interpolation attempts to compensate for the motion in the image scene by identifying regions or groups of pixels in a previous field and placing them in their correct spatial orientation relative to the current field. Such methods usually employ a block matching strategy that aims to minimize a measurement function aided by a computed motion vector. Such strategies often depend upon vast computational resources to operate effectively, and even then, mismatching errors can result. Mismatches can be perceptually dissatisfying to a viewer, especially when a portion of the image content is placed in the wrong part of the target image.
U.S. Pat. No. 6,141,056 issued to Westerman discloses a video format conversion system for converting interlaced video into progressive video using motion compensation. However, the system described by Westerman involves calculating derivatives and intervals of various quadratic functions. Such a system still requires huge computational power. Furthermore, research has shown that adjusting the filters based solely on motion information results in blotchy processed sequences.
Several format conversion solutions pertaining to deinterlacing involve a technique called vertical-temporal (VT) filtering. This technique is used in an attempt to deal with the problems that arise due to the sub-sampled nature of interlaced video. The basic idea behind VT filtering is to combine pixels in adjacent fields by numerical interpolation in order to compute a target pixel in the processed (deinterlaced) image. However, these methods cannot scale and deinterlace simultaneously. Nor do they deal effectively with the artifacts inherent in interlaced images, namely flicker, stair-stepping and especially artifacts that are motion-related such as feathering.
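The basic idea of VT filtering can be sketched as a fixed weighted sum of pixels from two adjacent fields. The tap layout and the default weights below are illustrative assumptions, not coefficients taken from any particular prior-art filter. The target pixel is assumed to lie on a line that is missing from the current field and present in the previous field.

```python
def vt_filter(above, below, prev_same, w_current=0.5, w_previous=0.5):
    """Combine current-field and previous-field pixels into one target pixel.

    above, below: current-field pixels on the lines adjacent to the target.
    prev_same:    previous-field pixel at the target position.
    The two weights are assumed to sum to one.
    """
    spatial = (above + below) / 2  # vertical interpolation in the current field
    return w_current * spatial + w_previous * prev_same


static_pixel = vt_filter(100, 100, 100)
mixed_pixel = vt_filter(10, 30, 40, w_current=0.75, w_previous=0.25)
```

Because the weights are fixed, the filter treats moving and stationary regions alike, so a previous-field pixel displaced by motion still contributes to the target pixel.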
Therefore, it is an object of the present invention to obviate or mitigate at least some of the above-mentioned disadvantages.
Motion Adaptive Vertical Temporal (MAVT) filtering is useful in format conversion. Format conversion takes on the form of interlaced-to-progressive scan or interlaced-to-interlaced (reinterlacing) conversion in which scaling and deinterlacing, and scaling and reinterlacing occur simultaneously. A MAVT filter combines pixel information from adjacent fields. The MAVT filter adjusts a multi-dimensional filter to alter the contribution from each pixel in the adjacent fields for determining a target pixel. The format conversion method examines various aspects of the pixel information content in the image scene. Pixels are examined in a neighborhood about the interpolated target pixel along several axes. These axes include, among others, a spatio-temporal axis, a noise axis, a motion axis, and an image structure axis such as lines and edges. In addition, historic information from previous pixel data is used to help determine the best set of coefficient weights to apply to the source image data in order to generate the target pixel during format conversion.
In accordance with an aspect of the present invention there is provided an adaptive filter for calculating a target pixel from an interlaced video signal comprising a plurality of frames. Each of the frames comprises an even and an odd field. The filter comprises a quantized motion calculator for estimating an amount of motion about said target pixel and a filter selector for selecting a filter in accordance with the estimated amount of motion. The filter applies a first weighting factor to a plurality of current field pixels and a second weighting factor to a plurality of previous field pixels for creating the target pixel.
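The interplay of the quantized motion calculator and the filter selector can be illustrated with the following sketch. Everything specific in it is an assumption made for illustration: the thresholds, the three-entry filter bank, the use of the largest absolute pixel difference as the motion measure, and the function names. It is not the coefficient set or the motion measure of the invention.

```python
# Hypothetical thresholds that split the motion measure into three levels.
MOTION_THRESHOLDS = [8, 32]

# Hypothetical filter bank indexed by motion level:
# (weight for current-field pixels, weight for previous-field pixels).
FILTER_BANK = [(0.5, 0.5), (0.75, 0.25), (1.0, 0.0)]


def quantize_motion(current_pixels, earlier_pixels):
    """Map the largest absolute pixel difference about the target to a level."""
    diff = max(abs(a - b) for a, b in zip(current_pixels, earlier_pixels))
    return sum(diff >= t for t in MOTION_THRESHOLDS)  # 0, 1 or 2


def target_pixel(current_taps, previous_taps, level):
    """Apply the selected pair of weighting factors to the two sets of pixels."""
    w_current, w_previous = FILTER_BANK[level]
    current = sum(current_taps) / len(current_taps)
    previous = sum(previous_taps) / len(previous_taps)
    return w_current * current + w_previous * previous


still_level = quantize_motion([10, 10], [10, 12])
moving_level = quantize_motion([0, 0], [100, 100])
```

With these assumed values, a stationary neighborhood draws equally on both fields, while a strongly moving neighborhood relies on the current field alone, which avoids pulling displaced previous-field pixels into the target pixel.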
In accordance with a further aspect of the invention, there is provided a feathering detector for detecting a feathering artifact in adjacent odd and even fields about a target pixel. The feathering detector comprises a contour selector, a difference calculator and a plurality of predefined thresholds. The contour selector selects from a plurality of contours, including non-linear contours. The difference calculator calculates a plurality of differences between pixels along the selected contour. The plurality of predefined thresholds are compared with the calculated differences for determining if feathering exists about the pixel. Including non-linear contours among the contour selection improves feathering detection in arbitrary shapes.
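One possible reading of the feathering detector is sketched below. The contour shapes, the threshold values, and the decision rule (every difference along the contour is large and consecutive differences alternate in sign) are assumptions chosen for illustration; the names `CONTOURS`, `differences_along` and `has_feathering` are hypothetical. The frame is assumed to be the two adjacent fields meshed together, so that successive lines come from alternating fields.

```python
# Hypothetical contours as (row offset, column offset) pairs about the pixel.
CONTOURS = {
    "vertical": [(-2, 0), (-1, 0), (0, 0), (1, 0), (2, 0)],
    "bent": [(-2, -1), (-1, 0), (0, 0), (1, 0), (2, 1)],  # a non-linear contour
}


def differences_along(frame, row, col, contour):
    """Differences between consecutive pixels sampled along the contour."""
    samples = [frame[row + dr][col + dc] for dr, dc in contour]
    return [b - a for a, b in zip(samples, samples[1:])]


def has_feathering(frame, row, col, contour_name, thresholds=(20, 20, 20, 20)):
    """Report feathering when all differences are large and alternate in sign."""
    diffs = differences_along(frame, row, col, CONTOURS[contour_name])
    large = all(abs(d) >= t for d, t in zip(diffs, thresholds))
    alternating = all(d1 * d2 < 0 for d1, d2 in zip(diffs, diffs[1:]))
    return large and alternating


# Lines alternate between dark and bright, as in a feathered region.
feathered = [[0] * 3, [100] * 3, [0] * 3, [100] * 3, [0] * 3]
# A steady vertical ramp: large differences, but all of the same sign.
ramp = [[0] * 3, [50] * 3, [100] * 3, [150] * 3, [200] * 3]
```

Under these assumptions, `has_feathering(feathered, 2, 1, "bent")` reports feathering while `has_feathering(ramp, 2, 1, "vertical")` does not, since a genuine vertical gradient produces differences of a single sign.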
In accordance with yet a further aspect of the invention, there is provided a method for selecting a filter for calculating a target pixel from an interlaced video signal. The signal comprises a plurality of frames, each of which comprises an even and an odd field. The method comprises the steps of estimating an amount of motion between consecutive frames about the target pixel, detecting feathering between consecutive fields, detecting vertical edges about said target pixel, and detecting patterns indicative of motion and patterns indicative of no motion. The estimated amount of motion is adjusted in accordance with the feathering, edge and pattern detection and a filter is selected accordingly. A first weighting factor is applied to a plurality of current field pixels and a second weighting factor is applied to a plurality of previous field pixels for creating the target pixel.