1. The Field of the Invention
The present invention relates to the field of digital video. More specifically, the present invention relates to systems for adaptively converting interlaced fields of video into progressive frames on a per pixel basis.
2. The Related Art
Video information may be represented by progressive video or interlaced video. Modern computer monitors typically display progressive video. Conventional television monitors and older computer monitors typically display interlaced video. High definition television may display both interlaced and progressive video.
Progressive video includes a series of frames, where each frame is drawn as consecutive lines from top to bottom. In interlaced video, each frame is divided into a number of fields. Typically, the frame is divided into two fields, one field containing half of the lines (e.g., the even numbered lines), and the other field containing the other half of the lines (e.g., the odd numbered lines). The interlaced video, however, is still temporally ordered so that neighboring interlaced fields may represent video information sampled at different times.
There is often a need to convert interlaced video into progressive video and vice versa. For example, suppose a television broadcaster transmits a conventional television program as a series of interlaced fields. If these interlaced fields are to be displayed on a modern computer monitor (or on a high definition television display) that displays progressive frames, the interlaced fields must be converted into progressive frames.
The conversion involves using one or more fields of interlaced video to generate a frame of progressive video and repeating the process so that a stream of interlaced video is converted into a stream of progressive video. This conversion is often called “deinterlacing”. There are several conventional methods of deinterlacing.
One conventional deinterlacing method is called “scan line interpolation”, in which the lines of a single interlaced field are duplicated to form a first half of the lines in the progressive frame. The second half of the lines in the progressive frame is formed by simply duplicating the same field again and inserting the duplicate, offset by one line, into the second half of the lines to complete the progressive frame. This basic form of scan line interpolation is computationally straightforward and thus uses little, if any, processor resources. However, the vertical resolution of the progressive frame is only half of what the display is capable of displaying.
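By way of illustration only, the basic form of scan line interpolation described above may be sketched as follows. The function name `line_double` and the representation of a field as a list of lines, each line being a list of pixel values, are assumptions made for this sketch and are not taken from any particular implementation.

```python
def line_double(field):
    """Build a progressive frame by emitting every field line twice.

    `field` is assumed to be a list of lines, each a list of pixel values.
    """
    frame = []
    for line in field:
        frame.append(list(line))  # line copied from the field
        frame.append(list(line))  # same line again, offset by one row
    return frame


if __name__ == "__main__":
    # A two-line field becomes a four-line frame.
    print(line_double([[1, 2], [3, 4]]))
```

The resulting frame has twice as many lines as the field, but no new vertical detail, which is why the vertical resolution remains half of what the display could show.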
One variation on the scan line interpolation method is that the second half of the lines in the progressive frame are generated by interpolating (e.g., averaging) the neighboring lines in the interlaced field. This requires somewhat more computational resources, but results in a relatively smooth image. Still, the vertical resolution is only half of what the display is capable of displaying.
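As a minimal sketch of this variation, the missing lines may be produced by averaging the field lines above and below them. The function name `interpolate_field`, the list-of-lists field layout, the integer averaging, and the choice to repeat the last field line at the bottom edge are all assumptions made for illustration.

```python
def interpolate_field(field):
    """Build a progressive frame whose missing lines are vertical averages.

    `field` is assumed to be a list of lines, each a list of integer pixels.
    """
    frame = []
    for i, line in enumerate(field):
        frame.append(list(line))  # line copied from the field
        # Assumption: at the bottom edge there is no line below, so the
        # last field line is averaged with itself (that is, repeated).
        below = field[i + 1] if i + 1 < len(field) else line
        frame.append([(a + b) // 2 for a, b in zip(line, below)])
    return frame


if __name__ == "__main__":
    print(interpolate_field([[0, 10], [20, 30]]))
```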
One deinterlacing method that improves vertical resolution over scan line interpolation is called “field line merging”, in which lines from two consecutive fields are interleaved to form a progressive frame. However, the video information in the first field is not sampled at the exact same moment as the video information in the second field. If there is little movement in the image between the first and second fields, then field line merging tends to produce a quality image at relatively little processing cost. On the other hand, if there is movement between the first and second fields, simply combining fields will not result in a high fidelity progressive frame, since half the lines in the frame represent the video data at a given time and the other half represent a significantly different state at a different time.
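Field line merging may be sketched, purely for illustration, as alternating the lines of two fields. The function name `merge_fields`, the parameter names, and the assumption that the first field supplies the top line of the frame are hypothetical choices for this sketch.

```python
def merge_fields(first_field, second_field):
    """Alternate the lines of two consecutive fields into one frame.

    Both fields are assumed to be lists of lines of equal length, with
    `first_field` holding the top line of the frame.
    """
    frame = []
    for first_line, second_line in zip(first_field, second_field):
        frame.append(list(first_line))
        frame.append(list(second_line))
    return frame


if __name__ == "__main__":
    print(merge_fields([[1, 1], [3, 3]], [[2, 2], [4, 4]]))
```

No arithmetic is performed on the pixel values, which is why the method is inexpensive, and also why any movement between the two fields appears directly in the output as mismatched alternating lines.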
More processing-intensive methods use complex motion compensation algorithms to determine where in the image there is motion, and where there is not. For those areas where there is no motion, field line merging is used because of its improved vertical resolution. For those areas where there is motion, scan line interpolation is used since it eliminates the motion artifacts that would be caused by field line merging. Such motion compensation algorithms may be implemented by the motion estimation block of an MPEG encoder. However, such complex motion compensation methods require large amounts of processing and memory resources.
Therefore, what are desired are systems for deinterlacing that provide a relatively high fidelity progressive frame without having to dedicate the processor and memory resources required by complex motion compensation algorithms.
The principles of the present invention provide for the adaptive deinterlacing of interlaced video to generate a progressive frame on a per pixel basis. In a first embodiment of the present invention, two consecutive fields of interlaced video are converted into a frame of progressive video. One of the fields is replicated to generate half the lines in the progressive frame. Each of the pixels in the other half of the progressive frame is generated pixel-by-pixel.
Specifically, for a given output position of the pixel in the other half of the progressive frame, a correlation is estimated between the corresponding pixel in the non-replicated field and at least one vertically adjacent pixel of the replicated field, and optionally one or more vertically adjacent pixels in the non-replicated field. In one example, a window of pixels one pixel wide by five pixels high is evaluated, centered on the pixel in the non-replicated field that corresponds to the output pixel position.
A value is then assigned to the output pixel that corresponds to the output position, the value depending on the correlation. The deinterlacing in accordance with the present invention interpolates between scan line interpolation and field line merging depending on the correlation. For example, if there is a high vertical correlation, then the value is weighted more toward field line merging for that pixel, since a high correlation suggests less likelihood of movement at that pixel position. If there is a low vertical correlation, then the value is weighted more toward scan line interpolation for that pixel, since a low correlation suggests more likelihood of movement at that pixel position. If there is moderate correlation, a balance of scan line interpolation and field line merging is used. This process is repeated for each pixel in the other half of the progressive frame until the entire progressive frame is generated.
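The per pixel blending may be illustrated by the following sketch. The text above does not specify how the correlation is computed, so the measure used here is purely an assumption: the absolute difference between the candidate pixel of the non-replicated field and the average of the pixels directly above and below it in the replicated field, scaled by a hypothetical `scale` parameter and clamped to the range 0 to 1. The function name `adaptive_pixel` and the use of only a three pixel neighborhood (rather than the five pixel window mentioned above) are likewise illustrative simplifications.

```python
def adaptive_pixel(above, candidate, below, scale=32.0):
    """Blend field line merging and scan line interpolation for one pixel.

    above, below: pixels of the replicated field directly above and below
                  the output position.
    candidate:    pixel of the non-replicated field at the output position.
    scale:        hypothetical tuning value; a difference of `scale` or
                  more is treated as no correlation at all.
    """
    interpolated = (above + below) / 2.0  # scan line interpolation result
    # Assumed correlation estimate: 0.0 means the candidate agrees with its
    # vertical neighbors (high correlation), 1.0 means it does not.
    mismatch = min(abs(candidate - interpolated) / scale, 1.0)
    # High correlation keeps the merged pixel; low correlation falls back
    # to the interpolated value; moderate correlation mixes the two.
    return (1.0 - mismatch) * candidate + mismatch * interpolated


if __name__ == "__main__":
    print(adaptive_pixel(100, 100, 100))  # high correlation: keeps candidate
    print(adaptive_pixel(100, 116, 100))  # moderate: mixes the two values
    print(adaptive_pixel(100, 200, 100))  # low correlation: interpolates
```

Because each output pixel depends only on a few neighboring values, a sketch of this kind needs no motion vector search and no storage beyond the fields themselves.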
Thus, unlike pure scan line interpolation or pure field line merging, the deinterlacing in accordance with the present invention adaptively uses a portion of each method depending on how much motion is detected at the pixel. The mechanism for estimating motion in accordance with the present invention is not as sophisticated as the conventional complex motion compensation methods. However, it provides suitable motion estimation for many video applications. In addition, the deinterlacing algorithm in accordance with the present invention does not require the extensive processing and memory resources that the complex motion compensation methods require. Therefore, the deinterlacing of the present invention is ideally suited for video applications in which processing and memory resources are limited.
In a second embodiment of the invention, three consecutive input fields of interlaced video are converted into two output fields of interlaced video. The second temporal input field is replicated to produce a first of the two output fields. The second output field is generated on a per pixel basis.
Specifically, for a given output pixel corresponding to an output position of the second output field, at least one pixel of the second temporal input field that is vertically adjacent to the output position of the second output field is used to determine which of the first temporal input field and third temporal input field more closely correlates to the second temporal input field at the output position. In one specific case, the upper pixel of the second temporal input field (the upper pixel being directly above the output position of the second output field) is accessed. In addition, the lower pixel of the second temporal input field (the lower pixel being directly below the output position of the second output field) is accessed. The upper pixel and the lower pixel are then averaged. This averaged value is then compared to the value of the corresponding pixel in the first temporal input field and to the value of the corresponding pixel in the third temporal input field.
Then, a value is assigned to the output pixel that is based on the correlation at the output position between the first temporal input field and the second temporal input field, and between the third temporal input field and the second temporal input field. In a specific example, the value leans toward the value of the pixel in whichever of the first temporal input field or third temporal input field is closer at the output position to the averaged value.
In one example, a blending factor is used to determine how much of the value of the pixel in the first temporal input field at the output position, and how much of the value of the pixel in the third temporal input field at the output position, is weighted in assigning the value to the output pixel. If, for a given pixel, the averaged value is closer to the value of the pixel in the first temporal input field at the output position, then the value of the blending factor is altered in one direction. If, on the other hand, the averaged value is closer to the value of the pixel in the third temporal input field at the output position, then the value of the blending factor is altered in the opposite direction. The altered blending factor is carried forward for the analysis of other pixels. Thus, the blending factor changes as pixels in a given line are generated. The blending factor may be reset to a neutral value as each line begins.
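The running blending factor may be sketched for a single output line as follows. Several details are assumptions made only for this illustration and are not stated above: the factor is expressed as the weight given to the first temporal input field, its neutral value is 0.5, it moves by a fixed hypothetical `step` per pixel, it is clamped to the range 0 to 1, and it is updated before being applied to the current pixel. The function and parameter names are likewise hypothetical.

```python
def blend_line(upper, lower, first, third, step=0.25):
    """Generate one line of the second output field.

    upper, lower: lines of the second temporal input field directly above
                  and below the output line.
    first, third: corresponding lines of the first and third temporal
                  input fields.
    step:         hypothetical amount by which the factor moves per pixel.
    """
    blend = 0.5  # assumed neutral value, reset at the start of each line
    out = []
    for u, l, f, t in zip(upper, lower, first, third):
        averaged = (u + l) / 2.0
        diff_first = abs(averaged - f)
        diff_third = abs(averaged - t)
        if diff_first < diff_third:
            blend = min(1.0, blend + step)  # lean toward the first field
        elif diff_third < diff_first:
            blend = max(0.0, blend - step)  # lean toward the third field
        # The altered factor is carried forward to the next pixel.
        out.append(blend * f + (1.0 - blend) * t)
    return out


if __name__ == "__main__":
    # The first field matches the second field's neighborhood, so the
    # output converges on the first field as the line progresses.
    print(blend_line([10, 10, 10], [10, 10, 10], [10, 10, 10], [50, 50, 50]))
```

Carrying the factor forward along the line smooths the decision, so a single noisy pixel does not flip the output abruptly between the first and third temporal input fields.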
The second embodiment takes into consideration which of the first and third temporal input fields is closer to the second temporal input field when determining how much of the first temporal input field and how much of the third temporal input field should be used in generating the second output field. Thus, if there is a big difference between the second and third temporal input fields, the first output field will be the second temporal input field while the second output field will tend more towards the first temporal input field. Likewise, if there is a big difference between the first and second temporal input fields, the first output field will be the second temporal input field while the second output field will tend more towards the third temporal input field. This is especially useful when performing inverse telecine.
Additional features and advantages of the invention will be set forth in the description, which follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.