The present invention relates generally to the processing of video images and more particularly to techniques for converting video sequence formats with scaling. The multiplicity of video formats makes it necessary to adapt the format of an incoming video to the format required by a display device or for transmission. In an interlaced video, each successive image, also called a field, provides alternately in time the even and the odd lines. Standard Definition television (SDTV) formats, such as NTSC, PAL and SECAM, are interlaced. An interlaced monitor such as a cathode ray tube (CRT) is designed to display these fields, with an interlaced scanning of even and odd lines. Progressive video image formats incorporate all lines (even and odd) in each image. LCD video monitors are designed to display progressive videos. For an interlaced input video it is thus necessary to perform a deinterlacing that computes the missing odd or even lines, respectively, in each field of the interlaced video sequence, to output a progressive video. Several High Definition Television (HDTV) formats have also been adopted, with a progressive format 720p of 720 lines or an interlaced format 1080i of 1080 lines. Converting an SDTV video format to an HDTV video format typically requires performing a deinterlacing and changing the number of lines, for example from 486 for NTSC to 720 for 720p HDTV.
Deinterlacing and changing the number of lines of a video sequence is a scaling operation that converts an input video defined on an interlaced sampling grid to an output video defined on an output sampling grid. A deinterlacing is a scaling where the output sampling grid includes both even and odd lines. A further modification of the number of lines further modifies the output sampling grid. Scaling videos requires computing pixel values on a prescribed output sampling grid from the pixel values of the input video defined on a different input sampling grid.
A first approach is to compute each output video image at a time t from the input video image at t, using FIR filtering to perform a spatial interpolation that computes samples on the output grid from the input sample values. When the input image is interlaced, it is sampled much below the Shannon sampling rate, and interpolations thus produce artifacts that are highly visible in the resulting video. Directional interpolations can partly reduce these artifacts, but not completely, because the information necessary to perform the interpolation is not available. If the input image is progressive, increasing the sampling density also produces a blurred image. U.S. Pat. No. 6,614,489 provides an example of this approach.
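By way of illustration only, the simplest form of such a spatial interpolation can be sketched as follows. The sketch assumes a single field stored as a list of rows and fills each missing line with the average of its two vertical neighbours, which is a 2-tap FIR filter. The function name `deinterlace_spatial` and the `parity` convention are hypothetical and are not taken from the cited patent.

```python
def deinterlace_spatial(field, parity):
    """Build a progressive frame from one field by vertical interpolation.

    field  : list of rows (lists of numbers), the lines carried by the field
    parity : 0 if the field carries the even frame lines, 1 if the odd ones
    (hypothetical interface, chosen for this sketch only)
    """
    height = 2 * len(field)
    frame = [None] * height
    # Copy the known lines to their positions in the progressive frame.
    for i, row in enumerate(field):
        frame[2 * i + parity] = list(row)
    # Fill each missing line with the average of the lines above and below
    # (2-tap FIR filter [1/2, 1/2]); at a border, repeat the only neighbour.
    for y in range(1 - parity, height, 2):
        above = frame[y - 1] if y - 1 >= 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is None:
            frame[y] = list(below)
        elif below is None:
            frame[y] = list(above)
        else:
            frame[y] = [(a + b) / 2 for a, b in zip(above, below)]
    return frame
```

Because each missing line is derived only from lines of the same field, no detail finer than the field line spacing can be recovered, which is consistent with the artifacts described above.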
A second approach, called “motion adaptive,” performs a time interpolation if no motion is locally detected in the image, or a spatial interpolation if a significant motion is detected. In the presence of significant motion, these techniques implement a spatial interpolation and thus produce the same type of artifacts as spatial interpolation techniques. U.S. Pat. No. 5,428,398 provides an example of this approach.
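A minimal sketch of a motion adaptive deinterlacer is given below for illustration. It assumes three fields of equal size: the current field, the previous field of opposite parity (which carries the missing lines), and the field two steps earlier of the same parity. Motion is detected per pixel by differencing the two same-parity fields. The function name, the argument names and the default `threshold` are assumptions made for this sketch and do not reproduce the method of the cited patent.

```python
def deinterlace_motion_adaptive(cur, prev_opposite, prev_same, parity,
                                threshold=10):
    """Per-pixel choice between temporal and spatial interpolation.

    cur           : current field (lines of parity `parity`)
    prev_opposite : previous field, opposite parity (holds the missing lines)
    prev_same     : field two steps earlier, same parity as `cur`
    All names and the threshold value are hypothetical.
    """
    n = len(cur)
    width = len(cur[0])
    frame = [None] * (2 * n)
    for j in range(n):
        # Known line, copied from the current field.
        frame[2 * j + parity] = list(cur[j])
        # Current-field lines just above and below the missing line,
        # clamped at the top and bottom borders.
        up = min(max(j - parity, 0), n - 1)
        down = min(max(j + 1 - parity, 0), n - 1)
        row = []
        for x in range(width):
            # Motion measure: largest change between same-parity fields.
            motion = max(abs(cur[up][x] - prev_same[up][x]),
                         abs(cur[down][x] - prev_same[down][x]))
            if motion <= threshold:
                # No motion detected: time interpolation (reuse old pixel).
                row.append(prev_opposite[j][x])
            else:
                # Motion detected: spatial interpolation (vertical average).
                row.append((cur[up][x] + cur[down][x]) / 2)
        frame[2 * j + 1 - parity] = row
    return frame
```

In moving regions the output is identical to that of a purely spatial interpolation, so the artifacts noted above reappear there.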
A third approach, called “motion compensated,” measures the displacement in the image and performs an interpolation in time that takes into account the measured motion. By combining the information of several images in time, such techniques can improve the spatial resolution of the image after scaling and restore finer details. This is called super-resolution processing. Yet, existing motion compensated techniques are unstable and introduce visible artifacts. Indeed, computing precisely the displacement of image pixels is difficult, especially with interlaced images, which are undersampled. Motion discontinuities at occlusions, fine scale motions such as those produced by dust, and motion transparencies produce motion estimation errors that yield artifacts when these motion vectors are used for time interpolation. To reduce these artifacts, motion compensated scaling procedures perform a spatial interpolation when the confidence in the measured motion is not sufficient. The reduction of motion artifacts is then obtained at the cost of introducing the artifacts of the spatial scaling. U.S. Pat. No. 6,940,557 provides an example of this approach.
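The confidence-based fallback described above can be illustrated by the following sketch, which operates on single image lines. It assumes that a previous image is already available in progressive form, estimates one integer horizontal shift per line by exhaustive search of the mean absolute difference, and reverts to a vertical average when the matching error is too large. The names `estimate_shift` and `interpolate_missing_line` and the parameters `max_err` and `max_shift` are hypothetical, and the sketch is far simpler than the motion estimators used in practice.

```python
def estimate_shift(ref, cur, max_shift=2):
    """Return (shift, err) where cur[x] best matches ref[x - shift].

    err is the mean absolute difference over the overlapping samples.
    Assumes max_shift is smaller than the line length.
    """
    n = len(cur)
    best_shift, best_err = 0, float("inf")
    for d in range(-max_shift, max_shift + 1):
        # Use only positions where both cur[x] and ref[x - d] exist.
        xs = [x for x in range(n) if 0 <= x - d < n]
        err = sum(abs(cur[x] - ref[x - d]) for x in xs) / len(xs)
        if err < best_err:
            best_shift, best_err = d, err
    return best_shift, best_err


def interpolate_missing_line(above, below, prev_above, prev_missing,
                             max_err=4.0, max_shift=2):
    """Interpolate a missing line between `above` and `below`.

    prev_above   : the line at the position of `above` in the previous image
    prev_missing : the line at the missing position in the previous image
    max_err      : hypothetical confidence threshold on the matching error
    """
    shift, err = estimate_shift(prev_above, above, max_shift)
    n = len(above)
    if err <= max_err:
        # Confident motion: time interpolation along the measured shift,
        # clamping the source index at the line borders.
        return [prev_missing[min(max(x - shift, 0), n - 1)]
                for x in range(n)]
    # Low confidence: fall back to spatial interpolation.
    return [(a + b) / 2 for a, b in zip(above, below)]
```

When the match is reliable, the missing line inherits detail from the previous image; otherwise the result is the same vertical average as in a purely spatial scheme, which illustrates the trade-off noted above.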
None of the state of the art procedures for video scaling takes full advantage of the space-time information provided by an input video sequence in order to perform a super-resolution scaling of high quality. Therefore, it is an object of the present invention to obviate or mitigate at least some of the above-mentioned disadvantages in converting video sequences with a scaling procedure.