1. Field of the Invention
The present invention is generally related to image processing, and more particularly, to correlation of multiple images to detect image motion.
2. Background Information
One aspect of image processing involves estimating the motion between successive images for, among other reasons, creating synthetic images that allow for conversion from one frame rate to another. Such techniques are applied to both progressive images and interlaced images (e.g., progressive or interlaced television images). Progressive images constitute complete images, for which complete pixel information is available. In a typical video signal, wherein frames of image data are broken into even and odd fields, a progressive image would be a composite of the even and odd fields associated with a particular frame. Progressive images can be contrasted with interlaced video images, which contain sequential fields that are spatially and temporally nonaligned (such as the even and odd fields associated with a typical video signal). Asynchronous frame rate conversion involves generating synthesized frames of interpolated image information.
With regard to interlaced images, an additional problem exists due to the spatial and temporal offset which exists from one image to the next. For example, a single video frame is constituted by two interlaced fields of information separated in space (e.g., by one line) and separated in time (e.g., by one half of a frame time). One field includes the odd numbered scan lines of an image, while the next successive field includes the spatially offset even numbered scan lines. The temporal separation between these successive fields causes an additional spatial offset of the first video field with respect to the second video field due to image motion. To perform digital image processing on interlaced video images, the interlacing must be removed so that image processing algorithms can operate on an entire, coherent frame of video information. Accordingly, it is desirable to remove the interlacing and spatially align the two video fields into a coherent video frame.
Techniques for detecting image motion among sequential video images generally fall into two categories: (1) phase correlation techniques using, for example, fast Fourier transforms; and (2) block matching. Phase correlation is described in a document entitled "The Engineer's Guide To Motion Compensation" by John Watkinson, 1994: Snell and Wilcox Ltd., the disclosure of which is hereby incorporated by reference in its entirety. As described therein, phase correlation involves spectral analysis of two successive fields, and then subtracting all phases of the spectral components. A reverse transform is applied to the phase differences to identify peaks whose positions correspond to motions (i.e., motion vectors) between the fields. Because phase correlation does not identify the pixel locations associated with the motion, a block matching process is required to select, from the motion vectors detected by the phase correlation, the best motion vector for correlating motion in the image to pixel locations.
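The phase correlation steps just described (spectral analysis, phase subtraction, reverse transform, peak search) can be illustrated with a minimal one-dimensional sketch. It is not taken from the cited document: the function names `dft` and `phase_correlate_1d` are hypothetical, a naive discrete Fourier transform stands in for a fast Fourier transform to keep the example short, and a single scan line stands in for a two-dimensional field.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_correlate_1d(a, b):
    """Estimate the circular shift d such that b[t] is approximately a[t - d]."""
    spec_a = dft(a)
    spec_b = dft(b)
    cross = []
    for fa, fb in zip(spec_a, spec_b):
        p = fb * fa.conjugate()          # phase of p = phase(fb) - phase(fa)
        mag = abs(p)
        # Keep only the phase difference; discard components with no energy.
        cross.append(p / mag if mag > 1e-12 else 0j)
    # The reverse transform of the phase differences yields the correlation surface.
    surface = [v.real for v in dft(cross, inverse=True)]
    n = len(surface)
    peak = max(range(n), key=surface.__getitem__)
    # Map peak positions in the upper half of the range to negative shifts.
    return peak if peak <= n // 2 else peak - n


line_a = [0, 1, 3, 7, 2, 0, 0, 0]
line_b = [0, 0, 0, 1, 3, 7, 2, 0]   # line_a moved two samples to the right
shift = phase_correlate_1d(line_a, line_b)
```

Note that the result is a single displacement for the whole line; nothing in it says which samples moved, which is why a subsequent block matching step is needed.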
A block matching technique is described in U.S. Pat. No. 5,016,102 (Avis), the disclosure of which is hereby incorporated by reference in its entirety. In this patent, image motion is detected by comparing blocks in a first field or frame of a video signal with blocks in a following field or frame of the video signal to derive motion vectors. Block-based motion estimation typically involves operating on blocks of image pixels of 8×8 or 4×4 pixels per block. These estimators attempt to correlate blocks from a first field with blocks from a second field to measure interfield motion. A correlation surface representing the differences in content between the block in the first field or frame and the content of each block in the following field or frame with which it has been compared is then produced. The correlation surface is examined to determine whether a clear minimum value of the differences exists, as a representation of the motion vector, associated with image data included in the block. The motion vectors thus represent the motion of content in respective blocks of the first field or frame with respect to the following field or frame.
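By way of illustration only, a correlation surface of the kind described above can be sketched as follows. The sketch is not taken from the Avis patent: the function name `sad_surface`, the use of a sum of absolute differences as the difference measure, and the 2×2 block with a search span of one pixel are all assumptions chosen to keep the example small.

```python
def sad_surface(ref, nxt, top, left, block=2, search=1):
    """Build a correlation surface for one block of `ref` against `nxt`.

    Each entry maps a candidate displacement (dy, dx) to the sum of absolute
    differences between the reference block and the displaced block in `nxt`.
    The caller must keep the block plus the search span inside both images.
    """
    surface = {}
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            total = 0
            for y in range(block):
                for x in range(block):
                    total += abs(ref[top + y][left + x]
                                 - nxt[top + y + dy][left + x + dx])
            surface[(dy, dx)] = total
    return surface


# Synthetic 6x6 image with a unique value per pixel, and a copy of it
# moved one pixel down and one pixel right (with wrap-around at the edges).
size = 6
first = [[10 * y + x for x in range(size)] for y in range(size)]
second = [[first[(y - 1) % size][(x - 1) % size] for x in range(size)]
          for y in range(size)]

surface = sad_surface(first, second, top=2, left=2)
motion_vector = min(surface, key=surface.get)   # displacement with the minimum difference
```

In this synthetic case the surface has an unambiguous minimum of zero at the true displacement. Real imagery rarely produces so clear a minimum, which is the limitation discussed next.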
However, these known techniques are unable to achieve adequate correlations when operating on detailed imagery with complex motion. These techniques cannot definitively produce correlation vectors for all sequential images with perfect accuracy. Rather, there is room for improvement in these techniques.
The present invention is directed to improving the accuracy with which video images are processed to detect image motion among sequential images (e.g., progressive non-interlaced images of a video signal and/or interlaced fields of a video signal), each image being represented using a plurality of pixels. Exemplary embodiments replace block-based motion estimation with pixel-based motion estimation. A correlation surface is generated for every pixel in a reference image, from which a motion vector and confidence metric (i.e., a measure of confidence in the accuracy of the motion vector) are extracted for each pixel.
In accordance with exemplary embodiments, first information obtained from pixels used to represent a first image is compared with respect to second information obtained from pixels used to represent a second image, to produce a correlation surface representative of image motion. A motion vector can be extracted from correlation data included in the correlation surface as a measure of image motion among the at least two images. Subsequently, a measure of confidence in the accuracy with which the motion vector has been generated is produced, the measure being determined using a first point (e.g., best) on the correlation surface and a second point (e.g., second best) on the correlation surface, the second correlation point being spaced at least a predetermined distance from the first point on the correlation surface. Pixel correspondence and the span of a search area in the image over which a sample block is moved are variable parameters of the motion estimation which can be set as a function of the particular application and computational capability of the system used.
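The relationship between the best point, the second-best point, and the predetermined spacing can be illustrated with a short sketch. The text above does not give a formula for the confidence measure, so the following are assumptions made purely for illustration: the function name `vector_and_confidence`, a difference-based surface in which lower values indicate better matches, Chebyshev distance as the spacing measure, and the ratio-based score `1 - best / second_best`.

```python
def vector_and_confidence(surface, min_distance=2):
    """Return (motion_vector, confidence) from a difference-based surface.

    `surface` maps displacements (dy, dx) to difference values (lower is
    better). The second-best point is searched only among points spaced at
    least `min_distance` (Chebyshev distance, an assumption) from the best
    point, so that immediate neighbours of the minimum do not count as rivals.
    """
    best = min(surface, key=surface.get)
    rivals = [value for (dy, dx), value in surface.items()
              if max(abs(dy - best[0]), abs(dx - best[1])) >= min_distance]
    if not rivals:
        return best, 0.0              # no sufficiently distant point to compare against
    second_best = min(rivals)
    if second_best == 0:
        return best, 0.0              # a distant point matches equally well: ambiguous
    # Hypothetical score: near 1 when the best point is far better than its rival.
    return best, 1.0 - surface[best] / second_best


# Synthetic surface over a 5x5 search span with its minimum at (1, 0).
example = {(dy, dx): 10 * (abs(dy - 1) + abs(dx)) + 2
           for dy in range(-2, 3) for dx in range(-2, 3)}
vector, confidence = vector_and_confidence(example)
```

Excluding the neighbourhood of the minimum matters because points adjacent to a true match usually score nearly as well as the match itself; comparing against them would understate confidence in an otherwise unambiguous vector.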
When processing interlaced fields of image data, estimation of motion is complicated by the fact that consecutive fields are spatially nonaligned. Accordingly, for interlaced video fields, the present invention computes correlation surfaces for each pixel using two different methods, and the results are then combined to provide a single correlation surface for each pixel, from which a motion vector and associated confidence metric can be produced for each pixel. The first method, intraframe correlation, is a measure of motion between a first reference field and a temporally spaced second field (e.g., an odd field and the next successive even field). First information obtained from pixels of the first field is compared with second information obtained from pixels of a second field, the second field being temporally separated and spatially nonaligned with respect to the first field, to generate a first correlation surface. The second method, interframe correlation, detects motion between the first field and a third field that is temporally spaced two fields (i.e., one frame) from the first field. That is, the first information is compared with third information obtained from pixels of a third field, the third field being temporally separated and spatially aligned with the first field, to generate a second correlation surface. The first and second correlation surfaces for each pixel are combined into a composite correlation surface, and a motion vector is extracted from correlation data of the composite correlation surface as a measure of image motion among at least two images. A confidence metric can be produced for each motion vector, in a manner similar to that described with respect to progressive video images, as a measure of confidence in the accuracy with which the motion vector has been produced.
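The combination of the two per-pixel correlation surfaces can be sketched as follows. The manner of combination is not specified above, so this sketch assumes a simple weighted sum; the function name `combine_surfaces` and the weight `w_first` are hypothetical. It also assumes both surfaces are already indexed by the same candidate displacements, although in practice the second surface spans twice the time interval of the first and its displacements may need to be rescaled beforehand.

```python
def combine_surfaces(first, second, w_first=0.5):
    """Combine two difference-based surfaces into a composite surface.

    `first` and `second` map candidate displacements (dy, dx) to difference
    values. Only displacements present in both surfaces are kept. The
    weighted sum is an illustrative assumption, not a prescribed method.
    """
    return {d: w_first * first[d] + (1.0 - w_first) * second[d]
            for d in first if d in second}


# First surface: reference field against the next, spatially nonaligned field.
first_surface = {(0, 0): 4, (0, 1): 2, (1, 0): 9}
# Second surface: reference field against the spatially aligned field one frame later.
second_surface = {(0, 0): 8, (0, 1): 0, (1, 0): 7}

composite = combine_surfaces(first_surface, second_surface)
motion_vector = min(composite, key=composite.get)   # extracted from the composite surface
```

A confidence measure would then be derived from the composite surface in the same manner as for a surface obtained from progressive images.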