Motion estimation is important in its own right in the broad area of computer vision, and it is central to almost every aspect of image sequence processing. Interframe motion information allows the development of algorithms that exploit the redundancies that naturally exist among the frames of an image sequence. Both the difficulties inherent in the motion estimation problem and the development of novel strategies for using interframe motion information in various processing tasks pose challenges in the field of image sequence processing.
Image sequence processing is concerned with problems such as interframe motion estimation, temporal frame interpolation, noise filtering, restoration and data compression. The present invention is concerned with two of these problems in particular: motion estimation and noise filtering.
The importance of reducing the noise in image sequences is growing with the increasing use of video and television systems in numerous scientific, commercial and consumer-oriented applications. A human observer can potentially obtain more information from an image sequence when the noise is reduced. In cases where the noise is not visually perceivable, reduction of noise increases the efficiency of subsequent processing that may be applied to the image sequence, such as data compression.
There are two major temporal-domain approaches to image sequence filtering: (1) the motion-compensation approach, and (2) the motion-detection approach. In motion-compensated filtering, a motion estimation algorithm is first applied to the noisy image sequence to estimate the motion trajectories, i.e., the locations of pixels (or subpixels) that correspond to each other over a predetermined number of contiguous image frames. Then, the value of a particular pixel in a certain frame is estimated using the image sequence values that lie on the motion trajectory passing through that pixel. The estimation is performed using either an infinite impulse response (IIR) or a finite impulse response (FIR) filter structure.
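By way of illustration only, the FIR form of motion-compensated filtering can be sketched as a weighted sum of the samples lying on one motion trajectory. The function name `filter_along_trajectory`, the representation of a trajectory as (frame, row, column) triples, and the default of equal weights are assumptions made for this sketch; the trajectory itself is taken as given, since the motion estimation step is not shown.

```python
from typing import List, Optional, Sequence, Tuple

Frame = List[List[float]]  # one frame as rows of pixel values (assumed layout)


def filter_along_trajectory(
    frames: Sequence[Frame],
    trajectory: Sequence[Tuple[int, int, int]],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Hypothetical FIR estimate of a pixel from samples on its trajectory.

    trajectory holds (frame_index, row, col) triples, assumed to come from
    a separate motion estimation step that is not part of this sketch.
    """
    # Gather the image values that lie on the motion trajectory.
    samples = [frames[k][r][c] for (k, r, c) in trajectory]
    if weights is None:
        # Equal weights reduce the FIR filter to direct averaging.
        weights = [1.0 / len(samples)] * len(samples)
    if len(weights) != len(samples):
        raise ValueError("one weight is required per trajectory sample")
    return sum(w * s for w, s in zip(weights, samples))


# Three 1x3 frames in which a noisy pixel moves one column per frame.
frames = [
    [[10.0, 0.0, 0.0]],
    [[0.0, 12.0, 0.0]],
    [[0.0, 0.0, 11.0]],
]
trajectory = [(0, 0, 0), (1, 0, 1), (2, 0, 2)]
estimate = filter_along_trajectory(frames, trajectory)  # mean of 10, 12, 11
```

With equal weights the estimate is simply the average of the trajectory samples; unequal weights would give a general FIR structure over the same samples.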
In contrast, methods based on motion detection do not attempt to estimate the interframe motion. Instead, direct differences of pixel values at identical spatial locations of two adjacent frames are computed to detect the presence of interframe motion. An estimate of the pixel value at a certain location of the present frame is determined by applying an FIR or an IIR filter structure to pixels at identical spatial locations of a predetermined number of past and/or future frames. The filter coefficients are functions of the "motion-detection signal" which is defined as the difference between the pixel value of interest at the present frame and the pixel value at the same location of the previous frame. Certain IIR filter structures for temporal filtering on the basis of motion detection have been proposed in the prior art, as well as a variety of other motion-detection based filtering methods.
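A minimal sketch of one possible motion-detection filter of the IIR type follows. It operates on the values of a single pixel location over successive frames and switches the recursive filter coefficient according to the motion-detection signal as defined above. The function name, the hard threshold, and the particular values of `threshold` and `static_gain` are hypothetical choices for illustration and are not taken from any specific prior-art method.

```python
from typing import List, Sequence


def motion_detection_iir(
    pixel_series: Sequence[float],
    threshold: float = 20.0,    # hypothetical motion threshold
    static_gain: float = 0.25,  # hypothetical gain used when no motion is detected
) -> List[float]:
    """Recursive temporal filter at one fixed spatial location."""
    filtered = [pixel_series[0]]
    for k in range(1, len(pixel_series)):
        current = pixel_series[k]
        # Motion-detection signal: difference between the present frame and
        # the previous frame at the same spatial location.
        detection = abs(current - pixel_series[k - 1])
        # Large difference: assume motion and pass the pixel through unfiltered.
        # Small difference: assume a static pixel and smooth recursively.
        gain = 1.0 if detection >= threshold else static_gain
        filtered.append(filtered[-1] + gain * (current - filtered[-1]))
    return filtered


series = [100.0, 104.0, 100.0, 200.0, 204.0]
result = motion_detection_iir(series)
# result == [100.0, 101.0, 100.75, 200.0, 201.0]
```

In this example the small fluctuations around 100 are smoothed, while the jump to 200 is treated as motion and passed through, so that no trailing artifact is introduced at the cost of leaving that sample unfiltered.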
Generally speaking, the performance of these two approaches is determined by the filter structure, the dependence of the filter structure on the motion-detection signal (in the case of the motion-detection approach), and the performance of the motion estimation algorithm (in the case of the motion-compensation approach). Motion-compensated filtering methods tend to be more complex because they require interframe motion estimation. On the other hand, they are potentially more effective than those based on motion detection because they make use of the interframe motion information. In practice, however, the success of a motion-compensated method is strongly dependent on the success of motion estimation.
In an ideal setting, where the scene contents remain unchanged from one frame to another and the motion estimation algorithm is not affected by noise, direct averaging of image values over motion trajectories provides effective noise reduction. In fact, under the assumption of independent white Gaussian noise, the average is the maximum likelihood estimate of the pixel value. In practice, however, scene contents change from one frame to another, e.g., due to camera panning and the existence of covered/uncovered regions. As a result, image values over an estimated motion trajectory may not necessarily correspond to the same image structure, and direct averaging may result in oversmoothing of image details. Therefore, the noise filtering algorithm should be temporally adaptive. At one extreme, when the motion estimation is accurate, it should approach direct averaging. At the other extreme, when the motion estimation is inaccurate, it should not perform any filtering. Moreover, the motion estimation method should be able to provide good estimates in the presence of noise as well as in the case of varying scenes in order to allow for effective noise reduction.
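The maximum likelihood property of the average can be verified with a short derivation. The notation is introduced here for illustration only: g_k denotes the observed value on the trajectory at frame k, f the unchanged true pixel value, n_k the noise, sigma_n^2 the noise variance, and N the number of frames on the trajectory.

```latex
\begin{align}
g_k &= f + n_k, \qquad n_k \sim \mathcal{N}(0, \sigma_n^2) \ \text{independent}, \qquad k = 1, \dots, N \\
\ln p(g_1, \dots, g_N \mid f) &= -\frac{N}{2} \ln\!\left(2\pi\sigma_n^2\right)
    - \frac{1}{2\sigma_n^2} \sum_{k=1}^{N} (g_k - f)^2 \\
\frac{\partial}{\partial f} \ln p(g_1, \dots, g_N \mid f) &= \frac{1}{\sigma_n^2} \sum_{k=1}^{N} (g_k - f) = 0
    \quad \Longrightarrow \quad \hat{f}_{\mathrm{ML}} = \frac{1}{N} \sum_{k=1}^{N} g_k \\
\operatorname{Var}\!\left(\hat{f}_{\mathrm{ML}}\right) &= \frac{\sigma_n^2}{N}
\end{align}
```

Under these assumptions the noise variance is reduced by a factor of N. The result holds only while every g_k on the trajectory observes the same f, which is the condition that fails when the scene contents change or the trajectory is estimated inaccurately.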
The adaptivity requirement outlined above is satisfied when the local linear minimum mean square error ("LMMSE") point estimator derived by Kuan et al. and by Lee is applied in the temporal direction along the motion trajectories (see: D. T. Kuan et al., "Adaptive Noise Smoothing Filter for Images with Signal-Dependent Noise", IEEE Trans. Pattern Anal. Machine Intell., PAMI-7, pp. 165-177, March 1985; and Lee, "Digital Image Enhancement and Noise Filtering by Use of Local Statistics", IEEE Trans. Pattern Anal. Machine Intell., PAMI-2, pp. 165-168, March 1980).
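The adaptive behavior can be illustrated with a sketch of the commonly cited local-statistics form of the LMMSE point estimator, applied to the samples on one motion trajectory. The function name, the use of the sample mean and sample variance of the trajectory as the local statistics, and the assumption of a known additive noise variance are choices made for this sketch; they are not a reproduction of the exact formulations in the cited papers.

```python
from typing import Sequence


def temporal_lmmse(
    samples: Sequence[float],  # noisy values along one motion trajectory
    center_index: int,         # position of the pixel being estimated
    noise_variance: float,     # assumed known variance of the additive noise
) -> float:
    """Hypothetical temporal LMMSE point estimate using local statistics."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    # Signal variance is the observed variance in excess of the noise variance.
    signal_variance = max(variance - noise_variance, 0.0)
    denominator = signal_variance + noise_variance
    gain = signal_variance / denominator if denominator > 0.0 else 1.0
    # gain near 0: estimate approaches the trajectory average.
    # gain near 1: estimate approaches the unfiltered noisy value.
    return mean + gain * (samples[center_index] - mean)


# Accurate trajectory: sample variance (2) is below the noise variance (4),
# so the gain is 0 and the estimate equals the average.
consistent = temporal_lmmse([10.0, 12.0, 11.0, 9.0, 13.0], 2, 4.0)

# Inaccurate trajectory: the samples do not share one image structure,
# so the gain is close to 1 and the center value is left almost unchanged.
inconsistent = temporal_lmmse([10.0, 12.0, 200.0, 9.0, 13.0], 2, 4.0)
```

The two calls correspond to the two extremes described above: direct averaging when the trajectory samples agree to within the noise level, and essentially no filtering when they do not.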
It was suggested by Martinez et al. ("Implicit Motion Compensated Noise Reduction of Motion Video Scenes", Proc. ICASSP, pp. 375-378, Tampa, Fla. 1985) to apply the adaptive LMMSE point estimator in the temporal direction. Due to the lack of motion estimators that are robust in the presence of noise, however, Martinez et al. used a cascade of five LMMSE estimators over a set of five hypothesized motion trajectories for each pixel, without estimating the actual motion. This approach can be regarded as a motion-detection approach rather than a motion-compensated approach since interframe motion is not estimated. Motion detection along a hypothesized trajectory is implicit in the adaptive nature of the estimator. Due to the adaptive nature of the estimator, filtering is effective only along the trajectory that is close to the actual one. This approach has been reported to be successful in cases where the hypothesized motion trajectories are close to the actual ones.