1. Field of the Invention
The invention relates to motion estimation on sub-sampled (interlaced) video data for, e.g., scanning format conversion.
The analysis of motion in a video sequence is embedded in many advanced video processing techniques, such as advanced coding, noise reduction, and scan rate conversions such as de-interlacing and Frame Rate Conversion (FRC). In coding applications, motion information is used to minimize the prediction error and, consequently, the bit rate. Temporal noise filtering is very effective, but requires Motion Compensation (MC) if objects in the image move. De-interlacing, which is a prerequisite for high-quality Motion Estimation (ME) and vertical scaling of interlaced video, requires information from multiple fields, which therefore need motion compensation. In the case of scan rate conversion, motion information is needed for temporal interpolation that eliminates jerkiness of the motion portrayal and motion blur.
2. Description of the Related Art
Different applications may demand different types of motion estimators. In the case of coding, motion estimation is based on minimizing the difference between image parts, and coding the resulting error. The resulting motion vectors do not necessarily reflect the true motion within the video sequence. The motion estimation can be optimized to find the smallest error, resulting in the most efficient coding. For scan rate conversion applications, temporal interpolation is realized in the direction of the motion. Consequently, it is important that the motion vectors describe the true motion within the video sequence. It is therefore not sufficient to demand the smallest difference between image parts in the motion estimation. The motion behavior in the environment is also an important factor.
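The coding-oriented criterion described above, i.e., choosing the vector that minimizes the difference between image parts, can be illustrated with a minimal sketch. It assumes a sum of absolute differences (SAD) as the difference measure and an exhaustive search over a small window; neither choice is prescribed by the text, and the names `sad` and `best_vector` are hypothetical. The vector (dx, dy) is assumed to point from the block in the current image to its match in the previous image.

```python
def sad(cur, prev, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in cur
    and the block displaced by (dx, dy) in prev."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def best_vector(cur, prev, bx, by, size, search):
    """Exhaustive search: return (error, dx, dy) with the smallest SAD.
    Candidates that would read outside prev are skipped."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            err = sad(cur, prev, bx, by, dx, dy, size)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best


def blank(height, width):
    return [[0] * width for _ in range(height)]


# Illustrative data (assumption): a 2x2 patch at column 2, row 2 in the
# previous image appears at column 3, row 4 in the current image.
prev_img = blank(8, 8)
cur_img = blank(8, 8)
patch = [[1, 2], [3, 4]]
for y in range(2):
    for x in range(2):
        prev_img[2 + y][2 + x] = patch[y][x]
        cur_img[4 + y][3 + x] = patch[y][x]

result = best_vector(cur_img, prev_img, bx=3, by=4, size=2, search=2)
```

In this constructed example the smallest-error vector coincides with the actual displacement. In real picture material, several candidates may yield similarly small errors, which is why the smallest error alone does not guarantee a true-motion vector.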
Motion estimators designed for scan rate conversion can be used for coding purposes, but not immediately vice versa. The motion estimation described in this paper is therefore not restricted to scan rate conversion, but is designed to find the true-motion vectors.
Most video data is available in a so-called interlaced format, i.e., a format in which the odd scan lines in odd fields and the even scan lines in even fields together constitute a frame that describes an image, but the odd and even fields do not describe the image at the same temporal instant. Motion in video data is preferably measured on the shortest available time interval, i.e., the field interval in the case of interlaced data. Due to vertical detail in the picture (causing alias in the sub-sampled field grid), however, it may be impossible to correctly find the vertical displacement between two consecutive fields. The alias pattern and the original detail may move differently.
Consider a stationary horizontal white line, two scanning lines wide, available in an interlaced video format. Odd and even fields will both show a single white line. Given only the information in two successive fields, it is ambiguous whether we are dealing with a white line one scanning line wide moving with a vertical velocity of 1 sample/field-period, or with a stationary horizontal white line two scanning lines wide.
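The ambiguity in this example can be checked with a small sketch. It models one image column as a list of line values (1 = white, 0 = black), and assumes an 8-line frame, the white line positioned at frame lines 3 and 4, and an odd field sampled first. All of these, as well as the helper names, are illustrative assumptions.

```python
HEIGHT = 8  # number of frame lines (assumption for the example)


def frame_with_white_lines(height, white_rows):
    """One image column: 1 on the given frame lines, 0 elsewhere."""
    return [1 if y in white_rows else 0 for y in range(height)]


def sample_field(frame, parity):
    """Keep only the frame lines with the given parity (0 = even, 1 = odd),
    indexed by their frame line number."""
    return {y: frame[y] for y in range(len(frame)) if y % 2 == parity}


def interlace(frames, first_parity=1):
    """Sample successive progressive frames into fields of alternating parity."""
    return [sample_field(f, (first_parity + t) % 2) for t, f in enumerate(frames)]


# Scenario A: stationary white line, two scanning lines wide (lines 3 and 4).
stationary = [frame_with_white_lines(HEIGHT, {3, 4}),
              frame_with_white_lines(HEIGHT, {3, 4})]

# Scenario B: white line one scanning line wide, moving down by
# 1 frame line per field period (line 3, then line 4).
moving = [frame_with_white_lines(HEIGHT, {3}),
          frame_with_white_lines(HEIGHT, {4})]

fields_a = interlace(stationary)
fields_b = interlace(moving)

# Both scenarios produce exactly the same two fields.
ambiguous = fields_a == fields_b
```

Because the two field pairs are identical sample for sample, no estimator that sees only these two fields can distinguish the two scenarios.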
A common way to deal with the mentioned problem is to first up-convert at least one of the two fields into a progressive format, and to perform the motion estimation between two frames or between a frame and a field [see Ref. 2]. An alternative is to estimate the motion using data from three successive fields [see Ref. 3]. Finally, it is possible to hope that the low-frequency content of the image is dominant, correct the phase for those frequencies, and estimate motion more or less neglecting the problem [see Ref. 1].
A characteristic that Ref. 2 and Ref. 3 share is that they double the access to the previous image data (from just the previous field to the previously de-interlaced field, or to the previous and the pre-previous field, respectively). In addition, the solution proposed in Ref. 3 introduces a new constraint in the motion estimator, in that the motion over a two-field period has to be assumed constant. The option of Ref. 1 does not double the memory access (or the memory capacity) and introduces no constant-motion constraint, but cannot solve the ambiguity either, as experiments on critical picture material show.