Two-to-one interlaced scanning, in which the spatial sampling grid is offset vertically by half the vertical sample pitch on alternate temporal samples, is a very common method of reducing the bandwidth of television images. Now that more modern (often transform-based) data compression methods are available and electron-beam-scanned displays are less common, the use of interlaced scanning is becoming less attractive.
De-interlacing can form part of many video processes; by de-interlacing at the input to a process, that process can be made easier. Examples include standards conversion and re-scaling.
It is therefore frequently necessary to convert an interlaced image sequence to a progressively scanned sequence, so that the converted images occur at the temporal sampling rate (i.e. the field rate) of the input sequence and all of them have the same spatial sampling structure. Each converted output image of the sequence thus has twice as many vertical samples (scanning lines) as each original input image, and pixel data values are available for all vertical sample positions at all temporal sample points.
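The structure of the conversion can be illustrated by a minimal sketch that uses only intra-field vertical interpolation. It is not a method described in this document. The function name `deinterlace_intra_field`, the list-of-rows field representation and the `parity` convention are assumptions made for the illustration.

```python
def deinterlace_intra_field(field, parity):
    """Expand one field (a list of rows) into a frame with twice as many rows.

    Assumed convention: parity 0 means the field holds frame lines 0, 2, 4, ...
    and parity 1 means it holds frame lines 1, 3, 5, ...
    Each missing line is the average of the field lines directly above and
    below it; at the top or bottom edge the single neighbour is repeated.
    """
    n = len(field)
    frame = [None] * (2 * n)

    # Copy the original field lines to their positions in the output frame.
    for i, row in enumerate(field):
        frame[2 * i + parity] = list(row)

    # Fill the lines of the opposite parity.
    for y in range(1 - parity, 2 * n, 2):
        above = frame[y - 1] if y - 1 >= 0 else None
        below = frame[y + 1] if y + 1 < 2 * n else None
        if above is None:
            frame[y] = list(below)
        elif below is None:
            frame[y] = list(above)
        else:
            frame[y] = [(a + b) / 2 for a, b in zip(above, below)]
    return frame
```

Applying such a function to every input field gives one output image per field, each with twice as many lines as its input, which is the output structure described above. Purely spatial interpolation of this kind takes no account of motion or of adjacent fields.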
This conversion is a spatio-temporal interpolation process and is generally known as “de-interlacing”. It has been found that the best subjective results are obtained by de-interlacing systems which take into account the motion of portrayed objects. This can be done either by motion adaptation, in which movement is detected and the interpolation is changed as a result, or by motion compensation, in which the change in position of objects between consecutive fields is measured and used to “compensate” the positions of pixels in one field to the positions that the objects they portray would occupy in a different field. These positional changes are commonly described by two-dimensional “motion vectors”, and in typical motion compensated processes one or more vectors are associated with each image pixel.
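The motion adaptation approach can be illustrated by the following minimal sketch. It is one possible arrangement chosen for illustration and is not taken from this document. The function name, the `threshold` parameter and the linear cross-fade between temporal and spatial interpolation are all assumptions. The previous and next fields are assumed to have the opposite parity to the current field, so that they contain samples at the line positions missing from the current field.

```python
def motion_adaptive_deinterlace(prev_field, cur_field, next_field,
                                parity, threshold=16.0):
    """Build a full frame at the time of cur_field.

    parity 0: cur_field holds frame lines 0, 2, 4, ... (assumed convention).
    parity 1: cur_field holds frame lines 1, 3, 5, ...
    prev_field[i] and next_field[i] are assumed to hold the frame line that
    is missing from cur_field at index i.
    """
    n = len(cur_field)
    width = len(cur_field[0])
    frame = [None] * (2 * n)

    # Original lines of the current field are passed through unchanged.
    for i in range(n):
        frame[2 * i + parity] = list(cur_field[i])

    for i in range(n):
        y = 2 * i + (1 - parity)  # frame line missing from the current field
        # Current-field lines directly above and below y, clamped at edges.
        above = cur_field[max(i - parity, 0)]
        below = cur_field[min(i + 1 - parity, n - 1)]
        row = []
        for x in range(width):
            spatial = (above[x] + below[x]) / 2
            temporal = (prev_field[i][x] + next_field[i][x]) / 2
            # Crude motion detector: inter-field difference at this pixel.
            motion = abs(prev_field[i][x] - next_field[i][x])
            k = min(motion / threshold, 1.0)  # 0 = static, 1 = moving
            row.append(k * spatial + (1 - k) * temporal)
        frame[y] = row
    return frame
```

In static areas the sketch uses temporal interpolation, which preserves vertical detail. Where the detector reports movement it falls back to vertical interpolation within the current field. A motion compensated system would instead shift pixels from adjacent fields along measured motion vectors before combining them.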
Several methods of motion vector derivation are known in the art; the most common are phase correlation and block matching. In phase correlation, blocks of contiguous pixels are transformed into the spatial frequency domain and the phases of the spatial frequency components are correlated between co-located blocks in consecutive fields. In block matching, the values of blocks of contiguous pixels are compared with similar, but spatially shifted, blocks of pixels in adjacent fields so as to find the shift vector which gives the best match of the pixel values. Typically the match error is evaluated as a sum of the magnitudes of pixel value differences over a block of pixels.
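Block matching with a sum-of-absolute-differences match error can be sketched as follows. This is a minimal exhaustive-search illustration under stated assumptions, not a definitive implementation. The names `sad` and `best_vector`, the square block, and the `search` range parameter are hypothetical, and both images are assumed to lie on the same sampling grid, which ignores the vertical offset between fields of opposite parity.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute pixel differences between the block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` shifted by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def best_vector(cur, ref, bx, by, size, search):
    """Try every shift within +/- `search` pixels and return (dx, dy, error)
    for the shift with the lowest match error. Shifts that would move the
    block outside `ref` are skipped; on ties the first shift tried wins."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height and
                      0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue
            err = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best[1], best[2], best[0]
```

For example, if the content of `cur` is the content of `ref` displaced so that `cur[y][x]` equals `ref[y + 1][x + 2]`, the search for an interior block returns the vector `(2, 1)` with a match error of zero, provided no other shift in the search range matches equally well.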