The present invention is directed to a method for the determination of motion vector fields from digital image sequences, in which a motion vector field is calculated from two successive images, said motion vector field relating every picture element of an image to a picture element of the other image, whereby the relation is respectively defined by a motion vector that reproduces the relative shift of the picture elements relative to one another, and whereby all picture elements in a square or rectangular block of picture elements receive the same motion vector.
It is necessary for various applications, for example, image data compression or machine vision (e.g. robots and automated scene analysis), to automatically acquire the shifts of the image contents from image frame to image frame in a digital image sequence that result from object movements or from camera movements. These shifts of the local image contents can be represented by motion vector fields that indicate, for example, for every picture element of an image, by how much the image content has shifted at this location in comparison to the preceding image frame.
In image data compression for the purpose of transmitting digital images at low data rates, for example, the motion vector fields can be used to predict the next image frame, which has not yet been transmitted, from image frames that have already been transmitted. The better this prediction, the lower the data rate required for the transmission of the new image frame.
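The prediction step described above can be illustrated with a minimal sketch. It assumes gray-value frames stored as lists of rows, one integer shift (dy, dx) per square block, and clamping at the image border; the function names `predict_frame` and `residual_energy` are hypothetical and are not taken from the present specification.

```python
def predict_frame(prev, vectors, block=2):
    """Predict the next frame by copying displaced picture elements from `prev`.

    prev    : list of rows of gray values (the frame already transmitted)
    vectors : vectors[by][bx] = (dy, dx), one shift per block (assumed layout)
    block   : block edge length in picture elements (square blocks assumed)
    """
    h, w = len(prev), len(prev[0])
    pred = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # All picture elements of a block share the same motion vector.
            dy, dx = vectors[y // block][x // block]
            # Content now at (y, x) is taken from (y - dy, x - dx) in the
            # preceding frame; coordinates are clamped at the image border.
            sy = min(max(y - dy, 0), h - 1)
            sx = min(max(x - dx, 0), w - 1)
            pred[y][x] = prev[sy][sx]
    return pred


def residual_energy(actual, pred):
    """Sum of squared differences between the actual frame and a prediction."""
    return sum((a - p) ** 2
               for row_a, row_p in zip(actual, pred)
               for a, p in zip(row_a, row_p))
```

In this sketch, a smaller residual energy stands in for a lower data rate: only the difference between the actual frame and the prediction would remain to be transmitted.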
A further application of the motion vector fields is the reconstruction of missing image frames from an image sequence that was temporally subsampled for the purpose of data compression. For example, only every third image frame of the sequence may be available, and the two missing image frames are to be interpolated as "correct in motion" as possible between two respectively existing image frames (the "point of reference images"), so that the motion of subjects in the reconstructed scene is executed as uniformly as in the original. Motion vector fields are required for this purpose; they indicate which picture elements in the two appertaining reference images are to be used for the reconstruction of every picture element of an image frame to be interpolated.
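A minimal sketch of such a motion-correct interpolation follows. It assumes that the motion vector field is given on the grid of the frame to be interpolated, that the motion between the two reference images is linear in time, and that the two fetched gray values are blended according to temporal distance. The name `interpolate_frame` and the parameter `t` (temporal position between the reference images, 0 < t < 1) are hypothetical.

```python
def interpolate_frame(a, b, vectors, t, block=2):
    """Interpolate a frame at temporal position t between reference images a and b.

    a, b    : reference images as lists of rows of gray values
    vectors : vectors[by][bx] = (dy, dx), shift of the content from a to b
    t       : 1/3 or 2/3 when only every third frame is available
    """
    h, w = len(a), len(a[0])

    def clamp(v, hi):
        return min(max(v, 0), hi)

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = vectors[y // block][x // block]
            # Follow the motion vector backward into a and forward into b,
            # rounding to a whole picture element position.
            ya = clamp(round(y - t * dy), h - 1)
            xa = clamp(round(x - t * dx), w - 1)
            yb = clamp(round(y + (1 - t) * dy), h - 1)
            xb = clamp(round(x + (1 - t) * dx), w - 1)
            # The temporally nearer reference image receives the larger weight.
            out[y][x] = (1 - t) * a[ya][xa] + t * b[yb][xb]
    return out
```

With a shift of three picture elements between the reference images, calling the function with t = 1/3 and t = 2/3 places the moving content one and two picture elements along its path, so that the motion proceeds uniformly.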
In every instance, a motion vector that describes the local motion with two components, namely the horizontal and the vertical motion component, is allocated in the motion vector field to every picture element of an image frame or to a respective group of neighboring picture elements.
One problem in the determination of such motion vector fields arises because the movements present in an image frame sequence usually depend on the location of the picture elements, so that a plurality of different motion vectors can occur in a small picture detail, particularly at the edges of moving subjects. Ideally, only the picture element itself should be considered when determining the motion vector for a specific picture element. On the other hand, a motion vector cannot be determined from a single picture element, because the motion vector contains two components and every individual picture element defines only one equation for these two unknowns, cf., for example, B. K. P. Horn, B. G. Schunck, "Determining Optical Flow", Artificial Intelligence 17, Pages 185-203, 1981. Even in a small environment or surround around the picture element, however, the image content is often structured to such a slight degree that the motion at the location of the appertaining picture element cannot be unambiguously identified. This produces the difficulty that, first, only small environments or surrounds should be used for the calculation of a motion vector in regions having motion vectors that are highly dependent on location and, second, large environments or surrounds are required in regions having image contents that are not clearly structured, in order to be able to unambiguously recognize the motion. It is therefore necessary to vary the size of the respective environments, and an assumption of a defined smoothness of the motion vector field must also be utilized in order to obtain motion vectors usable for the above applications even for grainy, noise-infested image frames and for picture details that have little differentiation.
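The ambiguity described above can be made concrete with a small sketch. In the formulation of Horn and Schunck, each picture element supplies one linear equation gx*u + gy*v + gt = 0 for the two unknown motion components (u, v), where gx, gy and gt are the horizontal, vertical and temporal gray-value derivatives. The sketch below collects these equations over an environment in a least-squares sense, which is an assumed illustration and not the method of the present invention, and checks whether the resulting 2x2 system has a unique solution. The function names and the threshold `eps` are hypothetical.

```python
def normal_matrix(gradients):
    """Entries (a, b, c) of the symmetric 2x2 matrix [[a, b], [b, c]] obtained
    by summing the outer products of the spatial gradients (gx, gy)."""
    a = sum(gx * gx for gx, gy in gradients)
    b = sum(gx * gy for gx, gy in gradients)
    c = sum(gy * gy for gx, gy in gradients)
    return a, b, c


def motion_is_determinable(gradients, eps=1e-9):
    """True if the environment constrains both motion components.

    The 2x2 system is uniquely solvable only if its determinant is
    clearly different from zero (eps is an assumed threshold).
    """
    a, b, c = normal_matrix(gradients)
    return a * c - b * b > eps
```

A single picture element, or an environment containing only one straight edge, yields a vanishing determinant. An environment with gradients in two different directions, such as a corner, yields a solvable system, which is why larger environments are needed where the image content has little structure.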
Essentially three different approaches to motion vector estimation have been previously investigated, cf., for example, H. G. Musmann, P. Pirsch, H.-J. Gallert, "Advances in Picture Coding", Proc. IEEE 73 (1985) 4, Pages 523-548, namely,
(1) Block matching method,
(2) Differential method,
(3) Methods that work with distinctive points.
The operations of these methods shall be set forth briefly below for the case in which the shift of the picture contents in comparison to the predecessor picture (Picture A) is to be identified for a picture (for example Picture B of a picture sequence).