This invention relates to video image processing, and more particularly to the assigning of motion vectors indicating the direction and magnitude of apparent movement to different regions of the image, to assist in the generation of desired output images.
Our United Kingdom Patent No. GB-B-2,188,510 and BBC Research Department Report RD 1987/11 describe a method of processing a video image so as to provide a list of motion vectors which are applicable over the whole area and one of which may be regarded as applying to each region of the image. Other methods are also possible for generating such a list of motion vectors. To use such vectors it is then necessary to select which of these vectors may apply to given regions of the picture. Each region may be as small as a picture element (pixel) or it may comprise a plurality of picture elements or a block of the picture.
The motion vectors may be used, for example, to generate output fields which correspond to an instant in time lying between the times of two input fields. This may be required, for instance, in producing slow motion effects, in transferring video images to or from film, or in standards conversion.
One of the most challenging applications of motion compensation is to generate slow-motion sequences without the jerky motion that results from simply repeating each image a number of times. Knowledge of the motion vector of each object in the image allows new images corresponding to any time instant to be generated showing the objects correctly positioned. The use of such a technique in conjunction with a shuttered CCD camera should allow sharp smoothly-moving pictures to be generated with a quality approaching that obtainable from a high frame-rate camera, without the operational problems that the use of such a camera would entail.
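As an illustration only, the following minimal sketch shows how a known motion vector lets an object be placed at any instant between two input fields. It assumes uniform motion between the fields, and the function names and the integer slow-motion `factor` are hypothetical, not taken from the method described here.

```python
def interpolated_position(position, vector, fraction):
    """Position of an object at a fractional instant between two input fields.

    `position` is the (x, y) location in the earlier field, `vector` is the
    (dx, dy) displacement per field period, and `fraction` lies in [0, 1).
    Uniform motion over the field period is an assumption of this sketch.
    """
    x, y = position
    dx, dy = vector
    return (x + fraction * dx, y + fraction * dy)


def slow_motion_instants(num_input_fields, factor):
    """Yield (input_field_index, fraction) for each output field.

    Each input field period is divided into `factor` output instants, so the
    sequence plays back `factor` times slower than the original.
    """
    for k in range((num_input_fields - 1) * factor + 1):
        yield k // factor, (k % factor) / factor
```

With a factor of 4, three new output fields fall between each pair of input fields, at fractions 0.25, 0.5 and 0.75 of the field period, instead of the same input field being repeated four times.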
A typical image may be regarded in its simplest form as having a moving foreground region and a background region, as illustrated somewhat diagrammatically in FIG. 1 of the drawings. At (a) is shown one field of an image comprising a foreground object, such as a ball, in front of a background. At (b) is shown the next field of the image. The ball will have moved from position A to position B. Looking at image (b), part of the background which was seen in (a) is now obscured, and part of the background which was not seen in (a) is now revealed or uncovered.
In general the background may also be "moving" in the image if, for example, the camera is being panned. Thus motion vectors will be associated with both the foreground and the background. The appropriate motion vector is chosen in each case from the list of possible motion vectors (which has been produced e.g. as described in our aforementioned patent) by comparing the two successive fields and looking at the movement which has taken place over different regions of the image.
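The selection of a vector from the list by comparing two successive fields can be sketched roughly as follows. This is a minimal illustration and not the procedure of the aforementioned patent. It assumes that fields are 2-D lists of luminance values, that the match error is a sum of absolute differences over a small window, and that the names `match_error` and `assign_vectors` are hypothetical.

```python
def match_error(prev_field, next_field, x, y, vector, radius=1):
    """Sum of absolute differences over a small window displaced by `vector`."""
    dx, dy = vector
    height, width = len(prev_field), len(prev_field[0])
    total = 0
    for j in range(y - radius, y + radius + 1):
        for i in range(x - radius, x + radius + 1):
            # Clamp coordinates so windows near the edges stay inside the field.
            sj = min(max(j, 0), height - 1)
            si = min(max(i, 0), width - 1)
            tj = min(max(j + dy, 0), height - 1)
            ti = min(max(i + dx, 0), width - 1)
            total += abs(prev_field[sj][si] - next_field[tj][ti])
    return total


def assign_vectors(prev_field, next_field, candidates, radius=1):
    """For each pixel of prev_field, pick the candidate with the lowest error.

    Ties go to the candidate that appears first in `candidates`.
    """
    height, width = len(prev_field), len(prev_field[0])
    return [
        [
            min(
                candidates,
                key=lambda v: match_error(prev_field, next_field, x, y, v, radius),
            )
            for x in range(width)
        ]
        for y in range(height)
    ]
```

A pixel inside a moving object is given the object's vector when the displaced window matches in both fields, and a pixel in stationary, textured background is given the zero vector.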
This operation will provide accurate information over most of the picture area. Thus the background which is not covered by either image position of the ball, A or B, can be compared between the two images. Also the overlap region covered by both position A and position B of the ball can be compared to provide an appropriate vector. However, in both the area of obscured background and the area of revealed or uncovered background, one of the fields contains an image of the ball and the other an image of the background. These cannot be meaningfully correlated.
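The four kinds of region distinguished above can be illustrated along a single scan line. The sketch below is only illustrative: it assumes the object occupies a contiguous run of `size` pixels starting at `start_a` in field (a) and at `start_b` in field (b), and all names are hypothetical.

```python
def classify_regions(width, start_a, start_b, size):
    """Label each position along a scan line from the object's two positions."""
    labels = []
    for x in range(width):
        in_a = start_a <= x < start_a + size  # covered by the object in field (a)
        in_b = start_b <= x < start_b + size  # covered by the object in field (b)
        if in_a and in_b:
            labels.append("overlap")      # object in both fields: comparable
        elif in_a:
            labels.append("revealed")     # object in (a), background in (b)
        elif in_b:
            labels.append("obscured")     # background in (a), object in (b)
        else:
            labels.append("background")   # background in both fields: comparable
    return labels
```

Only the positions labelled "overlap" and "background" show the same picture content in both fields. The "revealed" and "obscured" positions are those where a two-field comparison has nothing to correlate.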
Many motion estimation algorithms have been proposed, see e.g. Proceedings of the International Zurich Seminar on Digital Communications, March 1984, pages D2.1-D2.5, Bergmann, H. C., "Motion-adaptive frame interpolation". Most cannot detect motion in such regions, or can detect such regions only when the background is stationary. Of those that can detect the existence of such a region, we do not believe that any can determine the direction or magnitude of its motion at the time that it disappears from view or when it reappears.
European Patent Application No. EP-A-0 395 264, published Oct. 31st, 1990, describes equipment for converting an 1125/60/2:1 HDTV signal into a 24 Hz progressive (non-interlaced) format for recording onto film. It uses a motion estimation technique comprising a two-stage algorithm: the first stage correlates (by block matching) relatively large areas of the image to determine a list of possible motion vectors, and this is followed by an assignment process which allocates vectors to individual pixels. In this assignment process, vectors are assigned to pixels in existing input fields, and this information then has to be converted to refer to pixels in desired output fields. Three input frames are used in the comparison.