1. Field of the Invention
This invention relates to the frame interpolation of a moving image, namely to a method and device for interpolating frames intermediate between transmitted frames. The invention finds particular, but not exclusive, application to the transmission of moving images (e.g., in videophone systems).
2. Related Art
In the communication of a moving image at low bit rates for purposes such as videophone services, the necessity of making considerable cuts in the amount of information to be transmitted has frequently led to the adoption of techniques in which the transmitted frame rate is reduced. However, when the frame rate is low, intermediate frames must be generated to provide a continuous image at the receiver. The intermediate frames can be obtained by repeating frames or, preferably, by interpolating from the received frames. This is illustrated in FIG. 8. A transmitter side transmits image frames Ta and Tb, which are two temporally separated images of a moving input image. Frames Ta and Tb are received at a receiver side. An intermediate frame Ti is found by interpolating between frames Ta and Tb in order to produce a moving output image comprising three frames Ta, Ti and Tb instead of simply the two frames Ta and Tb.
When it is necessary to interpolate many frames between received frames it can be difficult to reproduce smooth movement at the receiver side. One known approach to solving this problem is to use motion vectors detected from the frames Ta and Tb, in the FIG. 8 example, to interpolate the missing frames at the receiver side.
In order to carry out such a method of frame interpolation of a moving image without giving rise to blurring or jerkiness, it is necessary to detect motion vectors which are visually correct. Methods which have been proposed for detecting motion vectors include:
(i) the method which takes the motion vector as the displacement giving the smallest predicted error power between frames; and
(ii) the gradient method which uses the temporal and spatial gradients of the pixel values.

a) adjusting the interpolation 3-D motion vector (Vab) to obtain an interpolation 3-D motion vector (Vi);
b) forming an interpolation 3-D shape model (Mi) from the interpolation 3-D motion vector (Vi) and either the first or the second 3-D shape model (Ma, Mb); and
c) forming an interpolation image from the interpolation 3-D shape model and the image shaping values from the first and second 3-D shape models.

a) a 3-D motion vector adjustment means (6) for adjusting the interpolation 3-D motion vector (Vab) to obtain an interpolation 3-D motion vector (Vi);
b) a first shape model transformation means (10) for forming an interpolation 3-D shape model (Mi) from the interpolation 3-D motion vector (Vi) and either the first or the second 3-D shape model (Ma, Mb); and
c) a frame interpolation means for forming an interpolation image from the interpolation 3-D shape model and the image shading values from the first and second 3-D shape models.

(i) causing the 3-dimensional motion vector between the previous frame and the current frame to operate upon the 3-dimensional shape model of the previous frame; and
(ii) causing the information on element shape changes between the previous frame and the current frame, said information having been transmitted from the transmitter side, to operate upon the latter stage of the aforementioned 3-dimensional shape model transformation process which transforms the 3-dimensional shape model of the previous frame.

(i) causing the 3-dimensional motion vector between the previous frame and the interpolated frame, which motion vector has been obtained in the 3-dimensional motion vector adjustment process, to operate upon the 3-dimensional shape model of the previous frame; and
(ii) causing the information on element shape changes between the previous frame and the interpolated frame, which information has been obtained in the aforementioned shape change information adjustment process, to operate upon the latter stage of the aforementioned 3-dimensional shape model transformation process which transforms the 3-dimensional shape model of the previous frame.
In these methods, the detection range is restricted to a 2-dimensional plane. Moreover, the unit of detection is frequently comparatively small, for example a rectangular block of 8 pixels × 8 lines. An explanation will now be given of such a frame interpolation system based on block units, the assumed application being to videophones.
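As an illustration of method (i) applied to block units, the following is a minimal sketch of an exhaustive block search that picks the displacement with the smallest error power. It is not taken from the cited prior art: the function names, the use of a sum of squared differences as the "error power", the search range, and the representation of a frame as a list of pixel rows are all assumptions made for the example. The vector is measured from a block in the previous frame to its best match in the current frame.

```python
def block_error_power(prev, curr, bx, by, dx, dy, size):
    """Sum of squared differences between the block of `prev` at (bx, by)
    and the block of `curr` displaced from it by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            diff = prev[by + y][bx + x] - curr[by + y + dy][bx + x + dx]
            total += diff * diff
    return total


def find_motion_vector(prev, curr, bx, by, size=8, search=2):
    """Exhaustively search displacements within +/- `search` pixels and
    return the (dx, dy) giving the smallest error power (hypothetical helper)."""
    height, width = len(curr), len(curr[0])
    best = (0, 0)
    best_err = block_error_power(prev, curr, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that would fall outside the current frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            err = block_error_power(prev, curr, bx, by, dx, dy, size)
            if err < best_err:
                best, best_err = (dx, dy), err
    return best
```

Because each block is searched independently, neighbouring blocks belonging to the same body can receive different vectors, and the search itself never leaves the 2-dimensional image plane.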
Referring to FIGS. 9 and 10, a frame interpolation method is shown which is based on motion vector detection in rectangular block units. In FIG. 9, encoded data is received at a decoding part 2, the decoded frame being passed to a first frame memory 3 after the previously decoded frame stored in this first frame memory 3 has been moved into a second frame memory 4. Motion vectors are also received at the receiver and, together with the frames in frame memories 3 and 4, are then used to interpolate an intermediate frame Ti, as will now be described with reference to FIG. 10.
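The handling of the two frame memories can be sketched as follows. This is only an illustrative model of the data flow described for FIG. 9; the class name `FrameStore` and its methods are hypothetical and do not appear in the source.

```python
class FrameStore:
    """Two frame memories: `first` holds the most recently decoded frame
    (Tb) and `second` holds the previously decoded frame (Ta)."""

    def __init__(self):
        self.first = None
        self.second = None

    def push(self, decoded_frame):
        # Move the previously decoded frame into the second memory
        # before the newly decoded frame is written to the first memory.
        self.second = self.first
        self.first = decoded_frame

    def ready(self):
        # Interpolation needs both a previous and a current frame.
        return self.first is not None and self.second is not None
```

After two frames have been pushed, `second` and `first` hold Ta and Tb respectively, which is the pair of frames between which Ti is interpolated.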
In FIG. 10, Ta is the frame which has been transmitted from the transmitter side at the immediately preceding point in time, Tb is the frame at the current point in time, and Ti is a frame which is to be interpolated between transmitted frames Ta and Tb. Consider that the frame to be interpolated is temporally spaced between the frames Ta and Tb in the ratio a:(1-a), as shown in FIG. 10.
The frames Ta and Tb are considered to be composed of blocks B'(I) and B"(I), respectively, with a set of motion vectors Vab denoting the displacements of blocks B'(I) between frames Ta and Tb. This is illustrated in FIG. 10.
A block B(I) of the interpolated frame Ti is obtained by the interpolation part 5 of FIG. 9 by means of the following equation:

B(I) = a × B'(I) + (1 - a) × B"(I)

where blocks B'(I) and B"(I) are linked by the motion vector Vab associated with the block B'(I).
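The weighted sum in this equation can be sketched per pixel as follows. This is a minimal illustration under stated assumptions, not the implementation of interpolation part 5: the function name, the argument layout, and the representation of frames as lists of pixel rows are hypothetical. The sketch only computes the pixel values of B(I); where the block is positioned within frame Ti is not specified by the equation and is left out.

```python
def interpolate_block(frame_a, frame_b, bx, by, vector, a, size=8):
    """Return block B(I) = a * B'(I) + (1 - a) * B"(I) as a list of rows.

    B'(I) is the block of frame_a (Ta) at (bx, by); B"(I) is the block of
    frame_b (Tb) reached by displacing B'(I) by the motion vector (dx, dy)."""
    dx, dy = vector
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            pixel_a = frame_a[by + y][bx + x]
            pixel_b = frame_b[by + y + dy][bx + x + dx]
            row.append(a * pixel_a + (1 - a) * pixel_b)
        block.append(row)
    return block
```

For example, with every pixel of B'(I) equal to 10, every pixel of B"(I) equal to 30, and a = 0.25, each interpolated pixel is 0.25 × 10 + 0.75 × 30 = 25.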
A detailed description of a specific method may be found in an article by Wada: Masahiro WADA, "System for motion-compensated frame interpolation of colour moving image signals", Denshi Joho Tsushin Gakkai Ronbunshi (B-I), Vol. J72-B-I, No. 5, pp. 446-455.
U.S. Pat. No. 4,672,442, published on 9th Jun. 1987, describes a moving picture frame rate conversion system which converts a first picture signal with a first frame rate into a second picture signal with a second, different frame rate by generating an interpolation frame between two consecutive frames of the first picture signal. In this method, the interpolation frame is generated by using a first picture block on a first frame and a second picture block on a second frame of the first picture signal, the second picture block being displaced in position relative to the first picture block.
The prior art which has been presented is a frame interpolation method for a moving image which is based on the detection, in block units, of motion vectors in a 2-dimensional plane. However, the following problems are encountered with such a method. In conventional frame interpolation methods, the unit of motion vector detection is a block rather than a subject. Accordingly, block-shaped distortions will occur in interpolated frames if the motion vector detection in respect of one and the same body gives uneven results. Moreover, because motion vector detection is restricted to a 2-dimensional plane, it is difficult to achieve accurate detection of the motion vectors of a body which essentially includes movement in 3-dimensional space. For this reason, it is impossible to reproduce smoothly-moving images which correspond to the input images in such circumstances by the prior art methods of interpolation.
An article by Forchheimer and Kronander, titled "Image Coding--From Wave Forms to Animation", IEEE Acoustics, Speech, and Signal Processing Magazine, vol. 37, no. 12, 30th Dec. 1989, pages 2003-2023, describes a method of coding face images in which a wire frame model of a face is transmitted and the surface face image is synthesized assuming a matte surface and one light source. Parameters are identified which indicate how the wire frame model is to move; at the receiver, these parameters determine how the receiver's wire frame model of the face is to move in order to provide a moving synthesized face image.