The invention can be applied in all fields of imaging and image processing. The following can be mentioned as non-restrictive examples:
- video encoding and/or decoding, especially in order to improve the performance of an MPEG-4 or H.263+ type encoder/decoder or any other video encoder/decoder;
- medical imaging;
- the conversion of standards, such as the PAL/NTSC conversion for example.
The invention can be applied especially, but not exclusively, to the temporal interpolation of frames and/or to the bi-directional encoding of video sequences, with a view especially to overcoming the problems of bandwidth saturation and adaptability to terminals of varied types.
Indeed, one of the main goals of operators in the video field is now to enable a user to view moving sequences of satisfactory quality, especially sequences in which the successive images follow one another fluidly. Now, the fluidity of video sequences depends heavily on the saturation of the transmission networks and on the capacity (especially the storage capacity) of the display terminals available to the user.
In order to meet the requests of the users, and overcome the problems of bandwidth saturation, it is therefore necessary to produce video streams in a wide range of bit rates. However, the video sequences transmitted at low or medium bit rates generally appear to be jerky when they are viewed.
The approach envisaged, therefore, was to implement frame frequency interpolation during the decoding of the transmitted bit streams, and/or a storage of additional information during encoding in bi-directional mode, in order to generate video sequences of greater fluidity as a function of the display terminal, the transmission network and/or a request from a user.
Various techniques of frequency interpolation, relying on the estimation of a motion field, have been envisaged but, to date, none of them performs satisfactorily.
It may be recalled that the motion between two images of the sequence can be defined as a 2D vector characterizing the difference of positions between a pixel of one image and the homologous pixel in the other image, it being assumed that these pixels correspond to the same physical point in each of the two images. The assignment of a displacement vector to each pixel in the image therefore defines a dense motion field in the image. This motion field is also called an optical flow.
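By way of illustration only (not part of the claimed invention), such a dense motion field can be sketched as one 2D displacement vector stored per pixel; the array layout and helper function below are hypothetical:

```python
import numpy as np

# A dense motion field (optical flow) assigns one 2D displacement vector
# (dy, dx) to every pixel: flow[y, x] = (dy, dx) means that pixel (y, x)
# of one image and pixel (y + dy, x + dx) of the other image are assumed
# to show the same physical point.
h, w = 4, 5
flow = np.zeros((h, w, 2))
flow[..., 1] = 1.0  # example: uniform 1-pixel motion to the right

def homologous_position(y, x, flow):
    """Position of the homologous pixel in the other image."""
    dy, dx = flow[y, x]
    return y + dy, x + dx
```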
The methods of motion field estimation used for the interpolation of frame frequency are of two types:
- enumerative methods, also called matching methods;
- differential methods.
The enumerative methods rely on the subdivision of the domain of an image into finite-sized regions by regular or matched partitioning. Then, a motion model is defined. This model may be, for example, affine, translational, polynomial, etc. and the procedure involves successive iterations in which a set of distinct values, included in a predetermined interval, is applied to the parameters of this model.
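A minimal sketch of an enumerative (matching) method, assuming a translational model and an exhaustive search over a predetermined interval of integer displacements; the function name and the sum-of-absolute-differences criterion are illustrative choices, not the invention's own method:

```python
import numpy as np

def block_match(block, ref, top, left, search=2):
    """Enumerative (matching) estimation for one block: every candidate
    displacement (dy, dx) in a predetermined interval is tried in turn,
    and the one minimizing the matching error (here the sum of absolute
    differences) is kept. Translational model, exhaustive search."""
    h, w = block.shape
    best_err, best_vec = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate falls outside the reference image
            err = np.abs(ref[y:y + h, x:x + w] - block).sum()
            if err < best_err:
                best_err, best_vec = err, (dy, dx)
    return best_vec
```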
The differential methods, for their part, consist in estimating a dense motion field by assigning a motion vector to each pixel of the image considered. They implement, firstly, methods of differential computation and, secondly, optimization techniques relying for example on gradient descents. These methods include especially methods based on the equation of the optical flow, pel-recursive methods (pel standing for "picture element"), parametric methods and robust estimators, for example.
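As a hedged illustration of a differential method, the sketch below writes the optical-flow equation Ix·u + Iy·v + It = 0 at every pixel of a patch and solves for a single displacement (u, v) in the least-squares sense; this is a classic differential technique, shown here only to fix ideas:

```python
import numpy as np

def optical_flow_patch(I0, I1):
    """Differential estimation on a patch: the optical-flow equation
    Ix*u + Iy*v + It = 0 is written at every pixel and solved for the
    single displacement (u, v) in the least-squares sense."""
    Ix = np.gradient(I0.astype(float), axis=1)  # horizontal spatial gradient
    Iy = np.gradient(I0.astype(float), axis=0)  # vertical spatial gradient
    It = I1.astype(float) - I0.astype(float)    # temporal gradient
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v
```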
Much research work has been undertaken in order to improve existing techniques of image interpolation. Of this work, reference may be made to the method proposed by Ciro CAFFORIO and Fabio ROCCA ("Motion Compensation Image Interpolation", IEEE Transactions on Communications, Vol. 38, No. 2, February 1990), consisting of the use of a pel-recursive estimator for motion compensation.
It may be recalled that pel-recursive methods proceed by prediction-correction of the motion vectors associated with each of the chosen pixels of an image, in a predetermined path direction. The prediction may rely on the value of the motion vector of the previous pixel, or on a linear combination of the motion vectors of the pixels neighboring the pixel considered. The correction is then based on a minimization of the DFD (Displacement Frame Difference) by a method of gradient descent.
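The prediction-correction principle can be sketched in one dimension, assuming linear interpolation for fractional displacements; the function names, step size and iteration count below are arbitrary illustrative values, not those of any published estimator:

```python
import numpy as np

def dfd(I0, I1, x, d):
    """Displaced frame difference at pixel x of I0 for displacement d
    (1D signal, linear interpolation for fractional displacements)."""
    i = int(np.floor(x + d))
    a = (x + d) - i
    return (1 - a) * I1[i] + a * I1[i + 1] - I0[x]

def pel_correct(I0, I1, x, d_pred, eps=0.05, n_iter=200):
    """Correction step of a pel-recursive scheme: starting from the
    predicted vector d_pred, descend the gradient of the squared DFD."""
    d = d_pred
    for _ in range(n_iter):
        i = int(np.floor(x + d))
        grad = I1[i + 1] - I1[i]  # spatial gradient of I1 at x + d
        d -= eps * dfd(I0, I1, x, d) * grad
    return d
```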
One drawback of this prior art technique is that the algorithm implemented by CAFFORIO and ROCCA suffers from high interaction between the recursive directions and the direction of the displacement to be estimated.
To overcome this effect, the use of four distinct recursive directions was envisaged. Thus, the disturbing effects were minimized. However, the performance of this technique remains inadequate.
Jack NIEWEGLOWSKI (in “Motion Compensated Video Interpolation Using Digital Image Warping”, IEEE ICASSP'94) has proposed a method relying on the use of a geometrical conversion, combined with an improved technique known as “block matching”.
A method of this kind can be used to take account of changes in scale but has the drawback of low robustness owing to the decorrelation between geometrical transformation and block matching.
Q. WANG and L. J. CLARK ("Motion Estimation and Compensation for Image Sequence Coding", Signal Processing: Image Communication, Vol. 4, No. 2, April 1992, pp. 161-174) subsequently envisaged the implementation of a parametric algorithm, relying on the modeling of the motion of any region of an image by an a priori defined model.
One drawback of this prior art technique is that it is not suited to the estimation of the regions of heavy motion.
To overcome the different problems encountered by these motion estimation techniques, it has also been proposed to implement hierarchical motion estimation algorithms, such as those described by L. Böröcsky and P. Csillag ("Multiresolution Motion Field Estimation for Wavelet Video Coding and Frame Interpolation", Final Workshop, Vigo, Spain, October 1994, pp. 163-166).
Hierarchical motion estimation algorithms of this kind, like the prior art motion estimation methods described here above, come up against the problem of the detection of the occlusion zones (zones of crowding, concealment or uncovering of the mesh) within the images of the video sequence considered. Indeed, when different planes and objects overlap in a scene, zones of occlusion appear, generating lines of discontinuity within the mesh associated with the image.
The technique most commonly used to manage the appearance and/or the disappearance of occlusion zones within an image is ray tracing. Such a technique consists in associating, with each of the constituent pixels of the image, a label pertaining to the motion vector associated with this pixel.
Thus a pixel receiving at least two motion vectors carries the label “occlusion”, which means that it covers at least two pixels present in the preceding image of the sequence. A pixel receiving a single motion vector carries the label “normal”, thus indicating that it is present in the preceding image and in the current image. Finally, a pixel receiving no motion vector carries the label “gap”, so as to indicate that it belongs to an uncovering zone of the image.
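This labeling rule can be sketched as follows, assuming integer displacements; the function and label names are hypothetical:

```python
import numpy as np

def occlusion_labels(flow, shape):
    """Count how many motion vectors of the preceding image land on each
    pixel of the current image, then label: 0 hits -> 'gap' (uncovering),
    1 hit -> 'normal', 2 or more -> 'occlusion'. Integer displacements."""
    hits = np.zeros(shape, dtype=int)
    for y in range(flow.shape[0]):
        for x in range(flow.shape[1]):
            dy, dx = flow[y, x]
            ty, tx = y + int(dy), x + int(dx)
            if 0 <= ty < shape[0] and 0 <= tx < shape[1]:
                hits[ty, tx] += 1
    return np.where(hits == 0, "gap",
                    np.where(hits == 1, "normal", "occlusion"))
```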
This kind of technique is described especially in J. Ribas-Corbera and J. Sklansky, "Interframe Interpolation of Cinematic Sequences", Journal of Visual Communication and Image Representation, Vol. 4, No. 4, December 1993, pp. 392-406, and in B. Chupeau and P. Salmon, "Motion Compensating Interpolation for Improved Slow Motion", Thomson-CSF/Laboratoires electroniques de Rennes, France.
However, this prior art technique exploits the label information only to make an a posteriori correction of the interpolated images, and does not take it into account during the motion estimation that enables the construction of such images.
The invention is aimed especially at overcoming these drawbacks of the prior art.
More specifically, it is a goal of the invention to provide a technique for the constructing of an image interpolated between a preceding image and a following image in a moving sequence, to improve the impression of fluidity during the display of the sequence.
Another goal of the invention is to implement a technique for the constructing of an interpolated image that is simple and costs little to implement.
Yet another goal of the invention is to provide a technique for the constructing of an interpolated image that is robust and adapted to the regions of heavy motion.
It is also a goal of the invention to implement a technique for the constructing of an image interpolated within a video sequence enabling the management of the appearance and/or the disappearance of objects that correspond to occlusion zones (reversal zones, uncovering zones, crowding zones) within a mesh associated with the images of the sequence.
It is yet another goal of the invention to provide a technique for the constructing of an interpolated image, enabling a user with a terminal of limited storage capacity to view a fluid video sequence.
It is yet another goal of the invention to implement a technique for the constructing of an interpolated image, enabling a user to view a fluid video sequence, despite a saturation of the bandwidth of the network by which it is transmitted.
It is also a goal of the invention to provide a technique for the constructing of an interpolated image that can be integrated into any video encoder/decoder in order to improve its performance, especially in terms of visual fluidity.