The present invention relates to a method of coding motion information associated to a video sequence divided into successive frames, comprising the steps of:
subdividing the current frame into bidimensional blocks;
for each current block of said current frame, selecting in a previous frame, by means of a block-matching algorithm, a shifted block as the prediction of said current block, the motion vector between said shifted and current blocks being the predicted vector associated to said current block and all the motion vectors similarly predicted for a whole current frame constituting a motion vector field associated to said current frame;
for each current frame, coding by means of a differential encoding technique, including for each motion vector to be coded a predictor associated to it, the motion information constituted by said associated motion vector field.
The invention also relates to a corresponding encoding device, to a method of decoding motion information coded according to this coding method, and to a corresponding decoding device. In the detailed description of one implementation of the invention, that will be given later, the bidimensional blocks are for instance macroblocks, as defined in the standards of the MPEG family.
The coding schemes proposed for digital video compression generally use motion estimation and compensation to reduce the temporal redundancy between the successive frames of the processed video sequence. In such methods, a set of motion vectors is determined at the encoding side and transmitted to the decoder. Most video coding standards use for the motion estimation operation the so-called block matching algorithm (BMA), described for example in the document “MPEG video coding: a basic tutorial introduction”, S. R. Ely, BBC Research and Development Report, 1996. Said algorithm, depicted in FIG. 1, tries to find, for each block Bc of a current image It, the block Br of a previous reference image It−1 that best matches it, said previous block being searched only in a limited area of this previous image (the search window SW) around the position of the block Bc. The set of motion vectors thus determined in the encoder for each block Bc of the current frame must be sent to the decoder.
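By way of illustration only, the block-matching search described above may be sketched as follows. The sum of absolute differences as matching criterion, the exhaustive scan of the search window, the sign convention of the returned vector (displacement from the position of Bc to the position of Br) and all function names are assumptions made for this sketch; they are not taken from the cited document.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n current block whose
    top-left pixel is (bx, by) and the reference block shifted by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, bx, by, n, search):
    """Exhaustive search in a window of +/- search pixels around (bx, by).
    Returns the displacement (dx, dy) with the lowest SAD."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall partly outside the reference image.
            if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


# Toy images: the current image is the reference shifted by (2, 1).
ref = [[8 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
print(block_match(cur, ref, 2, 2, 2, 2))  # (2, 1)
```

Restricting the scan to the search window bounds the cost per block at (2·search + 1)² SAD evaluations.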
In order to minimize the bitrate needed to transmit the motion vectors, these vectors are generally differentially encoded with reference to previously determined motion vectors (or predictors). More precisely, the encoding of the motion vectors describing the motion from previous blocks Br to current ones Bc is carried out by means of a predictive technique based on previously transmitted spatial neighbours. The motion vectors are differenced with respect to a prediction value and coded using variable length codes.
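As a minimal sketch of this differential scheme, the example below subtracts a predictor from a motion vector and codes each residual component with a variable length code. The signed Exp-Golomb code is used here only as a stand-in; the VLC tables of a particular standard are not reproduced, and the function names are hypothetical.

```python
def signed_exp_golomb(v):
    """Signed Exp-Golomb codeword for integer v, returned as a bit string.
    Values are mapped 0, 1, -1, 2, -2, ... to code numbers 0, 1, 2, 3, 4, ..."""
    k = 2 * v - 1 if v > 0 else -2 * v
    bits = bin(k + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def encode_mv(c, p):
    """Code the motion vector c = (cx, cy) differentially against predictor p."""
    return signed_exp_golomb(c[0] - p[0]) + signed_exp_golomb(c[1] - p[1])


print(encode_mv((3, -1), (3, -1)))  # 11      (perfect prediction: 2 bits)
print(encode_mv((4, -1), (3, 0)))   # 010011  (residual (1, -1): 6 bits)
```

The shorter codeword for the exact prediction illustrates why a better predictor lowers the bitrate spent on motion information.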
It is a first object of the invention to propose a method for coding motion vectors that includes an improved prediction of these motion vectors.
To this end, the invention relates to a coding method such as defined in the introductory part of the description and which is moreover characterized in that, for each current block, the predictor used in the subtraction operation of said differential encoding technique is a spatio-temporal predictor P obtained by means of a linear combination defined by a relation of the type:
P = α·S + β·T
where S and T are spatial and temporal predictors respectively, and (α, β) are weighting coefficients respectively associated to said spatial and temporal predictors.
In an advantageous implementation of the invention, the criterion for the choice of the weighting coefficients is to minimize the distortion between the motion vector C to be coded and its predictor P in the least mean squares sense, i.e. to minimize the following operator:
F = Σ [C − (α·S + β·T)]²,
where the summation is done on the entire motion vector field, i.e. for all the blocks of the current frame.
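The minimization of F admits a closed-form solution, sketched below under stated assumptions: the square of a vector is taken as its squared Euclidean norm, the two normal equations are solved by Cramer's rule, and the fallback to a purely spatial prediction when the system is singular is a choice made for this sketch only. The function name is hypothetical.

```python
def fit_weights(C, S, T):
    """Return (alpha, beta) minimizing sum |C_i - (alpha*S_i + beta*T_i)|^2
    over all blocks i. C, S and T are equal-length lists of (x, y) vectors."""
    def dot(A, B):
        return sum(a[0] * b[0] + a[1] * b[1] for a, b in zip(A, B))

    ss, st, tt = dot(S, S), dot(S, T), dot(T, T)
    cs, ct = dot(C, S), dot(C, T)
    # Normal equations:  ss*alpha + st*beta = cs
    #                    st*alpha + tt*beta = ct
    det = ss * tt - st * st
    if det == 0:
        # Singular system (e.g. S and T proportional): assumed fallback.
        return 1.0, 0.0
    alpha = (cs * tt - ct * st) / det
    beta = (ct * ss - cs * st) / det
    return alpha, beta


S = [(1, 0), (0, 1), (2, 2)]
T = [(0, 1), (1, 1), (1, 0)]
C = [(0.75 * s[0] + 0.25 * t[0], 0.75 * s[1] + 0.25 * t[1]) for s, t in zip(S, T)]
alpha, beta = fit_weights(C, S, T)  # close to (0.75, 0.25)
```

Since the summation runs over the entire motion vector field, a single pair of coefficients is obtained per frame.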
Preferably, the spatial predictor is obtained by applying a median filtering on a set of motion vector candidates chosen in the neighbourhood of the current block, said set of motion vector candidates comprising three motion vector candidates if a spatial prediction compliant with the MPEG-4 standard is required.
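A median filtering over three candidates may be sketched as follows. The choice of the left, above and above-right neighbours, the component-wise median, and the replacement of unavailable neighbours by the zero vector are assumptions modelled on common MPEG-4 practice; the border rules of the standard are more detailed and are not reproduced here.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def spatial_predictor(field, bx, by):
    """Component-wise median of three neighbouring motion vectors.
    field[row][col] holds the (x, y) vector of each block already coded
    in the current frame, or None for blocks not coded yet."""
    def mv(x, y):
        inside = 0 <= y < len(field) and 0 <= x < len(field[0])
        if inside and field[y][x] is not None:
            return field[y][x]
        return (0, 0)  # assumed substitute for an unavailable neighbour

    cands = [mv(bx - 1, by), mv(bx, by - 1), mv(bx + 1, by - 1)]
    return (median3(cands[0][0], cands[1][0], cands[2][0]),
            median3(cands[0][1], cands[1][1], cands[2][1]))


field = [[(1, 0), (2, 2), (5, -1)],
         [(3, 1), None, None]]
print(spatial_predictor(field, 1, 1))  # (3, 1)
```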
The temporal predictor may be determined in either of two ways. In the first, the spatial predictor already determined for the motion vector of the current block is re-used to point to the block inside the previously transmitted motion vector field. In the second, the spatial predictor candidates used during the computation of the spatial predictor are kept in memory and used to point, from the corresponding blocks in the current image, to blocks of the previous image whose motion vectors may also be viewed as spatial predictors for the temporal predictor to be determined; a median filtering of these spatial predictors is then implemented inside the previous motion vector field, the obtained result being said temporal predictor to be determined.
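The first of these options, in which the spatial predictor is re-used as a pointer into the previous motion vector field, might look like the sketch below. Several details are assumptions of the sketch: vectors are in integer pixels, blocks are 16 x 16, the pointed block is the one containing the displaced block centre, positions outside the field are clamped to its border, and the vector stored for the pointed block is taken directly as the temporal predictor.

```python
def temporal_predictor(prev_field, bx, by, s, block=16):
    """Temporal predictor for block (bx, by), obtained by pointing with the
    spatial predictor s = (sx, sy) into the previous motion vector field.
    prev_field[row][col] holds the (x, y) vector of each previous block."""
    rows, cols = len(prev_field), len(prev_field[0])
    # Centre of the current block, displaced by s, in pixel coordinates.
    px = bx * block + block // 2 + s[0]
    py = by * block + block // 2 + s[1]
    # Block containing that pixel, clamped to the field (assumed border rule).
    tx = min(max(px // block, 0), cols - 1)
    ty = min(max(py // block, 0), rows - 1)
    return prev_field[ty][tx]


prev_field = [[(0, 0), (1, 1), (2, 2)],
              [(3, 3), (4, 4), (5, 5)]]
print(temporal_predictor(prev_field, 1, 0, (12, 10)))   # (5, 5)
print(temporal_predictor(prev_field, 1, 0, (-40, -40)))  # (0, 0)
```

The second option would apply the same pointing step to each stored spatial candidate and then take the median of the vectors found.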
It is another object of the invention to propose a method of decoding motion information coded by means of said coding method.
To this end, the invention relates to a method of decoding motion information corresponding to an image sequence and which has been previously, before a transmission and/or storage step, coded by means of a coding method comprising the steps of:
subdividing the current image into bidimensional blocks;
for each current block of the current image, selecting in a previous image, by means of a block-matching algorithm, a shifted block as the prediction of said current block, the motion vector between said shifted and current blocks being the predicted vector associated to said current block and all the motion vectors similarly predicted for a whole current image constituting a motion vector field associated to said current image;
for each current image, coding the motion information constituted by said associated motion vector field, the motion vector C to be coded for each current block being approximated by a spatio-temporal predictor P obtained by means of a linear combination defined by a relation of the type:
P = α·S + β·T
where S and T are spatial and temporal predictors respectively, and (α, β) are weighting coefficients respectively associated to said spatial and temporal predictors, said decoding method being characterized in that it comprises two types of decoding step:
for the first motion vector field of the sequence, a first type of decoding step only based on spatial predictors;
for the other motion vector fields, a second type of decoding step comprising a computation of the spatio-temporal predictor P on the basis of the motion vectors of the previous motion vector field already decoded, spatial predictors defined in the neighbourhood of the current motion vector to be decoded, and the transmitted weighting coefficients α and β.
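The two types of decoding step may be sketched together as follows, purely as an illustration. The raster scan order, the three-candidate median with zero substitution at borders, the 16 x 16 block size, the pointing rule for the temporal predictor and the rounding of P to integers with Python's round() are all assumptions of this sketch; an encoder would have to apply exactly the same rules for the reconstruction to match.

```python
def median3(a, b, c):
    return sorted((a, b, c))[1]


def spatial(field, bx, by):
    """Median of left, above and above-right vectors already decoded."""
    def mv(x, y):
        inside = 0 <= y < len(field) and 0 <= x < len(field[0])
        return field[y][x] if inside and field[y][x] is not None else (0, 0)
    c = [mv(bx - 1, by), mv(bx, by - 1), mv(bx + 1, by - 1)]
    return (median3(c[0][0], c[1][0], c[2][0]),
            median3(c[0][1], c[1][1], c[2][1]))


def temporal(prev_field, bx, by, s, block=16):
    """Vector of the previous-field block pointed to by s (assumed rule)."""
    tx = (bx * block + block // 2 + s[0]) // block
    ty = (by * block + block // 2 + s[1]) // block
    tx = min(max(tx, 0), len(prev_field[0]) - 1)
    ty = min(max(ty, 0), len(prev_field) - 1)
    return prev_field[ty][tx]


def decode_field(residuals, prev_field=None, alpha=1.0, beta=0.0):
    """Rebuild one motion vector field from its decoded residuals.
    prev_field is None for the first field of the sequence."""
    rows, cols = len(residuals), len(residuals[0])
    field = [[None] * cols for _ in range(rows)]
    for by in range(rows):
        for bx in range(cols):
            s = spatial(field, bx, by)
            if prev_field is None:
                p = s  # first type of step: spatial predictor only
            else:
                t = temporal(prev_field, bx, by, s)
                # Second type of step: P = alpha*S + beta*T, rounded.
                p = (round(alpha * s[0] + beta * t[0]),
                     round(alpha * s[1] + beta * t[1]))
            r = residuals[by][bx]
            field[by][bx] = (r[0] + p[0], r[1] + p[1])
    return field


first = decode_field([[(2, 1), (0, 0)], [(1, -1), (0, 2)]])
zero = [[(0, 0), (0, 0)], [(0, 0), (0, 0)]]
second = decode_field(zero, prev_field=first, alpha=0.0, beta=1.0)
```

In this usage example the second field is decoded with zero residuals and a purely temporal weighting, so it reproduces the first field.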