The invention relates to a method of encoding a sequence of pictures, each picture being partitioned into non-overlapping blocks of pixels.
The invention also relates to a filtering device for carrying out such a method.
The International Organization for Standardization has defined, in the MPEG-4 standard, requirements to be satisfied by devices dealing with interactive multimedia applications. This standard first defines the concept of a Video Object Plane (VOP) as an entity directly accessible from the bitstream. A VOP may be a basic graphic or an audio primitive. The encoding of a picture therefore consists of the successive encoding of the VOPs present in the picture.
A sequence of pictures may be composed of I-frames, P-frames and B-frames. An I-frame is coded in Intra mode, using spatial redundancy within the picture without reference to any other frame. In addition to the spatial redundancy exploited for an I-frame, the coding of a P-frame uses temporal redundancy between the P-frame and a previous picture used as a reference picture, which is usually the previous I- or P-frame. A B-frame has two temporal references and is usually predictively encoded from a previous P- or I-frame and the next I- or P-frame, which has already been encoded and reconstructed.
The MPEG-4 standard defines four prediction modes for the encoding of a picture with reference to a past reference frame and a future reference frame. A first prediction mode is direct coding. This prediction mode uses bidirectional motion compensation derived from the H.263 approach: it takes the motion vectors derived for macroblocks of the future reference frame and scales them to derive forward and backward motion vectors for blocks in the picture to be encoded. A second prediction mode is forward coding, which uses forward motion compensation in the same manner as in MPEG-1/2, with the difference that a VOP is used for prediction instead of a picture. A third prediction mode is backward coding, which uses backward motion compensation in the same manner as in MPEG-1/2, with the difference that a VOP is used for prediction instead of a picture. A fourth prediction mode is bidirectional coding, which uses interpolated motion compensation in the same manner as in MPEG-1/2, with the difference that a VOP is used for prediction instead of a picture.
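The scaling used by the direct mode can be illustrated with a minimal sketch. It assumes that the co-located motion vector is scaled in proportion to temporal distances, as is commonly done in H.263-style direct prediction, and that no correction (delta) vector is applied. The names `trb` (distance from the past reference to the current picture) and `trd` (distance between the two references) and the function name are illustrative assumptions, and the exact rounding prescribed by the standard is not reproduced.

```python
def scale_direct_vectors(mv, trb, trd):
    """Scale a co-located motion vector into forward and backward vectors.

    mv  : (x, y) vector of the co-located block in the future reference frame,
          pointing into the past reference frame.
    trb : temporal distance between the past reference and the current picture
          (assumed name).
    trd : temporal distance between the past and future references
          (assumed name).
    Returns (forward_vector, backward_vector) as tuples of floats.
    """
    mvx, mvy = mv
    # Forward vector: the fraction of the motion already covered at the current picture.
    forward = (trb * mvx / trd, trb * mvy / trd)
    # Backward vector: the remaining motion, seen from the future reference (opposite sign).
    backward = ((trb - trd) * mvx / trd, (trb - trd) * mvy / trd)
    return forward, backward


# Example: a B-picture one interval after the past reference, references three intervals apart.
mv_forward, mv_backward = scale_direct_vectors((6, -3), trb=1, trd=3)
```

With these inputs the forward vector is one third of the co-located vector and the backward vector is minus two thirds of it.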
The MPEG-4 Video Verification Model version 10.0, ISO/IEC JTC1/SC29/WG11, of February 1998 discloses a strategy for selecting a particular prediction mode among the four possible ones for the encoding of a B-VOP. For a B-block, an estimate of the prediction error, the sum of absolute differences (SAD) in this document, is derived for each of the four prediction modes, and the prediction mode giving the smallest SAD is chosen for the encoding of the B-block. The main disadvantage of this strategy is its high computational cost.
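The exhaustive strategy described above can be sketched as follows. This is only an illustration under simplifying assumptions: blocks are represented as lists of rows of integer pixel values, the four prediction blocks are assumed to be already available, and the function names are hypothetical rather than taken from the Verification Model.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two blocks of equal size."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def choose_mode_exhaustive(block, predictions):
    """Evaluate the SAD of every candidate mode and keep the smallest.

    predictions maps a mode name to its prediction block.
    Returns (best_mode, sads), where sads maps each mode to its SAD.
    """
    sads = {mode: sad(block, pred) for mode, pred in predictions.items()}
    best_mode = min(sads, key=sads.get)
    return best_mode, sads


# Toy 2x2 example with made-up pixel values.
current = [[10, 12], [14, 16]]
candidates = {
    "direct": [[10, 12], [14, 17]],
    "forward": [[9, 12], [14, 18]],
    "backward": [[10, 10], [10, 10]],
    "bidirectional": [[10, 12], [14, 16]],
}
best, all_sads = choose_mode_exhaustive(current, candidates)
```

Every block requires four prediction blocks and four SAD evaluations before a mode can be selected, which is the cost the invention seeks to reduce.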
It is therefore an object of the invention to provide a more efficient method of coding, giving a good trade-off between speed and coding quality.
To this end, a method such as described in the introduction comprises, for a block belonging to a picture to be encoded on the basis of a past reference frame and a future reference frame, hereafter referred to as a block to be encoded, at least the steps of:
deriving for a block in the future reference frame with the same location as the block to be encoded, an optimum motion vector on the basis of the past reference frame and a corresponding optimum prediction block in the past reference frame;
deriving the sum of absolute differences between the block in said future reference frame with the same location as the block to be encoded and the optimum prediction block in the past reference frame, hereafter referred to as SADref;
deriving for the block to be encoded, a forward motion vector (MVf) on the basis of the optimum motion vector and a corresponding forward prediction block in the past reference frame;
deriving the sum of absolute differences between the block to be encoded and the forward prediction block, hereafter referred to as SADf;
deriving for the block to be encoded, a backward motion vector on the basis of the optimum motion vector and a corresponding backward prediction block in the future reference frame;
deriving the sum of absolute differences between the block to be encoded and the backward prediction block, hereafter referred to as SADb;
encoding the block to be encoded according to a direct prediction mode if one of the three following conditions is satisfied:
the spatial coordinates of the optimum motion vector are within a given range;
the deviation of SADref towards SADb is smaller than a given threshold;
the deviation of SADref towards SADf is smaller than a given threshold.
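The decision step above can be sketched as follows. The text does not fix how the range or the deviation is measured, so this sketch makes several assumptions: the range test is applied to the absolute value of each vector component, the deviation is taken as the absolute difference between the two SAD values, and separate thresholds are allowed for the two comparisons. All parameter names are illustrative.

```python
def prefer_direct(mv_opt, sad_ref, sad_f, sad_b, mv_range, threshold_f, threshold_b):
    """Return True if the block should be encoded in direct prediction mode.

    mv_opt      : (x, y) optimum motion vector of the co-located block.
    sad_ref     : SADref, co-located block versus its optimum prediction block.
    sad_f       : SADf, block to be encoded versus the forward prediction block.
    sad_b       : SADb, block to be encoded versus the backward prediction block.
    mv_range    : assumed bound on each vector component.
    threshold_f : assumed threshold for the SADref/SADf comparison.
    threshold_b : assumed threshold for the SADref/SADb comparison.
    """
    mvx, mvy = mv_opt
    # Condition 1: the optimum motion vector lies within the given range.
    small_motion = abs(mvx) <= mv_range and abs(mvy) <= mv_range
    # Condition 2: SADref and SADb are close (deviation assumed to be |difference|).
    close_to_backward = abs(sad_ref - sad_b) < threshold_b
    # Condition 3: SADref and SADf are close.
    close_to_forward = abs(sad_ref - sad_f) < threshold_f
    return small_motion or close_to_backward or close_to_forward


# Made-up values: large motion, but SADf is close to SADref, so direct mode is kept.
use_direct = prefer_direct((10, 10), 1000, 1050, 5000,
                           mv_range=2, threshold_f=100, threshold_b=100)
```

When the function returns False, an encoder would fall back to comparing the remaining prediction modes; that fallback is not shown here.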
Such a method favors the direct prediction mode when justified, so as to avoid the computation of the forward, backward and bidirectional prediction modes when possible. Compared with the method proposed by the prior art, when the direct mode is chosen there is no need for a prior calculation of the sum of absolute differences associated with the direct mode, which is computationally expensive. An advantage of the invention is greater speed in deciding on a suitable prediction mode, because of the reduced calculation cost.