The invention relates to a method of encoding a source sequence of pictures comprising the steps of:
dividing the source sequence into a set of groups of pictures, each group of pictures comprising a first frame, hereafter referred to as I-frame, followed by at least a pair of frames, hereafter referred to as PB-frames;
dividing each I-frame and PB-frame into spatially non-overlapping blocks of pixels;
encoding the blocks of said I-frame, hereafter referred to as the I-blocks, independently from any other frame in the group of pictures;
deriving motion vectors and corresponding predictors for the blocks from the temporally second frame of said PB-frame, hereafter referred to as the P-blocks, based on the I-blocks in the previous I-frame or the P-blocks in the previous PB-frame;
predictively encoding the P-blocks based on the I-blocks in the previous I-frame or the P-blocks in the previous PB-frame;
predictively encoding the blocks of the first frame of said PB-frame, hereafter referred to as the B-blocks.
The invention also relates to a system for carrying out said method.
The invention may be used, for example, in video coding at a very low bit rate.
The ITU (International Telecommunication Union) standardization of low bit rate video telephony products and technology is compiled in the standards H.320 and H.324. These standards describe all the requirements to be satisfied by the different components: audio, video, multiplexer, control protocol and modem. H.320 is dedicated to videoconferencing or videophony over ISDN (Integrated Services Digital Network) phone lines. H.324 is aimed at videophony over GSTN (General Switched Telephone Network) analog phone lines. Both standards support Recommendation H.263 for video coding, which describes compression of low bit rate video signals. Recommendation H.263 comprises four optional modes for a video coder. One of these optional modes is called the PB-frames mode, which gives a way of encoding a PB-frame. A second version of Recommendation H.263, called H.263+, was developed to improve the image quality and comprises some new options. One of these, called the Improved PB-frames mode, is an improvement of the original PB-frames mode and provides a new way of encoding a PB-frame.
A sequence of picture frames may be composed of a series of I-frames and PB-frames. An I-frame comprises a picture coded according to an Intra mode, which means that an I-frame is coded using spatial redundancy within the picture without any reference to another picture. A P-frame is predictively encoded from a previous P or I-picture. Thus, when coding a P-picture, temporal redundancy between the P-picture and a previous picture used as a picture reference, which is mostly the previous I or P-picture, is used in addition to the spatial redundancy as for an I-picture. A B-picture has two temporal references and is usually predictively encoded from the previous reconstructed P or I-picture and the P-picture currently being reconstructed. A PB-frame comprises two successive pictures, a first B-frame and a subsequent P-frame, coded as one unit.
A method of coding a PB-frame in accordance with the PB-frames mode is illustrated in FIG. 1. It shows a PB-frame composed of a B-frame B and a P-frame P2. The B-frame B is surrounded by a previous P-picture P1 and the P-picture P2 currently being reconstructed. In this example P1 is shown as a P-picture, but it may also be an I-picture; it serves as the picture reference for the encoding of the P-picture P2 and the B-picture B. In the PB-frames mode, a B-block of the B-frame can be subjected to forward or bidirectional predictive encoding. Forward predictive coding of a B-block is based on the previous I or P-picture P1, whereas bidirectional predictive coding is based on both the previous I or P-picture P1 and the P-picture P2 currently being reconstructed. A set of motion vectors MV is derived for the P-picture P2 of the PB-frame with reference to the picture P1: for each macro block of P2, a macro block of P1 is associated by block matching and a corresponding motion vector MV is derived. Motion vectors for the B-block are derived from the set of motion vectors previously derived for P2. Therefore, a forward motion vector MVf and a backward motion vector MVb are calculated for a B-block as follows:
MVf = (TRb × MV) / TRd    (1)
MVb = ((TRb − TRd) × MV) / TRd    (2)
MVb = MVf − MV    (3)
where
TRb is the increment in the temporal reference of the B-picture from the previous P-frame P1, and
TRd is the increment in the temporal reference of the current P-frame P2 from the previous I or P-picture P1.
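Relations (1) to (3) can be illustrated by the following minimal sketch. The function name derive_b_vectors and the use of exact fractions are assumptions made for illustration only; the text above does not specify how the division is rounded in an actual coder.

```python
from fractions import Fraction


def derive_b_vectors(mv, trb, trd):
    """Derive (MVf, MVb) for a B-block from the P-block vector MV.

    mv  : (x, y) motion vector of the co-located P-block
    trb : temporal increment of the B-picture from P1
    trd : temporal increment of P2 from P1
    Exact fractions are used here (an assumption); rounding is left open.
    """
    if trd <= 0:
        raise ValueError("TRd must be positive")
    mvx, mvy = mv
    # Relation (1): MVf = (TRb x MV) / TRd
    mvf = (Fraction(trb * mvx, trd), Fraction(trb * mvy, trd))
    # Relation (2): MVb = ((TRb - TRd) x MV) / TRd
    mvb = (Fraction((trb - trd) * mvx, trd), Fraction((trb - trd) * mvy, trd))
    return mvf, mvb


# B-picture halfway between P1 and P2 (TRb = 1, TRd = 2)
mvf, mvb = derive_b_vectors((6, -4), trb=1, trd=2)
# Relation (3) holds as well: MVb = MVf - MV
assert mvb == (mvf[0] - 6, mvf[1] - (-4))
```

With these example values the forward vector is half of MV and the backward vector is its opposite, which is what relations (1) and (3) predict for a B-picture located midway between P1 and P2.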
FIG. 1 shows a macro block AB of the B-picture. This macro block AB has the same location as a macro block A2B2, Prec, of P2 that was previously reconstructed. A forward motion vector MV is associated with the macro block A2B2 from a macro block A1B1, which belongs to P1. A forward motion vector MVf and a backward motion vector MVb, both associated with AB, are derived from MV as shown in relations (1) to (3). The macro blocks of P1 and P2 associated with the macro block AB by the forward vector MVf and by the backward vector MVb are K1M1 and K2M2 respectively, as illustrated in FIG. 1.
The choice between bidirectional prediction and forward prediction is made at the block level in the B-picture and depends on where MVb points. An MB part of the B-block AB, for which MVb points inside Prec, is bidirectionally predicted, and the prediction for this part of the B-block is:
MB(i,j) = [A1M1(i,j) + A2M2(i,j)] / 2    (4)
where i and j are the spatial coordinates of the pixels.
An AM part of the B-block AB, for which MVb points outside Prec, is forward-predicted and the prediction for this part of the B-block AB is:
AM(i,j) = K1A1(i,j)    (5)
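The per-pixel choice expressed by relations (4) and (5) can be sketched as follows. The function name predict_b_block and the boolean mask inside_prec (true where MVb points inside Prec) are hypothetical and introduced only for illustration; the rounding of the average is not specified above, so plain division is used.

```python
def predict_b_block(forward_pred, backward_pred, inside_prec):
    """Build the prediction of a B-block pixel by pixel.

    forward_pred  : 2-D list, pixels fetched from P1 through MVf
    backward_pred : 2-D list, pixels fetched from P2 through MVb
    inside_prec   : 2-D list of bool, True where MVb points inside Prec
    """
    prediction = []
    for f_row, b_row, m_row in zip(forward_pred, backward_pred, inside_prec):
        row = []
        for f, b, inside in zip(f_row, b_row, m_row):
            if inside:
                # Relation (4): average of forward and backward predictions
                row.append((f + b) / 2)
            else:
                # Relation (5): forward prediction only
                row.append(f)
        prediction.append(row)
    return prediction


forward = [[10, 20], [30, 40]]
backward = [[20, 40], [50, 60]]
mask = [[False, True], [False, True]]  # right column lies inside Prec
pred = predict_b_block(forward, backward, mask)
```

In this example the left column is forward-predicted and the right column is the average of both predictions.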
An improved method of encoding a PB-frame according to the PB-frames mode is described in European Patent Application EP 0 782 343 A2. It discloses a predictive method of coding the blocks in the bidirectionally predicted frame, which method introduces a delta motion vector added to or subtracted from the derived forward and backward motion vectors respectively. The described method may be relevant when the motion in a sequence of pictures is non-linear. However, it is totally unsuitable for a sequence of pictures in which scene cuts occur. Indeed, when there is a scene cut between a previous P-frame and the B-part of a PB-frame, bidirectional and forward prediction yield an erroneous coding. Besides, the implementation of the delta vector, which is costly in terms of CPU burden, may result in unnecessary, expensive and complicated calculations.
It is an object of the invention to improve the efficiency of existing coding methods while decreasing the CPU burden and, more particularly, to provide an efficient strategy or method which makes it possible to choose the most suitable prediction mode for the coding of a given macro block of a B-frame.
Thus, the encoding of the B-blocks comprises for each B-block in series the steps of:
deriving the minimum of the sum of absolute difference for the B-block based on the I-blocks in the previous I-frame or on the P-blocks in the previous PB-frame, hereafter referred to as SADf;
deriving the sum of absolute difference for the B-block and the P-block in the P-frame of the PB-frame with the same location as the B-block, hereafter referred to as SADb;
when SADf is greater than SADb, predictively encoding the B-block based on the P-blocks of the second frame of the PB-frame;
when SADf is lower than SADb:
deriving, for the P-block with the same location as the B-block, the difference between said motion vector and said predictor;
when the difference obtained is greater than a predetermined threshold, predictively encoding the B-block based on the I-blocks or the P-blocks in the previous PB-frame;
when the difference obtained is smaller than the predetermined threshold, predictively encoding the B-block based on the P-blocks of the second frame of the PB-frame and the I-blocks or the P-blocks in the previous PB-frame.
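The decision steps listed above can be summarised by the following minimal sketch. The function name choose_b_mode, the mode labels, and the use of the sum of absolute component differences as the measure of the difference between motion vector and predictor are assumptions made for illustration. The steps above do not say what happens when SADf equals SADb or when the difference equals the threshold; this sketch arbitrarily treats both ties as the bidirectional branch.

```python
def choose_b_mode(sad_f, sad_b, mv, predictor, threshold):
    """Select the prediction mode for one B-block.

    sad_f     : minimum SAD against the previous I- or P-blocks (SADf)
    sad_b     : SAD against the co-located P-block of the PB-frame (SADb)
    mv        : (x, y) motion vector of the co-located P-block
    predictor : (x, y) predictor of that motion vector
    threshold : predetermined threshold on the vector difference
    """
    if sad_f > sad_b:
        # Prediction from the P-blocks of the second frame only
        return "backward"
    # SADf is lower than SADb (tie handled here by assumption):
    # test motion coherence with an L1 difference (an assumption).
    diff = abs(mv[0] - predictor[0]) + abs(mv[1] - predictor[1])
    if diff > threshold:
        # Incoherent motion: prediction from the previous frame only
        return "forward"
    # Coherent motion: prediction from both frames
    return "bidirectional"


mode = choose_b_mode(sad_f=50, sad_b=100, mv=(1, 0), predictor=(0, 0), threshold=2)
```

Note that no bidirectional SAD is ever computed in this sketch: the bidirectional mode is selected from the motion vector difference alone.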
For the coding of a B-block, the method claimed gives a strategy for choosing the prediction mode to be used among the forward, backward and bidirectional modes. The choice is based on SAD (Sum of Absolute Difference) calculation and motion vector coherence. The strategy relies on a specific order in the comparisons of the SAD values for the three prediction modes and on the introduction of motion coherence. This motion vector coherence criterion makes it possible to select bidirectional prediction without calculating SADbidirectional, which is CPU-consuming. The proposed method has the main advantage of not favoring bidirectional prediction, and it allows backward prediction to be performed when there is no motion. Thus, the method leads to a suitable choice of prediction mode for a given block of a B-frame.
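As an illustration of the SAD calculation on which the choice is based, the following minimal sketch computes the sum of absolute differences between two blocks, and the minimum of that sum over a set of candidate reference blocks. The function names sad and min_sad, and the representation of blocks as 2-D lists of pixel values, are assumptions; how the candidate blocks are gathered (search window, block size) is not specified here.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def min_sad(block, candidates):
    """Minimum SAD of a block over a list of candidate reference blocks."""
    return min(sad(block, candidate) for candidate in candidates)


b_block = [[1, 2], [3, 4]]
candidates = [
    [[1, 2], [3, 5]],  # differs by 1 in one pixel
    [[0, 0], [0, 0]],  # differs by 1 + 2 + 3 + 4
]
best = min_sad(b_block, candidates)
```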
In a preferred embodiment, the method according to the invention may be carried out by a system constituted by wired electronic circuits that perform the various steps of the proposed method. The method may also be partly performed by means of a set of instructions stored in a computer-readable medium.