Digital images and image sequences occupy a great deal of memory space. When transmitting such images, it is therefore necessary to compress them in order to avoid congestion in the communications network used for the transmission, since the bit rate available on this network is generally limited. This compression is also desirable for the storage of these pieces of data.
There already exist numerous known video data compression techniques. Among them, many video-encoding techniques, especially the H.264 technique, use spatial or temporal prediction of groups of blocks of pixels of a current image relative to other groups of blocks of pixels belonging to the same image or to a previous or following image.
More specifically, according to this H.264 technique, I images are encoded by spatial prediction (intra prediction), and P and B images are encoded by temporal prediction (inter prediction) relative to other I, P or B images, encoded and decoded by means of motion compensation.
These images are sub-divided into blocks comprising a set of pixels (for example 8×8 pixels). For each block, a residual block, also called a prediction residue, is encoded, corresponding to the original block minus a prediction. After this predictive encoding, the blocks of pixels are transformed by a discrete-cosine-transform type of transform and then quantized. The coefficients of the quantized blocks are then scanned in a reading order that makes it possible to exploit the large number of zero coefficients at high frequencies, and are then encoded by entropy encoding.
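The encoding chain just described (residual, transform, quantization, zig-zag scan) can be sketched as follows. This is an illustrative model only: it uses an orthonormal DCT, a single hypothetical quantization step, and a 4×4 block, not the actual H.264 integer transform or quantization tables.

```python
import math

N = 4  # illustrative block size

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (illustrative, not the H.264 integer transform)."""
    m = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        m.append([c * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def encode_block(original, prediction, qstep=8):
    # 1. prediction residue: the original block minus its prediction
    residual = [[original[i][j] - prediction[i][j] for j in range(N)] for i in range(N)]
    # 2. 2-D transform of the residual: D * R * D^T
    D = dct_matrix(N)
    coeffs = matmul(matmul(D, residual), transpose(D))
    # 3. uniform quantization (qstep is a hypothetical choice)
    quant = [[int(round(c / qstep)) for c in row] for row in coeffs]
    # 4. zig-zag scan: groups the many zero high-frequency coefficients together
    order = sorted(((i, j) for i in range(N) for j in range(N)),
                   key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [quant[i][j] for i, j in order]
```

For a block identical to its prediction, the scan is all zeros; a constant residual produces a single non-zero DC coefficient at the head of the scan, which is what the scan order is designed to exploit.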
According to the H.264 technique for example, for each block the following are encoded:
- the encoding type (intra prediction, inter prediction, default or skip prediction for which no information is transmitted to the decoder);
- the type of partitioning;
- the information on the prediction (orientation, reference image, etc.);
- the motion information, if necessary;
- the encoded coefficients;
- etc.
The decoding is done image by image and, for each image, block by block. For each block, the corresponding elements of the stream are read. The inverse quantization and the inverse transformation of the coefficients of the blocks are performed. Then, the prediction of the block is computed and the block is reconstructed by adding the prediction to the decoded prediction residue.
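The per-block decoding steps can be sketched as the mirror of the encoding chain: inverse quantization, inverse transform, then reconstruction by adding the prediction. As before, the orthonormal DCT and the quantization step are illustrative stand-ins, not the actual H.264 integer transform.

```python
import math

N = 4  # illustrative block size

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (illustrative only)."""
    m = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        m.append([c * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)])
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def decode_block(quantized, prediction, qstep=8):
    # 1. inverse quantization: scale the coefficients back up
    coeffs = [[c * qstep for c in row] for row in quantized]
    # 2. inverse transform: D^T * C * D recovers the decoded prediction residue
    D = dct_matrix(N)
    residual = matmul(matmul(transpose(D), coeffs), D)
    # 3. reconstruction: add the prediction to the decoded residue
    return [[int(round(prediction[i][j] + residual[i][j])) for j in range(N)]
            for i in range(N)]
```

A block whose only non-zero coefficient is the DC term decodes to a constant residue, added uniformly to the prediction.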
The H.264/MPEG-4 AVC standard thus proposes an encoding implementing a motion vector prediction defined from the median of the components of the motion vectors of the neighboring blocks. For instance, the motion vector used for a block encoded in "inter" mode is encoded by means of a predictive encoding such as the following:
- at a first stage, a prediction vector for the motion vector of the block considered is set up. Typically, such a vector, known as a median predictor, is defined from the median values of the components of the motion vectors of already-encoded neighboring blocks;
- at a second stage, the prediction error, i.e. the difference between the motion vector of the current block and the previously established prediction vector, is encoded.
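The two stages above can be sketched minimally as follows. The neighbor set and function names are illustrative; the actual standard draws the predictor from specific neighboring partitions (left, top, top-right) with a number of special cases not modeled here.

```python
def median(values):
    """Middle element of an odd-sized list of component values."""
    s = sorted(values)
    return s[len(s) // 2]

def median_predictor(neighbor_mvs):
    """Per-component median of the neighboring blocks' motion vectors."""
    return (median([mv[0] for mv in neighbor_mvs]),
            median([mv[1] for mv in neighbor_mvs]))

def encode_mv(current_mv, neighbor_mvs):
    # Stage 1: set up the prediction vector (the median predictor).
    px, py = median_predictor(neighbor_mvs)
    # Stage 2: encode only the prediction error (the difference).
    return (current_mv[0] - px, current_mv[1] - py)
```

With neighbors (2, 1), (4, 3) and (3, 5), the median predictor is (3, 3), so a current vector of (4, 4) is transmitted as the small residual (1, 1).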
An extension of this technique of motion vector prediction is proposed by J. Jung and G. Laroche in the document "Competition-Based Scheme for Motion Vector Selection and Coding", ITU-T VCEG, AC06, July 2006.
This technique consists in placing several predictors, or candidate prediction vectors (beyond the median predictor used by AVC), in competition, and in indicating which vector, of the set of candidate prediction vectors, is the one effectively used.
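A hedged sketch of this competition idea: several candidate prediction vectors compete, and the index of the winner is signalled along with the prediction error. The L1 cost used below is a simple stand-in for the rate cost a real encoder would measure; it is not the selection criterion of the cited proposal.

```python
def select_predictor(current_mv, candidates):
    """Return (winner_index, prediction_error) among candidate prediction vectors.

    'Best' here minimizes the L1 norm of the residual vector, a hypothetical
    proxy for the true signaling cost.
    """
    scored = []
    for idx, (px, py) in enumerate(candidates):
        err = (current_mv[0] - px, current_mv[1] - py)
        scored.append((abs(err[0]) + abs(err[1]), idx, err))
    _, idx, err = min(scored)  # ties resolved by the lowest index
    return idx, err
```

For a current vector (5, 5) and candidates (0, 0), (4, 4) and (5, 6), the third candidate wins with a residual of (0, -1); its index and that residual would be written to the stream.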
These compressive encoding techniques are efficient but are not optimal for compressing images comprising areas of homogeneous texture. Indeed, in the H.264/MPEG-4 AVC standard, the spatial prediction of a block in an image relative to another block of the same image is possible only if this other block is a neighbor of the block to be predicted and is located in certain predetermined directions relative to it, generally above and to the left, in a neighborhood known as a "causal" vicinity. Similarly, the prediction of the motion vectors of a block of an image is a causal prediction relative to the motion vectors of neighboring blocks.
A technique of encoding using extended block sizes has been proposed by P. Chen, Y. Ye and M. Karczewicz in the document "Video Coding Using Extended Block Sizes", ITU-T COM16-C123, January 2009, as an extension of a block-based hybrid video encoder (e.g. the AVC scheme).
According to this document, the use of extended-size blocks makes it possible, in motion-compensation encoding modes, to limit the cost of encoding motion information by encoding a single motion vector for a homogeneous zone of greater extent than a block of predetermined size. In addition, an extended-size block enables a transform with extended support to be applied to the motion-compensation residue. Such an extended transform also provides a compression gain through a greater decorrelation capacity, as well as efficient signaling of zero residues. Extended-size blocks are especially beneficial for the encoding of high-resolution video sequences and are traditionally placed in competition with classic-sized blocks.
Thus, the use of such an extended-size block to represent the motion information yields a gain in compression efficiency, because only one motion vector is encoded for the extended-size block; this is well suited to zones where the motion is constant. The motion vector is therefore assumed to be constant within the extended-size block. However, this uniform motion mode is restrictive in the case of motions that are not constant but have similar characteristics. For example, for an orange-colored zone of uniform texture, this technique will represent a constant motion over the homogeneous zone, whereas certain blocks or sub-blocks of this zone, a sub-block being a subset of a block of pixels, can show disparities of motion linked, for example, to the movement of a fabric.
The solution typically proposed to overcome this problem implements a sub-division of the extended-size blocks into sub-blocks and defines, for each sub-block, a motion vector. However, this solution amounts to encoding as many motion vectors as there are sub-blocks in the extended-size block, and proves very costly in terms of signaling.
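The signaling cost of this sub-division can be illustrated with a back-of-the-envelope count. The 64×64 and 8×8 sizes below are hypothetical examples, not values taken from any particular standard or proposal.

```python
def num_motion_vectors(block_size, sub_block_size):
    """Motion vectors to encode when a square block is split into square sub-blocks."""
    per_side = block_size // sub_block_size
    return per_side * per_side
```

An extended 64×64 block encoded whole needs a single motion vector, whereas the same block split into 8×8 sub-blocks needs 64 of them, which is the signaling cost the passage above refers to.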
The inventors have therefore identified the need for a novel technique that obtains better performance in encoding an extended-size block while restricting the signaling cost, thus providing better compression efficiency while ensuring a faithful representation of the motion to be encoded.