Digital images and image sequences occupy a great deal of memory space, making it necessary, when transmitting these images, to compress them in order to avoid congestion in the communication network used for this transmission. Indeed, the bit rate available on this network is generally limited.
There already exist numerous video data compression techniques. Among them, numerous video-encoding techniques, especially the H.264 technique, predict the pixels of a current image relative to other pixels belonging to the same image (intra prediction) or to a previous or following image (inter prediction).
More specifically, according to this H.264 technique, I images are encoded by spatial prediction (intra prediction), while P and B images are encoded by temporal prediction relative to other I, P or B images (inter prediction), encoded/decoded by means of motion compensation for example.
To this end, the images are sub-divided into macroblocks, which are then sub-divided into blocks. A block is constituted by a set of pixels. Pieces of encoding information are then transmitted for each block.
Classically, the encoding of a block is done by means of a prediction of the block and an encoding of a prediction residue to be added to the prediction. The prediction is made from previously reconstructed information (blocks already encoded/decoded in the current image, previously coded images in the case of video encoding, etc.).
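The encoder-side computation of the prediction residue can be sketched as follows (a minimal illustration of the principle; the function name and sample values are hypothetical, not part of any standard):

```python
def prediction_residue(block, prediction):
    # Encoder side: the residue is the sample-wise difference between
    # the original block and its prediction; only this residue (after
    # transform and quantization) and the prediction information are
    # transmitted, since the decoder can rebuild the same prediction.
    return [[b - p for b, p in zip(b_row, p_row)]
            for b_row, p_row in zip(block, prediction)]

# Toy example: a 1x2 block and its prediction.
residue = prediction_residue([[5, 7]], [[4, 9]])
```

When the prediction is good, the residue values are small and cheap to encode.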
After this predictive encoding, the blocks of pixels are transformed by a discrete-cosine-type transform and then quantized. The coefficients of the quantized blocks of pixels are then scanned in a reading order that makes it possible to exploit the large number of zero coefficients among the high frequencies, and are then encoded by entropy coding.
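The quantization and scan steps can be sketched as follows (a toy illustration assuming a 4x4 block, a uniform quantization step, and a zigzag-style scan; the names and values are hypothetical and this is not the H.264 specification):

```python
def zigzag_order(n):
    # Visit an n x n block along anti-diagonals, alternating direction,
    # so that high-frequency positions come last in the scan.
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def quantize(block, step):
    # Uniform scalar quantization: small high-frequency coefficients
    # collapse to zero, which the scan order then groups at the end.
    return [[int(round(c / step)) for c in row] for row in block]

# Toy 4x4 transformed block: energy concentrated in the low frequencies
# (top-left corner), as is typical after a DCT-type transform.
coeffs = [[52, 20, 6, 1],
          [18,  9, 2, 0],
          [ 5,  2, 0, 0],
          [ 1,  0, 0, 0]]
quantized = quantize(coeffs, 8)
scanned = [quantized[i][j] for (i, j) in zigzag_order(4)]
```

After the scan, the nonzero coefficients come first and the zeros are grouped in a single run at the end, which the entropy coder can represent very compactly.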
According to the H.264 technique for example, the following are encoded for each block:
- the encoding type (intra prediction, inter prediction, or default "skip" prediction for which no information is transmitted to the decoder);
- the type of partitioning;
- the prediction information (orientation, reference image, etc.);
- the motion information if necessary;
- the encoded coefficients corresponding to the transformed residue after quantization and entropy encoding;
- etc.
The decoding is done image by image and, for each image, macroblock by macroblock. For each macroblock, the corresponding elements of the stream are read. The inverse quantization and the inverse transformation of the coefficients of the blocks of the macroblock are performed. Then, the prediction of the macroblock is computed and the macroblock is rebuilt by adding the prediction to the decoded prediction residue.
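The final reconstruction step on the decoder side can be sketched as follows (a minimal illustration assuming 8-bit samples; the function name is hypothetical):

```python
def reconstruct_block(prediction, residue):
    # Decoder side: add the decoded prediction residue to the
    # prediction, then clip each sample to the valid 8-bit range
    # [0, 255] so that rounding in the residue cannot push samples
    # outside the representable pixel values.
    return [[max(0, min(255, p + r)) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residue)]

# Toy example: the second sample would overflow without clipping.
rebuilt = reconstruct_block([[100, 200]], [[-20, 100]])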
These compressive encoding techniques are efficient but are not optimal for compressing images comprising areas having similar characteristics, such as a homogeneous texture.
In particular, in the H.264/MPEG-4 AVC standard, the spatial prediction of a block in an image relative to another block in this same image is possible only if this other block is a neighbor of the block to be predicted and is located in certain predetermined directions relative to it, i.e. generally above and to the left, in a neighborhood known as a “causal” neighborhood. Similarly, the prediction of the motion vectors of a block of an image is a causal prediction relative to the motion vectors of neighboring blocks.
This type of prediction therefore does not make it possible to take advantage of the textural similarity of blocks in disjoint areas sharing the same texture, or of blocks that are far apart within an area of uniform texture. In other words, this type of technique does not enable blocks possessing common characteristics to be addressed simultaneously as a single entity. Moreover, the motion of homogeneous texture areas from one image to another is not used optimally either: indeed, the temporal prediction according to the H.264/MPEG-4 AVC standard makes it possible to exploit the motion of a block from one image to another, but not the membership of this block in an area having homogeneous motion.
To resolve this problem, certain techniques, known as regional encoding techniques, propose segmenting the images of a video sequence so as to isolate areas of homogeneous motion and texture in these images before encoding them. These areas define objects in these images on which it is chosen, for example, to use a finer encoding or, on the contrary, a coarser one.
However, these regional encoding techniques make it necessary to send, to the decoder that is the destination of the video sequence, a segmentation map computed for each image by the encoder that sends this sequence. This segmentation map is very costly in terms of memory space because its boundaries generally do not correspond to the boundaries of the blocks of pixels of the segmented images. Furthermore, the segmentation of a video sequence into areas of arbitrary shapes is not deterministic: the boundaries of the segmentation map generally do not correspond to the boundaries of the real objects that this map attempts to delimit in the images of the video sequence. For this reason, only the representation and transmission of such segmentation maps have been standardized (MPEG-4 standard, part 2), not their production.
The international patent application No. PCT/FR2009/050278, filed on 20 Feb. 2009 on behalf of the present Applicant, proposes a video compression technique using “clusters of blocks” that resolves some of these drawbacks. More specifically, according to this technique, certain macroblocks of the image sequence are grouped together in clusters when they share a common piece of information, for example a piece of motion information. The common piece of information is then encoded only once for all the macroblocks of the cluster and, at decoding, the macroblocks belonging to the cluster inherit the motion information decoded for the cluster. This technique yields a compression gain by avoiding the encoding of redundant information.
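The principle of grouping macroblocks by shared information can be sketched as follows (an illustrative assumption about the grouping step only, not the actual encoder of the cited application; all names are hypothetical):

```python
def build_clusters(motion_vectors):
    # motion_vectors: dict mapping macroblock index -> (dx, dy).
    # Macroblocks sharing the same motion vector are grouped into one
    # cluster; the shared vector then needs to be encoded only once
    # per cluster instead of once per macroblock.
    clusters = {}
    for mb, mv in motion_vectors.items():
        clusters.setdefault(mv, []).append(mb)
    return clusters

# Toy example: macroblocks 0 and 1 move together, macroblock 2 does not.
clusters = build_clusters({0: (1, 0), 1: (1, 0), 2: (0, 2)})
```

Instead of three motion vectors, only two are encoded here: one per cluster, with each macroblock inheriting its cluster's vector at decoding.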
However, according to this technique, the signaling of clusters in the stream can prove to be costly in terms of bit rate.
Indeed, the clusters may contain any macroblocks of an image or of a group of images.
Furthermore, since certain blocks or sub-blocks of a macroblock may be excluded from a cluster, it is necessary to signal, for each macroblock, whether some of its blocks or sub-blocks are excluded.
There is therefore a need for a new image encoding/decoding technique that resolves at least some of these drawbacks of the prior art and optimizes the signaling in the data stream, and hence the transmission bit rate.