Many video data compression techniques are already known. These include numerous video encoding techniques that use a block-wise representation of the video sequence, such as techniques implementing the video compression standards laid down by the MPEG organization (MPEG-1, MPEG-2, MPEG-4 part 2, etc.) or the ITU-T (H.261 to H.264/AVC). Thus, in the H.264 technique, each image can be divided into slices, which are themselves divided into macroblocks, which are in turn subdivided into blocks. A block is constituted by a set of pixels. According to the H.264 standard, a macroblock is a square block of 16×16 pixels which can be subdivided into blocks sized 8×8, 16×8 or 8×16, the 8×8 blocks then being capable of being subdivided into sub-blocks sized 4×4, 8×4 or 4×8.
According to the prior-art techniques, the macroblocks or the blocks can be encoded by intra-image or inter-image prediction. In other words, a macroblock or block can be encoded by:
    a temporal prediction, i.e. with reference to a reference block or macroblock belonging to one or more other images; and/or
    a spatial prediction as a function of blocks or macroblocks of the current image.
In the case of a spatial prediction, a current block can be predicted only from previously encoded blocks, by directional extrapolation of the encoded-decoded texture values of the neighboring blocks. These blocks are said to belong to the “causal neighborhood” of the current block, which comprises the blocks situated before the current block in a predetermined direction of scanning of the blocks in the image.
Thus, to predict a macroblock or a block on the basis of its causal neighborhood, nine modes of intra-prediction are used according to the ITU-T H.264 standard. These modes of prediction comprise eight modes corresponding to a given orientation for copying the pixels on the basis of the previously encoded-decoded neighboring blocks (the vertical, horizontal, diagonal down left, diagonal down right, vertical right, vertical left, horizontal up and horizontal down orientations) and one mode corresponding to the average of the pixels adjacent to the block from the neighboring blocks.
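By way of illustration only, three of the nine modes described above (vertical, horizontal and average) can be sketched as follows for a 4×4 block. The function name predict_intra_4x4 and its arguments are hypothetical and chosen for this example; the six remaining directional modes and the handling of unavailable neighbors are deliberately left out.

```python
def predict_intra_4x4(top, left, mode):
    """Return a 4x4 predicted block as a list of rows.

    top  -- the 4 encoded-decoded pixels directly above the block
    left -- the 4 encoded-decoded pixels directly to its left
    (hypothetical interface, assumed for this sketch)
    """
    n = 4
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(n)]
    if mode == "horizontal":    # copy the left column rightwards
        return [[left[r]] * n for r in range(n)]
    if mode == "dc":            # rounded average of the 8 adjacent pixels
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError("mode not covered by this sketch: " + mode)
```

In each case the prediction depends only on pixels of the causal neighborhood, so a decoder can form the same prediction as the encoder.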
Unfortunately, spatial prediction is insufficient on its own, hence the need to encode a prediction error. Thus, for each block, a residual block is encoded. This residual block, also called a prediction residue, corresponds to the original block minus its prediction. To this end, the coefficients of this block are quantized after a possible transform (for example a DCT or discrete cosine transform), and then encoded by an entropy encoder. At the encoder, the mode of prediction chosen is the one that gives the best compromise between bit rate and distortion.
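The chain described above (residue, quantization, choice of mode) can be sketched as follows, under simplifying assumptions that are not taken from the standard: the transform is skipped, the quantizer is a plain uniform quantizer with step `step`, the bit rate is approximated by the number of non-zero quantized values, and the compromise is expressed as a Lagrangian cost D + lam × R. All function names are hypothetical.

```python
def residual(original, prediction):
    # Prediction residue: original block minus its prediction.
    return [[o - p for o, p in zip(ro, rp)]
            for ro, rp in zip(original, prediction)]

def quantize(block, step):
    # Uniform quantizer, rounding to the nearest level (assumption of the sketch).
    def q(v):
        level = (abs(v) + step // 2) // step
        return level if v >= 0 else -level
    return [[q(v) for v in row] for row in block]

def reconstruct(prediction, levels, step):
    # Decoder side: prediction plus the dequantized residue.
    return [[p + l * step for p, l in zip(rp, rl)]
            for rp, rl in zip(prediction, levels)]

def rd_cost(original, prediction, step, lam):
    levels = quantize(residual(original, prediction), step)
    recon = reconstruct(prediction, levels, step)
    # Distortion: sum of squared differences between original and rebuilt block.
    distortion = sum((o - r) ** 2
                     for ro, rr in zip(original, recon)
                     for o, r in zip(ro, rr))
    # Rate: crude proxy, the count of non-zero levels (not an entropy coder).
    rate = sum(1 for row in levels for l in row if l != 0)
    return distortion + lam * rate

def choose_mode(original, candidates, step, lam):
    # candidates maps a mode name to its predicted block.
    return min(candidates,
               key=lambda m: rd_cost(original, candidates[m], step, lam))
```

With a large `lam` the choice favors predictions that leave few coefficients to encode; with `lam` equal to zero it favors the lowest distortion alone.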
Thus, in intra-encoding mode, the values of texture of the current block are predicted on the basis of the encoded-decoded values of texture of the neighboring blocks, and a prediction residue is then added to this prediction.
In the case of a temporal prediction, the ITU-T H.264 standard uses a displacement representing the motion (motion compensation) to predict a block or a macroblock from its temporal neighborhood. The corresponding motion vector is then encoded and transmitted.
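As a purely illustrative sketch of how such a motion vector might be obtained, the following exhaustive block-matching search compares the current block with displaced blocks of a reference image and keeps the displacement giving the smallest sum of absolute differences. The search strategy, the cost measure and the function names are assumptions made for this example; the standard specifies how the vector is decoded, not how the encoder finds it.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    # Sum of absolute differences between two n x n blocks.
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))

def motion_search(cur, ref, cx, cy, n, search):
    """Return (dx, dy, cost) for the n x n block of `cur` at (cx, cy).

    Every integer displacement within +/- `search` pixels is tested,
    provided the displaced block lies entirely inside `ref`.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                cost = sad(cur, ref, cx, cy, rx, ry, n)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best
```

The pair (dx, dy) plays the role of the motion vector that is encoded and transmitted, and the displaced reference block serves as the prediction.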
Alternative methods of intra-prediction have been proposed recently, based especially on correlations between neighboring pixels.
For example, the article by Tan et al., “Intra Prediction by Template Matching”, presents a method of intra-prediction of this kind based on a texture synthesis technique, also called “template matching”. A simplified scheme of this technique is illustrated in FIG. 1.
This technique is used to synthesize a pixel p (or group of pixels) in a target zone C of the image, taking into account a source zone S of the same image. It is based on the correlations between neighboring pixels. Thus, the value of each pixel p of the target zone C is determined by comparing (11) the pixels N(p) belonging to the causal neighborhood of the pixel p, which define a “template” or “mask” of the pixel p, with all the neighborhoods of the source zone S. The mask N(p) therefore consists of pixels that have previously been encoded/decoded, and this prevents the propagation of errors.
If a region of the source zone S similar to the template defined by N(p) is found, then the pixel (or group of pixels) of the source zone S having the most similar neighborhood is allocated (12) to the pixel p of the target zone C.
Classically, the region closest to the template in the source image, corresponding to a region similar to the template, is chosen by criteria of mean squared error minimization or absolute error minimization using the following formula:
$$N(q') = \arg\min_{N(q) \in S} d\big(N(p), N(q)\big)$$
with:
    q being a pixel of the source zone S;
    d being a function measuring the distance between two masks according to the minimization criterion chosen.
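A minimal sketch of this search, for a single pixel, is given below. The shape of the mask (the left, top-left, top and top-right neighbors), the use of the sum of squared errors as the distance d, and the function names are assumptions chosen for the example; the source zone S is represented simply as a list of candidate positions q whose mask lies entirely within already known pixels.

```python
# Causal mask N(.), as (dx, dy) offsets: left, top-left, top, top-right.
CAUSAL_OFFSETS = [(-1, 0), (-1, -1), (0, -1), (1, -1)]

def template(img, x, y):
    # Values of the mask N(x, y); the caller guarantees they are all known.
    return [img[y + dy][x + dx] for dx, dy in CAUSAL_OFFSETS]

def ssd(a, b):
    # Distance d between two masks: sum of squared errors.
    return sum((u - v) ** 2 for u, v in zip(a, b))

def synthesize_pixel(img, p, source):
    """Return the value assigned to pixel p = (x, y) of the target zone.

    `source` lists the candidate positions q of the source zone S.
    The candidate q' whose mask N(q') is closest to N(p) is selected
    and its pixel value is returned.
    """
    target = template(img, p[0], p[1])
    qx, qy = min(source,
                 key=lambda q: ssd(target, template(img, q[0], q[1])))
    return img[qy][qx]
```

For a group of pixels, the same comparison would be made once for the group and the whole corresponding group of the source zone would be copied.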
In this technique, the pixels are synthesized in a fixed and predetermined order, generally from top to bottom and from left to right (i.e. in the raster scan order).
The classic technique of template matching is used to synthesize textures constituted by random structures.
However, it is not very efficient when it is applied to natural images formed by combined textures and contours.
The template matching technique has also been extended to the encoding of blocks of an H.264 encoder by Wang et al, as described in “Priority-based template matching intra prediction”. A macroblock can thus be encoded/decoded using blocks (formed by 4×4 pixels according to the H.264 standard) rebuilt by the “template matching” technique.
As illustrated in FIG. 2, in the application by Wang et al, the blocks are classically processed in a raster scan order, i.e. a left to right and top to bottom order of scanning the blocks in the macroblock. Consequently, the macroblock is encoded/decoded (or rebuilt) block by block. Processing the blocks one after the other enables the pixels of a previously encoded/decoded block to be used as the “source zone” for a current block, according to the scanning order.
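The scanning order described above, and the set of blocks available as source zone at each step, can be sketched as follows for a 16×16 macroblock divided into 4×4 blocks. The function names are hypothetical, and only blocks inside the current macroblock are listed; previously rebuilt macroblocks would also belong to the source zone.

```python
def raster_scan_blocks(mb_size=16, block_size=4):
    # Top-left corner (x, y) of each block, left to right, then top to bottom.
    for by in range(0, mb_size, block_size):
        for bx in range(0, mb_size, block_size):
            yield (bx, by)

def available_sources(index, mb_size=16, block_size=4):
    # Blocks of the macroblock already rebuilt when block `index` is processed.
    order = list(raster_scan_blocks(mb_size, block_size))
    return order[:index]
```

For example, when the sixth block (index 5) is processed, the five blocks preceding it in the scan are available, whereas the block immediately to its right is not.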
Unfortunately, this technique is not optimal because it forces the rebuilding of the macroblock to be done on the basis of 4×4-pixel blocks.
There is therefore a need for new image encoding/decoding techniques implementing a “template matching” type of synthesis technique, enabling these prior-art techniques to be improved.