In the case of the widely used hybrid video encoder, two main classes of prediction (or encoding) can be distinguished: intra-prediction methods and inter-prediction methods. Intra prediction uses the correlation between pixels belonging to the same spatial neighborhood (i.e. within the same image, the intra image), while inter prediction uses the correlation between pixels belonging to the same temporal neighborhood (i.e. between several temporally neighboring images, known as inter images). Each image is processed sequentially and generally in blocks (processing by blocks is a non-limiting example, and the present description holds true for any image portion whatsoever). For each block, after intra or inter prediction of the source samples, the residual samples (derived from the difference between the source samples and the predicted samples) are successively transformed, quantized and encoded by means of entropy encoding.
FIG. 1 presents a generic block diagram of an example of a known hybrid video encoder (i.e. forming part of the prior art). The dashed arrows referenced 11 and 12 represent the optimization criteria (on the one hand “R=RT” and on the other hand “MIN(D(s,ŝ))”, as described in detail further below) used by a decision module (also called a “decision-making module”) 10. The dashed arrows referenced DS0 to DS6 represent the output decisions sent to the different encoding modules controlled by the decision module 10. Each encoding module controlled by the decision module is numbered 0 to 6, and the associated output decision is indexed by the same number, preceded by “DS”. The arrows referenced 14 and 15 represent secondary pieces of information provided to the different encoding modules controlled by the decision module 10.
For a sequence of images to be encoded, each source image portion (current image portion to be encoded, in a current image to be encoded) passes through all the encoding modules represented. The source samples s are first predicted by means of a prediction circuit 600 (comprising in this example an intra prediction module 4, an inter prediction module 5 and a selection module 6). The result of the prediction is a set of predicted samples p, which is subtracted from the source samples (by a subtraction module 16), leading to residual samples r. The residual samples are successively transformed (by a transformation module 0), quantized (by a quantization module 1), encoded by means of entropy encoding (by an entropy encoding module 2), then transmitted to the decoder. The decoding loop is reproduced at the encoder in order to reconstruct the encoded samples as they are available to the decoder, and in order to use them as a reference for the prediction. Thus, the quantized transformed residual samples c̃ (at the output of the quantization module 1) are successively de-quantized (by an inverse quantization module 7), inverse-transformed (by an inverse transformation module 8), summed with the predicted samples (by an addition module 13) and if necessary post-filtered (by a loop filtering module 3). The resulting reconstructed samples ŝ are then stored in memory (buffer memory 9) to be used during the prediction step.
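The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the encoder of FIG. 1: it assumes a 1-D block of four samples, a hand-written orthonormal DCT as a stand-in for the transform, uniform scalar quantization with a hypothetical step `qstep`, and it omits entropy encoding, loop filtering and the buffer memory. All function and variable names are illustrative.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D list (stand-in for transformation module 0)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of dct() (stand-in for inverse transformation module 8)."""
    n = len(c)
    out = []
    for i in range(n):
        v = c[0] * math.sqrt(1.0 / n)
        v += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(v)
    return out

def encode_block(s, p, qstep):
    """One pass of the hybrid loop for source samples s and predicted samples p."""
    r = [si - pi for si, pi in zip(s, p)]           # subtraction module 16
    c = dct(r)                                      # transformation module 0
    levels = [int(round(ck / qstep)) for ck in c]   # quantization module 1 (lossy)
    # Entropy encoding (module 2) would turn `levels` into bits; omitted here.
    c_hat = [lv * qstep for lv in levels]           # inverse quantization module 7
    r_hat = idct(c_hat)                             # inverse transformation module 8
    s_hat = [pi + ri for pi, ri in zip(p, r_hat)]   # addition module 13
    return levels, s_hat

source = [52.0, 55.0, 61.0, 66.0]
predicted = [54.0] * 4  # hypothetical flat prediction, for illustration only
levels, reconstructed = encode_block(source, predicted, qstep=4.0)
```

Because the quantization step is the only lossy operation in this sketch, the reconstructed samples approach the source samples as `qstep` shrinks, and drift away from them as it grows.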
During the compression process, the decision module 10 takes a decision for at least one of the encoding modules 0 to 6 passed through. Taking a decision consists of selecting at least one encoding parameter (also called an “encoding mode”) from amongst a set of encoding parameters available for the encoding module considered. In the case of the transformation module 0, the decision module can be used to select for example the size or the type of transform applied to the residual samples; in the case of the quantization module 1, to select for example the quantization parameters (for example quantization step size, threshold, dead zone, matrices, etc.); in the case of the entropy encoding module 2, to select the type of entropy encoding; in the case of the loop filtering module 3, to select for example the loop filtering parameters (for example, the strength of the deblocking filter, or the offsets and classification in the “sample adaptive offset” filter of HEVC/H.265, etc.); in the case of the intra prediction module 4, to select for example the intra prediction mode (for example “DC”, “planar”, different angular prediction modes, etc.); in the case of the inter prediction module 5, to select for example the motion vectors and/or motion predictors; in the case of the selection module 6, to select for example the prediction mode (for example intra/inter prediction mode) and/or the size of the prediction unit.
In short, for each block (or image portion) of each image of a video sequence, numerous decisions (selections of encoding parameters) must be made by the video encoder in order to optimize the encoding efficiency for this video sequence. The decision model used to optimize these decisions (selections of encoding parameters) is left to the encoder's discretion (it is not standardized).
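To give a rough sense of how quickly these decisions multiply, the sketch below lists one hypothetical candidate set per encoding module 0 to 6 and counts the joint combinations for a single block. The candidate values are illustrative assumptions only, and the count is an upper bound that ignores the fact that some choices exclude others (for example, motion vectors are irrelevant once intra prediction is selected).

```python
from math import prod

# Illustrative candidate sets only, keyed by encoding module number (0 to 6).
# Real encoders typically expose many more options per module.
candidate_parameters = {
    0: ("transform size", [4, 8, 16, 32]),
    1: ("quantization parameter", list(range(20, 31))),
    2: ("entropy encoding type", ["type_a", "type_b"]),
    3: ("deblocking strength", [0, 1, 2]),
    4: ("intra prediction mode", ["DC", "planar", "angular_10", "angular_26"]),
    5: ("motion vector", [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)]),
    6: ("prediction mode", ["intra", "inter"]),
}

# An exhaustive joint search would evaluate every combination for every block.
combinations_per_block = prod(
    len(values) for _, values in candidate_parameters.values()
)
```

Even with these small hypothetical sets the product reaches tens of thousands of combinations per block, which is one reason a decision model is needed rather than a blind exhaustive search.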
A prior-art decision model, used by the decision module 10 and implemented in the majority of video encoders, is based on optimizing the rate-distortion trade-off. The goal of such a decision model is to minimize the distortion D under the constraint of a target rate RT, where the distortion is a measurement of distance between the source video samples, s, and the reconstructed video samples, ŝ: MIN(D(s,ŝ)), under the constraint R=RT
Various known methods are used to solve this classic problem of optimization under constraint. For example, if the Lagrange multiplier method is used, this problem can be rewritten in the form of a single minimization, such that:
for each block (or image portion) of each image, for at least one encoding module and one set of encoding parameters, M = {mi}, i ∈ {0, ..., K−1}, K ≥ 2, associated with this encoding module, a search is made for the optimal encoding parameter mopt that minimizes:
\[
\left\{
\begin{array}{l}
m_{opt} = \arg\min_{m \in M} \left( J_m \right) \\
J_m = D(s, \hat{s}_m) + \lambda \times R(\tilde{c}_m, \rho_m)
\end{array}
\right.
\qquad (1)
\]
Where:
Jm is the Lagrangian cost for the encoding parameter m;
D(s,ŝm) is a measurement of distortion between the source samples, s, and the reconstructed samples, ŝm, for the encoding parameter m;
R(c̃m,ρm) is an estimation of the cost in bits of the quantized transformed residual samples, c̃m, and of the secondary information, ρm, to be transmitted to the decoder for the encoding parameter m;
λ is the Lagrange multiplier defining a trade-off between distortion and rate.
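The minimization of equation (1) can be sketched as follows. This is a minimal illustration, assuming that the distortion and rate of each candidate encoding parameter have already been measured or estimated. The mode names and the numeric values are hypothetical, and `lagrangian_decision` is an illustrative helper, not a function of any particular encoder.

```python
def lagrangian_decision(candidates, lam):
    """Return (m_opt, J_opt) minimizing J_m = D_m + lam * R_m, as in equation (1).

    `candidates` maps each encoding parameter m to a (distortion, rate_bits)
    pair assumed to have been measured or estimated beforehand.
    """
    costs = {m: d + lam * r for m, (d, r) in candidates.items()}
    m_opt = min(costs, key=costs.get)
    return m_opt, costs[m_opt]

# Hypothetical measurements for one block: mode -> (D, R in bits).
candidates = {
    "intra_dc": (120.0, 18.0),
    "intra_planar": (95.0, 26.0),
    "inter_mv(1,0)": (40.0, 52.0),
}

# A large multiplier penalizes rate, so the cheapest mode in bits wins.
low_rate_choice, _ = lagrangian_decision(candidates, lam=5.0)
# A small multiplier favors low distortion, so the most accurate mode wins.
high_quality_choice, _ = lagrangian_decision(candidates, lam=0.5)
```

With these hypothetical numbers the selected mode changes with λ alone, which shows how the multiplier sets the trade-off between distortion and rate. Note that the cost depends only on the distance to the source samples and on the rate.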
It will be understood from equation (1) that, for a target rate RT and an encoding parameter m, and whatever the type of distortion measurement D used (including any metric with perceptual properties, for example one based on a measurement of the structural similarity of the image), the optimization process steers the decisions solely toward remaining as close as possible to the source samples. Perceptually, for most video content, because of the quantization step (i.e. a step with loss of data), this optimization criterion, based only on minimizing the distance to the characteristics of the source without any additional constraint, is insufficient.
Indeed, for source blocks belonging to image zones with similar characteristics, i.e. zones having a high spatial and/or temporal correlation, the compression process based on the preceding decision model can lead to reconstructed blocks with different characteristics and with a significant loss of correlation between spatially and/or temporally neighboring reconstructed blocks. The samples are thus reconstructed in an irregular, non-smooth manner, spatially and/or temporally. This phenomenon is visually very unpleasant, in particular temporally, over several successive images, where flickering artifacts or other artifacts caused by a loss of temporal consistency appear. It is explained by the combination of the successive choice of different encoding parameters (with different properties) and the loss of residual information resulting from the quantization.
The prevention of this visual artifact has motivated the present invention described here below.