In the current video standards (up to the MPEG-4 video coding standard and the H.264 recommendation), the video, described in terms of one luminance channel and two chrominance channels, can be compressed thanks to two coding modes applied to each channel: the “intra” mode, which exploits in a given channel the spatial redundancy of the pixels (picture elements) within each image, and the “inter” mode, which exploits the temporal redundancy between separate images (or frames). The inter mode, relying on a motion compensation operation, allows an image to be described from one (or more) previously decoded image(s) by encoding the motion of the pixels from one (or more) image(s) to another one. Usually, the current image to be coded is partitioned into independent blocks (for instance, of size 8×8 or 16×16 pixels in MPEG-4, or of size 4×4, 4×8, 8×4, 8×8, 8×16, 16×8 and 16×16 in H.264), each of them being assigned a motion vector (the three channels share such a motion description). A prediction of said image can then be constructed by displacing pixel blocks from a reference image according to the set of motion vectors associated with the blocks. Finally, the difference, or residual signal, between the current image to be encoded and its motion-compensated prediction can be encoded in the intra mode (with 8×8 discrete cosine transforms, or DCTs, for MPEG-4, or 4×4 DCTs for H.264 in the main level profile).
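The block-based prediction and residual computation described above can be illustrated by the following minimal sketch. It is not taken from any standard: the function names, the 4×4 block size, the toy 8×8 images and the border-clamping rule are all assumptions made for the illustration, and a single channel is processed.

```python
def motion_compensate(reference, vectors, block=4):
    """Build a prediction by copying displaced pixels from `reference`.

    `vectors[(by, bx)]` is the (dy, dx) displacement, in pixels, assigned to
    the block in block-row `by` and block-column `bx`. Source coordinates that
    fall outside the reference are clamped to the nearest border pixel (an
    assumption of this sketch; real codecs define their own padding rules).
    """
    height, width = len(reference), len(reference[0])
    prediction = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dy, dx = vectors.get((y // block, x // block), (0, 0))
            sy = min(max(y + dy, 0), height - 1)
            sx = min(max(x + dx, 0), width - 1)
            prediction[y][x] = reference[sy][sx]
    return prediction


def residual(current, prediction):
    """Pixel-wise difference between the current image and its prediction."""
    return [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, prediction)]


# Toy example: the current frame is the reference shifted right by one pixel.
reference = [[(3 * y + 7 * x) % 256 for x in range(8)] for y in range(8)]
current = [[reference[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

# Every 4x4 block gets the vector (dy=0, dx=-1): each predicted pixel is
# fetched from one pixel to the left in the reference image.
vectors = {(by, bx): (0, -1) for by in range(2) for bx in range(2)}
prediction = motion_compensate(reference, vectors, block=4)
res = residual(current, prediction)
```

In this toy case the motion model is exact, so the residual is zero everywhere; with real video content the residual is generally non-zero and is the signal passed on to the transform coding stage.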
The DCT is probably the most widely used transform, because it offers good compression efficiency in a wide variety of coding situations, especially at medium and high bitrates. However, at low bitrates, the hybrid motion-compensated DCT structure may not be able to deliver an artifact-free sequence, for two reasons. First, the structure of the motion-compensated inter prediction grid becomes visible, with blocking artifacts. Moreover, the block edges of the DCT basis functions become visible in the image grid, because too few coefficients are quantized, and too coarsely, to make up for these blocking artifacts and to reconstruct smooth objects in the image.
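The second effect can be illustrated in one dimension by the following sketch, given purely as an assumption-laden toy case: a smooth ramp of 16 samples is coded as two independent 8-sample blocks with an orthonormal DCT and a uniform quantizer. The function names, the signal and the quantization steps are hypothetical and do not come from any standard.

```python
import math


def dct(block):
    """Orthonormal DCT-II of a 1D block."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, v in enumerate(block)))
    return out


def idct(coeffs):
    """Inverse of `dct` (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def code_blocks(signal, step, size=8):
    """Transform, uniformly quantize and reconstruct each block independently."""
    decoded = []
    for start in range(0, len(signal), size):
        coeffs = dct(signal[start:start + size])
        quantized = [step * round(c / step) for c in coeffs]
        decoded.extend(idct(quantized))
    return decoded


ramp = [float(i) for i in range(16)]    # smooth signal, no edge at sample 8
fine = code_blocks(ramp, step=0.01)     # high-bitrate regime
coarse = code_blocks(ramp, step=16)     # low-bitrate regime

jump_fine = fine[8] - fine[7]           # step across the block boundary
jump_coarse = coarse[8] - coarse[7]
inner_coarse = coarse[7] - coarse[6]    # step inside the first block
```

With the coarse step, only the DC coefficient of each block survives quantization, so each block is reconstructed as a constant and the whole variation of the ramp is concentrated at the block boundary, which is the one-dimensional analogue of a visible block edge.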
The document “Very low bit-rate video coding based on matching pursuits”, R. Neff and A. Zakhor, IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, no. 1, February 1997, pp. 158-171, describes a new motion-compensated system including a video compression algorithm based on the so-called matching pursuit (MP) algorithm, a technique developed about ten years ago (see the document “Matching pursuits with time-frequency dictionaries”, S. G. Mallat and Z. Zhang, IEEE Transactions on Signal Processing, vol. 41, no. 12, December 1993, pp. 3397-3414). Said technique provides a way to iteratively decompose any function or signal (for example, image, video, ...) into a linear expansion of waveforms belonging to a redundant dictionary of basis functions, well localized both in time and frequency and called atoms. A general family of time-frequency atoms can be created by scaling, translating and modulating a single function $g(t) \in L^2(\mathbb{R})$ supposed to be real and continuously differentiable. These dictionary functions may be designated by:
$$g_\gamma(t) \in G \quad (G = \text{dictionary set}), \tag{1}$$
$\gamma$ (gamma) being an indexing parameter associated to each particular dictionary element (or atom). As described in the first cited document, assuming that the functions $g_\gamma(t)$ have unit norm, i.e. $\langle g_\gamma(t), g_\gamma(t) \rangle = 1$, the decomposition of a one-dimensional time signal $f(t)$ begins by choosing $\gamma$ to maximize the absolute value of the following inner product:
$$p = \langle f(t), g_\gamma(t) \rangle, \tag{2}$$
where $p$ is called an expansion coefficient for the signal $f(t)$ onto the dictionary function $g_\gamma(t)$. A residual signal $R$ is then computed:
$$R(t) = f(t) - p \cdot g_\gamma(t) \tag{3}$$
and this residual signal is expanded in the same way as the original signal $f(t)$. An atom is, in fact, the name given to each pair $(\gamma_k, p_k)$, where $k$ is the rank of the iteration in the matching pursuit procedure.
After a total of $M$ stages of this iterative procedure (where each stage $n$ yields a dictionary structure specified by $\gamma_n$, an expansion coefficient $p_n$ and a residual $R_n$ which is passed on to the next stage), the original signal $f(t)$ can be approximated by a signal $\hat{f}(t)$ which is a linear combination of the dictionary elements thus obtained. The iterative procedure is stopped when a predefined condition is met, for example when either a set number of expansion coefficients has been generated or some energy threshold for the residual has been reached.
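The iterative procedure described above (selection of the best-matching dictionary function, computation of the expansion coefficient, update of the residual, and the two stopping conditions) can be sketched as follows for a discrete one-dimensional signal. The function names, the seven-element toy dictionary and the test signal are assumptions chosen for the illustration; they are not the dictionary of the cited documents.

```python
import math


def inner(u, v):
    """Inner product of two discrete signals of equal length."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit norm, as required for dictionary functions."""
    norm = math.sqrt(inner(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, dictionary, max_atoms, energy_threshold=0.0):
    """Greedy decomposition of `signal` over unit-norm dictionary functions.

    Returns the list of atoms (gamma_n, p_n) and the final residual. The loop
    stops after `max_atoms` stages or once the residual energy is no longer
    above `energy_threshold`.
    """
    res = list(signal)
    atoms = []
    while len(atoms) < max_atoms and inner(res, res) > energy_threshold:
        # Choose gamma maximizing |<R, g_gamma>|, as in equation (2).
        products = [inner(res, g) for g in dictionary]
        gamma = max(range(len(dictionary)), key=lambda idx: abs(products[idx]))
        p = products[gamma]
        # Remove the selected contribution from the residual, as in equation (3).
        res = [r - p * g for r, g in zip(res, dictionary[gamma])]
        atoms.append((gamma, p))
    return atoms, res


def reconstruct(atoms, dictionary, length):
    """Linear combination of the selected dictionary functions."""
    approx = [0.0] * length
    for gamma, p in atoms:
        approx = [a + p * g for a, g in zip(approx, dictionary[gamma])]
    return approx


# Redundant toy dictionary in dimension 4: four impulses plus three wider shapes.
impulses = [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]
extras = [normalize(v) for v in ([1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1])]
dictionary = impulses + extras

signal = [3.0, 3.0, 0.0, 1.0]
atoms, final_residual = matching_pursuit(signal, dictionary, max_atoms=5,
                                         energy_threshold=1e-12)
approximation = reconstruct(atoms, dictionary, len(signal))
```

Because the dictionary is redundant, the signal is captured here with two atoms (the two-sample plateau, then one impulse), whereas an expansion on the four impulses alone would need three coefficients.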
In the first document mentioned above, which describes a system based on said MP algorithm that performs better than DCT-based ones at low bitrates, original images are first motion-compensated, using a tool called overlapped block-motion compensation which avoids or reduces blocking artifacts by blending the boundaries of predicted/displaced blocks (the edges of the blocks are therefore smoothed and the block grid is less visible). After the motion prediction image is formed, it is subtracted from the original one, in order to produce the motion residual. Said residual is then coded, using the MP algorithm extended to the discrete two-dimensional (2D) domain, with a proper choice of a basis dictionary (said dictionary consists of an overcomplete collection of 2D separable Gabor functions g, shown in FIG. 1).
A residual signal f is then reconstructed by means of a linear combination of M dictionary elements:
$$\hat{f} = \sum_{n=1}^{M} \hat{p}_n \cdot g_{\gamma_n} \tag{4}$$
If the dictionary basis functions have unit norm, $\hat{p}_n$ is the quantized inner product $\langle \cdot , \cdot \rangle$ between the basis function $g_{\gamma_n}$ and the residual updated iteratively, that is to say:
$$p_n = \left\langle f - \sum_{k=1}^{n-1} \hat{p}_k \cdot g_{\gamma_k},\; g_{\gamma_n} \right\rangle \tag{5}$$
the pairs $(\hat{p}_n, \gamma_n)$ being the atoms. In the work described by the authors of the document, no restriction is placed on the possible location of an atom in an image (see FIG. 2). The 2D Gabor functions forming the dictionary set are defined in terms of a prototype Gaussian window:
$$w(t) = \sqrt{2}\, e^{-\pi t^2} \tag{6}$$
A mono-dimensional (1D) discrete Gabor function is defined as a scaled, modulated Gaussian window:
$$g_{\vec{\alpha}}(i) = K_{\vec{\alpha}} \cdot w\!\left(\frac{i - \frac{N}{2} + 1}{s}\right) \cdot \cos\!\left(\frac{2\pi\xi\left(i - \frac{N}{2} + 1\right)}{N} + \phi\right), \quad \text{with } i \in \{0, 1, \ldots, N-1\}. \tag{7}$$
The constant $K_{\vec{\alpha}}$ is chosen so that $g_{\vec{\alpha}}(i)$ is of unit norm, and $\vec{\alpha} = (s, \xi, \phi)$ is a triple consisting, respectively, of a positive scale, a modulation frequency, and a phase shift. If $S$ is the set of all such triples $\vec{\alpha}$, then the 2D separable Gabor functions of the dictionary have the following form:
$$G_{\vec{\alpha},\vec{\beta}}(i,j) = g_{\vec{\alpha}}(i)\, g_{\vec{\beta}}(j) \quad \text{for } i, j \in \{0, 1, \ldots, N-1\}, \text{ and } \vec{\alpha}, \vec{\beta} \in S \tag{8}$$
The set of available dictionary triples and associated sizes (in pixels) indicated in the document as forming the 1D basis set (or dictionary) is shown in the following table 1:
TABLE 1

  k     s_k     ξ_k    φ_k    size (pixels)
  0      1.0    0.0    0        1
  1      3.0    0.0    0        5
  2      5.0    0.0    0        9
  3      7.0    0.0    0       11
  4      9.0    0.0    0       15
  5     12.0    0.0    0       21
  6     14.0    0.0    0       23
  7     17.0    0.0    0       29
  8     20.0    0.0    0       35
  9      1.4    1.0    π/2      3
 10      5.0    1.0    π/2      9
 11     12.0    1.0    π/2     21
 12     16.0    1.0    π/2     27
 13     20.0    1.0    π/2     35
 14      4.0    2.0    0        7
 15      4.0    3.0    0        7
 16      8.0    3.0    0       13
 17      4.0    4.0    0        7
 18      4.0    2.0    π/4      7
 19      4.0    4.0    π/4      7

To obtain this parameter set, a training set of motion residual images was decomposed using a dictionary derived from a much larger set of parameter triples. The dictionary elements which were most often matched to the training images were retained in the reduced set. The obtained dictionary was specifically designed so that atoms can freely match the structure of motion residual image when their influence is not confined to the boundaries of the block they lie in (see FIG. 2, showing the example of an atom placed in a block-divided image without block-restrictions).
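By way of illustration, the 1D basis set of Table 1 and the separable 2D functions of equation (8) may be generated as in the following sketch. It rests on several assumptions that are not stated in the text: N in equation (7) is taken to be the support size listed in the last column of Table 1 for each triple, the window of equation (6) is taken as a Gaussian in πt², and a 2D function built from two triples of different sizes is simply given a rectangular support. The function names are hypothetical.

```python
import math

HALF_PI = math.pi / 2
QUARTER_PI = math.pi / 4

# (s_k, xi_k, phi_k, size_k) for k = 0..19, transcribed from Table 1.
TABLE_1 = [
    (1.0, 0.0, 0.0, 1), (3.0, 0.0, 0.0, 5), (5.0, 0.0, 0.0, 9),
    (7.0, 0.0, 0.0, 11), (9.0, 0.0, 0.0, 15), (12.0, 0.0, 0.0, 21),
    (14.0, 0.0, 0.0, 23), (17.0, 0.0, 0.0, 29), (20.0, 0.0, 0.0, 35),
    (1.4, 1.0, HALF_PI, 3), (5.0, 1.0, HALF_PI, 9), (12.0, 1.0, HALF_PI, 21),
    (16.0, 1.0, HALF_PI, 27), (20.0, 1.0, HALF_PI, 35),
    (4.0, 2.0, 0.0, 7), (4.0, 3.0, 0.0, 7), (8.0, 3.0, 0.0, 13),
    (4.0, 4.0, 0.0, 7), (4.0, 2.0, QUARTER_PI, 7), (4.0, 4.0, QUARTER_PI, 7),
]


def window(t):
    """Prototype Gaussian window; its constant factor is absorbed by K below."""
    return math.sqrt(2.0) * math.exp(-math.pi * t * t)


def gabor_1d(s, xi, phi, n):
    """Unit-norm discrete Gabor function following equation (7), i = 0..n-1.

    Assumption of this sketch: N is the support size `n` of the function.
    """
    raw = []
    for i in range(n):
        centered = i - n / 2 + 1
        raw.append(window(centered / s)
                   * math.cos(2.0 * math.pi * xi * centered / n + phi))
    norm = math.sqrt(sum(v * v for v in raw))   # K equals 1 / norm
    return [v / norm for v in raw]


def gabor_2d(alpha, beta):
    """Separable 2D function G(i, j) = g_alpha(i) * g_beta(j), as in equation (8)."""
    gi = gabor_1d(*alpha)
    gj = gabor_1d(*beta)
    return [[a * b for b in gj] for a in gi]


basis_1d = [gabor_1d(*row) for row in TABLE_1]
example_2d = gabor_2d(TABLE_1[2], TABLE_1[14])   # 9 rows by 7 columns
```

Since each 1D function has unit norm, every separable 2D product also has unit norm, so the 20 triples yield 20 × 20 = 400 unit-norm 2D functions that can each be placed at any pixel location.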
However, the approach described in the cited document suffers from several limitations. The first one is related to the continuous structure of the Gabor dictionary. Because atoms can be placed at all pixel locations without any restriction and can therefore span several motion-compensated blocks, the MP algorithm cannot represent blocking artifacts in the residual signal with a limited number of smooth atoms. This is why some kind of overlapped motion estimation is necessary, in order to limit the blocking artifacts. Second, if a classical block-based motion compensation (i.e. without overlapping windows) is used, the smooth basis functions may not be appropriate to make up for blocking artifacts (indeed, it has recently been shown that coding gains can be obtained when the size of the residual coding transform is matched to the size of the motion-compensated block). Third, it is difficult to combine intra and inter blocks in a coded frame (in the cited document, no DCT intra macroblock exists, probably in order to avoid discontinuities on the boundaries of blocks coded in intra and inter mode that would be badly modelled by the smooth structure of Gabor basis functions).