Video encoders and decoders that use predictive block-based encoding techniques, such as MPEG-2 or H.264, rely on a recursive use of motion estimation/compensation in order to reduce the amount of information to be transmitted.
FIG. 1 shows a conventional video decoder according to these encoding techniques. Such a conventional video decoder is described for example in “MPEG video encoding: a basic tutorial introduction”, BBC Research and Development Report, by S. R. Ely 1996/3.
Said video decoder (100) comprises a decoding unit (10) for decoding an encoded data stream ES corresponding to a sequence of encoded pictures. In the MPEG standard, three types of pictures are considered: I (or intra) pictures, encoded without any reference to other pictures; P (or predicted) pictures, encoded with reference to a past picture (I or P); and B (or bidirectionally predicted) pictures, encoded with reference to a past and a future picture (I or P) in display order. The I and P pictures will be hereinafter referred to as reference pictures. Moreover, each picture of an MPEG sequence is subdivided into motion compensation areas called macroblocks.
The decoding unit according to the prior art includes:
    a parser (12), for analysing the encoded data stream,
    a macroblock processing unit MBPU (13), for computing motion vectors V(n) and variable length decoded data,
    an inverse quantizing and inverse discrete cosine transform IQ/IDCT circuit (15), for delivering residual error data R′(n) from the variable length decoded data,
    a motion compensation circuit MC (14), for delivering motion compensated data using the motion vector V(n),
    a reconstruction circuit REC (16), for reconstructing pictures from a sum of motion compensated data and residual error data.
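The roles of the motion compensation circuit MC and the reconstruction circuit REC can be illustrated with a minimal sketch. It assumes integer-pixel motion vectors, 8-bit samples and pictures held as lists of rows; the function names `motion_compensate` and `reconstruct` are hypothetical and are not taken from the cited prior art.

```python
def motion_compensate(ref, x, y, vx, vy, width, height):
    """Return the width x height block of the reference picture `ref`
    located at (x, y) and displaced by the motion vector (vx, vy).
    Assumption: integer-pixel vectors only (no half-pixel interpolation)."""
    return [
        [ref[y + vy + j][x + vx + i] for i in range(width)]
        for j in range(height)
    ]


def reconstruct(predicted, residual):
    """Sum motion compensated data and residual error data, sample by sample,
    clipping the result to the 8-bit range (assumed sample depth)."""
    return [
        [max(0, min(255, p + r)) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(predicted, residual)
    ]


# Toy 8x8 reference picture whose sample at row r, column c is 10*r + c.
reference = [[10 * r + c for c in range(8)] for r in range(8)]
prediction = motion_compensate(reference, x=2, y=2, vx=1, vy=-1, width=2, height=2)
block = reconstruct(prediction, [[1, -20], [300, 0]])
```

In this sketch each block reads from a position in the reference picture that depends on its own motion vector, which is why successive macroblocks generally touch different zones of the memory holding that picture.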
The known video decoder comprises an external memory EMEM (1) for storing reconstructed pictures delivered by the reconstruction circuit. The pictures to be stored are reference pictures F0 and F1 of the intra or predictive type.
The decoding unit further comprises a memory controller MMI (11) for controlling data exchange between said decoding unit and the external memory via a data bus (2). Said data exchange is, for example, the storage of reference pictures from the reconstruction circuit into the external memory, or the read-out from the external memory of the motion compensated data in a reference picture in order to supply them to the motion compensation circuit.
A first drawback of the prior art is that the motion compensation is performed on a macroblock basis, so that the motion compensated data are generally read out from different zones of the external memory for successive macroblocks. As a consequence, the data read-out from the external memory is irregular, and a video decoder according to the prior art needs a large memory bandwidth, due both to the amount of data to be read and to the difficulty of optimizing the access to the external memory with the memory controller. Indeed, the data to be read are not necessarily aligned in the memory data banks. This drawback is aggravated by the fact that memory bandwidth does not increase as fast as processor frequency, which follows Moore's law.
The following example illustrates this point in the case of MPEG-2 decoding. Let us assume an external memory organized in words of 64 bits. A word can then contain 8 pixel values (luminance or chrominance). The motion compensation circuit has to read areas of at least 16×8 pixels. In the MPEG-2 standard, the motion compensation has a half-pixel accuracy. As a consequence, the motion compensation circuit has to read an area of 17×9 pixels in order to compute the interpolated pixel values. Due to the memory organization in words, the motion compensation circuit in fact reads 3 words on each of 9 lines, in other words 24×9 bytes, so that about 30% of the bandwidth is lost (17×9 corresponds to a bandwidth of approximately 180 Mbytes/s and 24×9 corresponds to a bandwidth of approximately 270 Mbytes/s for an MPEG-2 High Definition HD picture).
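The word-alignment arithmetic of this example can be checked with a short sketch. It assumes one byte per pixel value and 8-byte (64-bit) words, as stated in the example; the function names are hypothetical.

```python
def words_per_line(x_start, width, bytes_per_word=8):
    """Number of memory words touched by `width` consecutive bytes
    starting at byte offset `x_start` within a picture line."""
    first_word = x_start // bytes_per_word
    last_word = (x_start + width - 1) // bytes_per_word
    return last_word - first_word + 1


def fetch_overhead(width=17, height=9, bytes_per_word=8):
    """Return (useful bytes, fetched bytes, fraction of fetched bytes unused)
    for the worst-case horizontal alignment of the area."""
    worst_words = max(
        words_per_line(x, width, bytes_per_word) for x in range(bytes_per_word)
    )
    useful = width * height
    fetched = worst_words * bytes_per_word * height
    return useful, fetched, 1 - useful / fetched


useful, fetched, unused = fetch_overhead()
```

With these values, 153 useful bytes require 216 fetched bytes, so roughly 29% of the fetched data is unused, which matches the figure of about 30% given in the example. A 17-byte span touches 3 words whatever its alignment, since two 8-byte words cover only 16 bytes.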
Another problem relates to the optimization of the memory controller. External memory, such as SDRAM for example, operates in a burst mode, which is not suited to an irregular read-out of data. A burst is generated for each line of the area to be read. A burst comprises at least 7 or 8 cycles, whereas 3 cycles, in our example, would have been enough to read out the 3 words of a line. As a consequence, the bandwidth required for a video decoder according to the prior art is more than twice the bandwidth that would theoretically have been necessary for the decoding process.
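The ratio quoted here can be reproduced with a simple cost model. The sketch assumes that one word is transferred per cycle and that each line access costs at least one full burst; both are simplifying assumptions, and the function name `burst_overhead` is hypothetical.

```python
def burst_overhead(words_needed=3, min_burst_cycles=8):
    """Ratio between the cycles actually consumed by one line access and the
    cycles that would suffice if only the needed words were transferred.
    Assumption: one word per cycle, minimum cost of one burst per line."""
    cycles_used = max(min_burst_cycles, words_needed)
    return cycles_used / words_needed


ratio_7 = burst_overhead(words_needed=3, min_burst_cycles=7)
ratio_8 = burst_overhead(words_needed=3, min_burst_cycles=8)
```

Under this model the overhead factor is 7/3 for a 7-cycle burst and 8/3 for an 8-cycle burst, both above 2, consistent with the statement that the required bandwidth is more than twice the theoretical need.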
Moreover, reference pictures cannot easily be stored in embedded memories instead of the external memory, as such memories are still very expensive. In our example, an embedded memory of 6 Mbytes would be necessary in a high definition HD format, such a memory corresponding to a circuit of approximately 50 mm² in a CMOS 0.12 micron technology, which represents an excessively large circuit area.
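The order of magnitude of 6 Mbytes can be recovered from a rough estimate. The sketch assumes a coded picture size of 1920×1088 pixels, 4:2:0 chroma sampling with 8-bit samples (1.5 bytes per pixel), and two stored reference pictures F0 and F1; the picture size and sampling format are assumptions made for illustration and are not stated in the text.

```python
def reference_memory_bytes(width=1920, height=1088, num_pictures=2):
    """Bytes needed to store `num_pictures` reference pictures.
    Assumption: 4:2:0 sampling, 8 bits per sample, i.e. one luminance byte per
    pixel plus two chrominance planes at quarter resolution."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return num_pictures * (luma + chroma)


total_bytes = reference_memory_bytes()
total_mbytes = total_bytes / 2**20
```

Under these assumptions the two reference pictures occupy 6,266,880 bytes, or just under 6 Mbytes.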