DCT-based compression standards include, by way of example, MPEG-1, MPEG-2, MPEG-4 and H.263. The MPEG-4 standard contains an extended set of data-compression procedures. To serve different goals, the procedures are divided into sets of progressively increasing capability, referred to as “profiles”. Within each profile, the admissible range of parameters is governed by “levels”. Hence, an implementation of a given coding apparatus is specified by a given profile and a given level.
The MPEG-4 standard enables the redundancy in the images of a sequence to be reduced by transforming the data in an appropriate way. Application of an orthonormal transform concentrates the energy of the signal in the low-frequency coefficients, and the information associated with the coefficients can then be reduced appropriately to adapt to the channel.
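As an illustration of the energy concentration described above, the following minimal sketch applies an orthonormal DCT-II to one row of eight pixels. It is not taken from the standard: the sample values and the helper names dct_1d and energy are hypothetical, and a one-dimensional transform is used only for brevity.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence x (illustrative helper)."""
    n = len(x)
    out = []
    for k in range(n):
        # Scaling that makes the transform orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        out.append(scale * s)
    return out


def energy(values):
    """Sum of squares of the values."""
    return sum(v * v for v in values)


# Hypothetical smooth row of pixel values (assumption, for illustration only).
samples = [100, 102, 104, 107, 109, 110, 112, 113]
coeffs = dct_1d(samples)

# Share of the total energy carried by the two lowest-frequency coefficients.
low_share = energy(coeffs[:2]) / energy(coeffs)
print(f"energy in the first two coefficients: {low_share:.4f}")
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the samples; for a smooth row such as this one, nearly all of it falls in the lowest-frequency coefficients.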
Correlation is present both between pixels that are locally adjacent in an image and between frames that are close to one another in time. For this reason, the MPEG-4 standard provides three different kinds of frames, designated by the letters I, P, and B. The I (Intra) frames are coded without any temporal reference. They require a large number of bits, but they preserve the quality of the reconstructed image and provide random access to the sequence. The reconstructed images of type I (anchor frames) are used as references for reconstruction of the images that follow in the sequence.
The P (Inter) frames are coded using the temporal correlation with the preceding frames of type I or P. The coding device estimates the movement that has occurred between two frames and sends corresponding motion vectors to the decoding device. The residual information, given by the difference between the original image and the prediction obtained from the anchor frame, contains the data that cannot be estimated through motion vectors.
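The motion estimation and residual computation described above can be sketched as follows. This is a minimal illustration, assuming an exhaustive block-matching search with a sum-of-absolute-differences cost; the standard does not prescribe the search method, and the function names, the frame contents and the search range are hypothetical.

```python
def get_block(frame, y, x, size):
    """Extract a size x size block whose top-left corner is (y, x)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def estimate_motion(ref, cur, y, x, size, search):
    """Exhaustive search for the (dy, dx) that minimizes the SAD."""
    target = get_block(cur, y, x, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= ry <= height - size and 0 <= rx <= width - size:
                cost = sad(target, get_block(ref, ry, rx, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]


def residual(ref, cur, y, x, size, mv):
    """Difference between the current block and its motion-compensated prediction."""
    pred = get_block(ref, y + mv[0], x + mv[1], size)
    target = get_block(cur, y, x, size)
    return [[t - p for t, p in zip(rt, rp)] for rt, rp in zip(target, pred)]


# Hypothetical 8x8 anchor frame and a current frame shifted one pixel to the right.
ref = [[10 * r + c for c in range(8)] for r in range(8)]
cur = [[row[max(c - 1, 0)] for c in range(8)] for row in ref]

mv = estimate_motion(ref, cur, 2, 2, 4, 2)
res = residual(ref, cur, 2, 2, 4, mv)
print(mv, res)
```

In this toy case the motion vector fully explains the change, so the residual block is zero; in real sequences the residual carries whatever the motion vectors cannot predict.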
The frames of type B (two-directional frames) are coded using the temporal correlation with the preceding and following frames of type I or P. Since two possible references are available, the coding device can choose the direction that is less costly in terms of the size of the compressed data. Since both the preceding images and the following images are used, the transmission order must be modified with respect to the temporal order, so that all the information necessary for reconstructing the two-directional predicted frames is available. The profile referred to as “simple” in the MPEG-4 standard makes use only of frames of type I and P.
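The reordering implied by two-directional prediction can be sketched as follows. This is a minimal illustration under the assumption that each B frame is sent immediately after the anchor frame that follows it in temporal order; the function name transmission_order and the frame pattern are hypothetical.

```python
def transmission_order(display_types):
    """Return frame indices in an order where every B frame follows both of its anchors.

    display_types is a list of "I", "P" or "B" in temporal (display) order.
    """
    order = []
    pending_b = []
    for idx, frame_type in enumerate(display_types):
        if frame_type == "B":
            # A B frame must wait for the anchor that follows it in time.
            pending_b.append(idx)
        else:
            order.append(idx)        # the anchor (I or P) is sent first
            order.extend(pending_b)  # then the B frames that precede it in time
            pending_b = []
    # Trailing B frames with no following anchor (edge case of this sketch).
    order.extend(pending_b)
    return order


print(transmission_order(list("IBBPBBP")))  # [0, 3, 1, 2, 6, 4, 5]
```

A sequence that contains only I and P frames, as in the “simple” profile, is left in its temporal order by this procedure.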
Potentially, coding of the predicted frames (whether P frames or B frames) can cause a mismatch between the coding device and the decoding device. To code a frame of such a type, the coding device must in fact store the preceding frame to be used for the temporal prediction. Consequently, both the coding device and the decoding device reconstruct the current frame, which is used as reference for the future images.
Theoretically, the coding device and the decoding device would need to have available exactly the same set of reconstructed data in order to reconstruct the decoded image correctly and thus prevent the so-called mismatch error between the two images reconstructed at the encoder end and at the decoder end. In practice, however, minimal differences between the two images are acceptable.
These differences are due to the particular discrete-cosine-transform procedure used. In fact, the goal of standards, and in particular of the MPEG-4 standard, is to allow the developers and implementers of the circuits the highest possible degree of freedom in implementing the procedure. Since many DCT algorithms have been developed in the past, different implementers can use their own approaches to get an edge over their competitors, using the most innovative algorithms. However, since each DCT algorithm intrinsically represents a method of approximating the cosine transform, different approximations generate slightly different results. Standards of the MPEG-4 type define the maximum difference allowed.
However, since, as explained previously, the images are predicted on the basis of the preceding ones, the difference tends to increase in time. The MPEG-4 standard defines statistically the maximum amount of variation between two images.
When the coding device and the decoding device operate on different preceding data, because they use different DCT algorithms, the so-called mismatch error thus occurs. As stated previously, the difference between the coding device and the decoding device may be due only to the modules that execute the discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT). In fact, it is the result on the individual block that can vary between different codings.
The MPEG-4 standard provides only limits of tolerated error, as may be seen from Annex A of the ISO/IEC 14496-2 recommendation, “Coding of audio-visual objects - Part 2: Visual”, pp. 253-254, Third edition: 2003.
In particular, the mismatch occurs when the coding device and the decoding device obtain different outputs from the IDCT block. Two different cases may arise: the IDCT of the coding device supplies a zero block, i.e., a block identified by zero coefficients, from a block of the input data flow that originally was non-zero, while the decoding device supplies a non-zero block; or else the IDCT of the coding device supplies a non-zero block, while the decoding device supplies a zero block.
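The two cases above can be told apart with a simple comparison of the reconstructed blocks. The following is a minimal sketch; the function names and the returned labels are hypothetical, and the "other-difference" label is an addition of this sketch for blocks that differ without either being zero.

```python
def is_zero_block(block):
    """True when every coefficient of the block is zero."""
    return all(v == 0 for row in block for v in row)


def classify_mismatch(rb_encoder, rb_decoder):
    """Classify the relation between encoder-side and decoder-side reconstructed blocks."""
    enc_zero = is_zero_block(rb_encoder)
    dec_zero = is_zero_block(rb_decoder)
    if enc_zero and not dec_zero:
        return "encoder-zero/decoder-nonzero"   # first case in the text
    if not enc_zero and dec_zero:
        return "encoder-nonzero/decoder-zero"   # second case in the text
    if rb_encoder != rb_decoder:
        return "other-difference"               # assumption: not a case named in the text
    return "match"


zero = [[0] * 8 for _ in range(8)]
ones = [[1] * 8 for _ in range(8)]
print(classify_mismatch(zero, ones))
print(classify_mismatch(ones, zero))
```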
An example of such a problem can be seen in FIG. 1, which represents a working diagram of a coding and decoding method that involves compression via DCT. In FIG. 1, the reference number 100 designates a coding device, which comprises a block 120 representing a DCT operation and a block 130 representing a quantization operation with a certain quantization step QP. The blocks 120 and 130 receive at input a noncompressed-data flow I, divided into blocks B, and return at output compressed blocks DB, which are then coded into an output compressed-data flow O via coding operations that are not represented in detail in FIG. 1 and involve, for example, the use of Huffman tables. CB designates a current block, i.e., the block that is processed at a given instant by the coding device 100 and that, in the case shown, is assumed to contain non-zero coefficients.
Present in the coding device 100 is a branch that fetches the compressed-data flow O and executes a set of inverse operations, i.e., an inverse quantization operation in a block 190 and an IDCT operation in a block 195, to obtain reconstructed blocks RB to be used in the reconstruction of the frames.
The reference 100′ thus designates a decoding device, which receives at input the compressed-data flow O and carries out thereon an inverse-quantization operation represented by a block 190′ and an IDCT operation in a block 195′, to supply reconstructed blocks RB′ in a decompressed-data flow I′.
FIG. 1 shows the case where the block RB reconstructed by the coding device 100 and the block RB′ reconstructed by the decoding device 100′ are different on account of implementation of different procedures at the coding device 100 and at the decoding device 100′. In particular, the block RB reconstructed by the coding device 100 is a zero block, i.e., all its coefficients are zero.
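The situation of FIG. 1 can be reproduced numerically with a minimal sketch. Everything in it is an assumption made for illustration: the uniform quantizer with step 5 is not the MPEG-4 quantizer, and the two IDCT variants differ only in how they convert to integers (truncation at the coding device, rounding at the decoding device), which is a deliberately exaggerated stand-in for two different IDCT approximations. Comments tie the hypothetical functions to the blocks of FIG. 1.

```python
import math

N = 8


def _scale(k):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _cos(i, k):
    return math.cos(math.pi * (2 * i + 1) * k / (2 * N))


def dct2(block):
    """Orthonormal 8x8 forward DCT (block 120)."""
    return [[_scale(u) * _scale(v) * sum(block[y][x] * _cos(y, u) * _cos(x, v)
                                         for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Orthonormal 8x8 inverse DCT with floating-point output."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _cos(y, u) * _cos(x, v)
                 for u in range(N) for v in range(N))
             for x in range(N)] for y in range(N)]


def quantize(coeffs, step):
    """Hypothetical uniform quantizer with truncation (block 130)."""
    return [[int(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Inverse quantization (blocks 190 and 190')."""
    return [[level * step for level in row] for row in levels]


def idct_truncating(coeffs):
    """Assumed encoder-side IDCT (block 195): truncates toward zero."""
    return [[int(v) for v in row] for row in idct2(coeffs)]


def idct_rounding(coeffs):
    """Assumed decoder-side IDCT (block 195'): rounds to nearest."""
    return [[math.floor(v + 0.5) for v in row] for row in idct2(coeffs)]


# Current block CB: a small, constant, non-zero prediction error (hypothetical).
cb = [[1] * N for _ in range(N)]
step = 5  # hypothetical quantization step

levels = quantize(dct2(cb), step)                        # compressed block DB
rb_encoder = idct_truncating(dequantize(levels, step))   # RB
rb_decoder = idct_rounding(dequantize(levels, step))     # RB'

print(rb_encoder[0], rb_decoder[0])
```

Here the same compressed block yields a zero block RB in the coding device and a non-zero block RB′ in the decoding device, although both devices start from identical data.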
The MPEG-4 standard strongly recommends avoidance of such a case, i.e., the case where the IDCT of the coding device supplies a zero block, while the output of the coding device itself supplies a non-zero block (section “Mismatch control” (7.4.4.5) of the ISO/IEC 14496-2 recommendation, “Coding of audio-visual objects - Part 2: Visual”, pp. 253-254, Third edition: 2003). Normally, the effects are not visible, since the margins defined are sufficient to guarantee negligible differences between the frames reconstructed at the coding-device end and those reconstructed at the decoding-device end. However, in particular situations, the results of the mismatch may be very evident.
For example, when a still image is coded, prediction data with small errors are sent to the decoding device. These data can be reconstructed as zero data by the IDCT in the coding device, whereas they are reconstructed as non-zero data by another type of IDCT implementation at the decoding device. When the same data are present at the coding device in the next image, the same mismatch occurs between the coding device and the decoding device. The errors accumulate at the same points, and the reconstructed images soon diverge.
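The accumulation described above can be sketched with a per-pixel toy model. This is an abstraction, not an implementation of either device: the two reconstruct functions are hypothetical stand-ins for the full inverse-quantization and IDCT paths, and the pixel values are invented for illustration.

```python
def encoder_reconstruct(residual):
    """Assumed encoder-side path: small residuals are reconstructed as zero."""
    return 0 if abs(residual) <= 1 else residual


def decoder_reconstruct(residual):
    """Assumed decoder-side path: the same data are reconstructed as non-zero."""
    return residual


def simulate_still_image(original, start_ref, frames):
    """Return the decoder-minus-encoder difference after each predicted frame."""
    enc_ref = start_ref
    dec_ref = start_ref
    drift = []
    for _ in range(frames):
        # The coding device predicts from its own reconstructed reference.
        residual = original - enc_ref
        enc_ref += encoder_reconstruct(residual)
        dec_ref += decoder_reconstruct(residual)
        drift.append(dec_ref - enc_ref)
    return drift


# Hypothetical pixel: true value 101, both references start at 100.
print(simulate_still_image(original=101, start_ref=100, frames=5))  # [1, 2, 3, 4, 5]
```

Because the reference in the coding device never changes, the same residual is sent for every frame, and in this model the difference at that pixel grows by one unit per predicted frame.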
Consequently, even though the standards fix quantitative limits for the mismatch error, no approaches are known in the state of the art for attenuating or eliminating the mismatch error.