The invention relates to a device for pre-processing video images intended to be coded according to the MPEG 2 video standard.
A system for coding according to the MPEG 2 video standard uses the properties of the signal in order to reduce its bit rate.
The coding algorithm employed describes the images in blocks, exploiting the spatial redundancy and the temporal redundancy of the images to be coded.
The spatial redundancy is evaluated, chiefly, through the succession of three operations: an operation commonly termed discrete cosine transform and denoted DCT, an operation quantizing the coefficients from the DCT and a variable-length coding operation for describing the quantized coefficients from the DCT.
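By way of illustration only, the first two of these three operations can be sketched as follows. The 8x8 block size, the orthonormal scaling of the transform and the single uniform quantization step are assumptions made for the sketch; the names dct_2d and quantize are hypothetical, and the variable-length coding operation is omitted.

```python
import math

N = 8  # block size assumed for illustration


def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an N x N block of samples."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, step):
    """Uniform quantization with one step for every coefficient (an assumption)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A perfectly flat block: all its energy lands in the DC coefficient.
flat = [[128] * N for _ in range(N)]
levels = quantize(dct_2d(flat), step=16)
```

For such a flat block only the first quantized coefficient is non-zero, which is the kind of result that a variable-length code can describe with very few bits.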
The temporal redundancy is analysed by a motion compensation operation which consists in searching, using a translation operation for each block of the current image, for the most similar block situated in a reference image. Analysis of the temporal redundancy involves determining a field of translation vectors, commonly termed motion vectors, as well as a prediction error, namely the difference between the signal of the current image and the signal of the image predicted by motion compensation. The prediction error is then analysed according to the principle of spatial redundancy.
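A minimal sketch of such a search is given below, purely as an illustration. It assumes an exhaustive search over a small window and uses the sum of absolute differences as the similarity measure; neither choice is imposed by the text above, and the function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, size, search):
    """Exhaustive search over displacements of at most `search` pixels in each direction."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would read outside the reference image.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best  # (residual cost, dx, dy)


# Reference image with a non-repeating pattern (illustrative data).
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
# Current image: the 4x4 block at (2, 2) is the reference block found at (3, 1).
cur = [[0] * 8 for _ in range(8)]
for y in range(4):
    for x in range(4):
        cur[2 + y][2 + x] = ref[1 + y][3 + x]

result = best_motion_vector(cur, ref, bx=2, by=2, size=4, search=2)
# result == (0, 1, -1): zero prediction error for the vector (dx, dy) = (1, -1)
```

The first element of the result plays the role of the prediction error for the block; in a real coder the residual block itself, not merely its cost, would then be passed to the spatial coding operations.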
MPEG 2 coding is a coding of predictive type. It follows that the decoding procedure associated with it must be regularly re-initialized so as to protect the signal from any transmission error or any break in signal due to the toggling of the decoder from one programme to another.
For this purpose, the MPEG 2 standard provides that, periodically, the images must be coded in spatial mode, that is to say according to a mode which exploits spatial redundancy only. The images coded in spatial mode are commonly termed INTRA images or I images.
The images coded by exploiting temporal redundancy are of two types: there are, on the one hand, those images constructed by reference to a temporally earlier image and, on the other hand, those images constructed by reference to a temporally earlier image and to a temporally later image.
The coded images constructed by reference to a temporally earlier image are commonly referred to as predicted images or P images and the coded images constructed by reference to a temporally earlier image and to a temporally later image are commonly referred to as bi-directional images or B images.
An I image is decoded without making reference to images other than itself. A P image is decoded by making reference to the P or I image which precedes it. A B image is decoded by invoking the I or P image which precedes it and the I or P image which follows it.
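These decoding dependencies can be illustrated by the following sketch, which takes a sequence of image types in display order and lists, for each image, the indices of the images it is decoded from. The function name and the example sequence "IBBPBBP" are hypothetical; a B image with no following I or P image in the given sequence is reported with None for that reference.

```python
def references(types):
    """For each image in display order, return the indices of its reference images."""
    anchors = [i for i, t in enumerate(types) if t in "IP"]
    refs = []
    for i, t in enumerate(types):
        earlier = max((a for a in anchors if a < i), default=None)
        later = min((a for a in anchors if a > i), default=None)
        if t == "I":
            refs.append(())                 # decoded on its own
        elif t == "P":
            refs.append((earlier,))         # preceding I or P image
        else:
            refs.append((earlier, later))   # preceding and following I or P image
    return refs


deps = references("IBBPBBP")
# deps == [(), (0, 3), (0, 3), (0,), (3, 6), (3, 6), (3,)]
```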
The periodicity of the I images defines a group of images commonly denoted GOP (the acronym GOP standing for "Group of Pictures").
As is known to those skilled in the art, within a given GOP, the amount of data contained in an I image is generally greater than the amount of data contained in a P image and the amount of data contained in a P image is generally greater than the amount of data contained in a B image.
In order to manage this disparity between the amounts of data depending on the type of image, an MPEG 2 coder comprises a device for servocontrolling the data bit rate.
Such a servocontrol device makes it possible to control the flow of the coded data. It comprises a buffer memory, for storing the coded data, and models the state of the buffer memory dual to a so-called reference decoder. The servocontrol device smoothes the bit rate of the data exiting the buffer memory in such a way that the sum of the data contained in the coder and in the reference decoder is constant.
Thus, depending on the type of image (I, P or B) this involves managing the fact that the I images produce a bit rate greater than the mean bit rate, that the P images produce a bit rate near the mean bit rate and that the B images produce a bit rate less than the mean bit rate (typically equal to 0.1 to 0.5 times the mean bit rate).
According to the prior art, the coding of an I image is performed in two passes. To determine the quantization step required for coding an I image, a proportionality rule is applied as follows:
$$\mathrm{Qsp} \times \mathrm{Nbsp} = \mathrm{Qpp} \times \mathrm{Nbpp},$$
Qsp being the value of the quantization step applied for coding the I image during the second pass, Nbsp being the number of bits provided for coding the I image during the second pass, Qpp being the value of the quantization step applied for coding the I image during the first pass, and Nbpp being the number of bits produced by coding the I image during the first pass.
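As a worked illustration of this proportionality rule, the second-pass quantization step follows directly from the three other quantities. The function name and the numerical values below are hypothetical, and no rounding or clamping to a permitted range of steps is applied.

```python
def second_pass_step(q_first, bits_first, bits_target):
    """Solve Qsp * Nbsp = Qpp * Nbpp for Qsp."""
    return q_first * bits_first / bits_target


# First pass: step 8 produced 600000 bits; 400000 bits are provided for the second pass.
q_second = second_pass_step(q_first=8, bits_first=600000, bits_target=400000)
# q_second == 12.0: a coarser step is needed to fit the smaller bit budget
```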
As regards the coding of the P or B images, the flow control operates according to the assumption of signal stationarity. According to this assumption, each P or B image produces, for the same value of quantization step, a number of bits identical to the number of bits produced by the previous image of the same kind (P or B respectively).
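The assumption can be sketched as a predictor that simply remembers, for each kind of image, the number of bits produced by the last image of that kind. The class and method names are hypothetical and the bit counts are illustrative.

```python
class StationaryPredictor:
    """Predicts the cost of a P or B image as the cost of the previous image of the same kind."""

    def __init__(self):
        self.last_bits = {}

    def observe(self, image_type, bits):
        # Record the number of bits actually produced by the image just coded.
        self.last_bits[image_type] = bits

    def predict(self, image_type):
        # Same quantization step assumed; None until an image of this kind has been coded.
        return self.last_bits.get(image_type)


predictor = StationaryPredictor()
predictor.observe("P", 120000)
predictor.observe("B", 40000)
expected_p = predictor.predict("P")  # 120000
```

Such a predictor is exact only while the content of the signal does not change; a sudden change of content makes its estimate stale by one image of the kind concerned.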
In the case where the frame frequency of the signal is 60 Hz, the video signal to be coded exhibits redundant frames. The MPEG 2 standard then provides for the possibility of not coding these frames and of transmitting a replication order therefor to the decoder. To detect the redundant frames, measurements are performed of the difference in luminance between pixels of successive frames. Such pixel-to-pixel difference measurements do not offer relevant information as regards the kind and the degree of motion contained in the images. It follows that the distribution of the types of images (P or B) in a GOP is generally fixed by the frequency of appearance of images of type P alone. At the very most, the pixel-to-pixel difference measurements allow the detection of a change of scene possibly manifested, in certain cases, through the adjusting of the size of the GOP.
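A pixel-to-pixel luminance difference measurement of this kind might be sketched as follows. The use of a mean absolute difference and the threshold value are assumptions made for the illustration, and the function names are hypothetical.

```python
def mean_abs_luma_diff(frame_a, frame_b):
    """Mean absolute luminance difference between two frames of equal size."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def is_redundant(frame, earlier_frame, threshold=1.0):
    """Flag a frame as redundant when it barely differs from the earlier one (threshold assumed)."""
    return mean_abs_luma_diff(frame, earlier_frame) < threshold


a = [[16, 16], [32, 32]]
b = [[16, 16], [32, 32]]   # identical to a
c = [[16, 16], [32, 48]]   # one pixel changed by 16

repeat_b = is_redundant(b, a)  # True: difference is 0.0
repeat_c = is_redundant(c, a)  # False: difference is 4.0
```

The measure yields a single number per pair of frames, which is consistent with the limitation noted above: it indicates that two frames differ, not how the content has moved.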
The above-described type of coding of I, P or B images has drawbacks.
The I image of a GOP is the one with the highest cost in terms of amount of information. The buffer memory mentioned earlier must absorb the very considerable bit rate of this image. According to the prior art, in order to avoid the occurrence of a critical situation upon an increase in the entropy of the signal in the very first few images following the coding of an I image, the cost of the I image is limited so as to preclude the buffer memory of the coder from saturating and the buffer memory of the reference decoder drying up. It is thus usual to prevent the buffer memory of the coder from filling to more than 60 to 70%. This results in a limitation on the quality of the I images.
Since the I image of a GOP serves as reference in the coding of all the other images of the GOP, the limitation on the quality of an I image entails a limitation on the quality of all the other images of the GOP.
More generally, when the signal to be coded exhibits substantial modifications, for example upon a change of picture shot, or more generally, upon a sudden variation in the entropy of the signal (the entropy of the signal denotes the amount of intrinsic information which the signal contains), a temporal instability appears in the reproduction of the images. This temporal instability is manifested as a drop in the quality of the images.
Moreover, as is known to those skilled in the art, an image which corresponds to a considerable variation in the entropy of the signal induces a high cost of coding corresponding to that of an I or P image. Irrespective of the above-mentioned drop in the quality of the images, the coding system is then compelled to reduce the bit rate of the images following the image corresponding to the considerable increase in entropy.
Furthermore, when the coded video signals are intended to be multiplexed with other signals of the same type, the overall bit rate of the multiplex must be shared between the various signals. This configuration arises, for example, when broadcasting video programmes by satellite. In this case, the bit rate of the multiplex may reach 40 Mb/s thus permitting the transport of several programmes simultaneously (10 programmes at 4 Mb/s each for example).
A video programme emanating from an MPEG 2 type coding at fixed bit rate exhibits, after decoding, a variation in the quality of the image restored. This stems from the variability of the entropy of the video signal over time, this variability being manifested as a fluctuation in the quantization level of the DCT coefficients.
A suitable allocation of the bit rates associated with the video programmes then allows an overall enhancement in the quality of all the video programmes and/or an increase in the number of programmes broadcast. According to the prior art, the result of coding the GOP of order k is then used as prediction for the expected difficulty in coding the GOP of order k+1. However, this solution has two drawbacks:
1. Should the contents of the video signal vary widely, the result of coding the GOP of index k may differ appreciably from the result of coding the GOP of index k+1.
2. The GOP of index k+1 may, for optimization reasons, be made to undergo considerable structural variations (the size of the GOP and the distribution of the B and P images within the GOP) with respect to the GOP of index k.
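The prior-art allocation discussed above can be illustrated by the following sketch, in which the bit rate of the multiplex is shared between the programmes in proportion to the cost of coding their respective GOPs of index k. The proportional sharing rule, the function name and the cost figures are assumptions made for the illustration; the text above states only that the result for the GOP of index k serves as the prediction for the GOP of index k+1.

```python
def allocate_bit_rates(multiplex_rate, previous_gop_costs):
    """Share the multiplex rate in proportion to each programme's cost for its GOP of index k."""
    total_cost = sum(previous_gop_costs)
    return [multiplex_rate * cost / total_cost for cost in previous_gop_costs]


# Three programmes sharing 40 Mb/s; the third was twice as costly over its last GOP.
rates = allocate_bit_rates(40.0, [1, 1, 2])
# rates == [10.0, 10.0, 20.0]
```

Both drawbacks are visible in this sketch: the allocation for the GOP of index k+1 depends solely on measurements made on the GOP of index k, so that a wide variation in content or a change in GOP structure between the two makes the prediction unreliable.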