The invention relates to a method of image compression in which the images are coded according to groups of variable lengths.
It relates more particularly to a method of the MPEG type, particularly of MPEG2 type. Although the invention is not limited to this standard, it will be referred to primarily in the remainder of the description.
The principle of such compression is reiterated below.
In the video MPEG2 standard, compression of the digital video signals is obtained by exploiting the spatial redundancy and the temporal redundancy of the coded images.
The spatial redundancy is evaluated principally by virtue of a succession of three operations: an operation commonly called discrete cosine transform and denoted DCT ("Discrete Cosine Transform"), an operation of quantisation of the coefficients arising from the DCT and an operation of variable-length coding to describe the quantised coefficients arising from the DCT.
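By way of illustration only, the first two of these operations can be sketched as follows. The sketch assumes an orthonormal DCT-II on an 8x8 block and a single uniform quantisation step; the function names are hypothetical, and the uniform step is a simplification of the quantisation actually used in MPEG2. Variable-length coding is not shown.

```python
import math

def dct_1d(values):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(values)
    out = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        total = sum(v * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                    for i, v in enumerate(values))
        out.append(scale * total)
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def quantise(coeffs, step):
    """Uniform quantisation with one step size (an assumption of this sketch)."""
    return [[round(c / step) for c in row] for row in coeffs]

# A perfectly flat block has no spatial detail: only the DC level survives.
flat_block = [[100] * 8 for _ in range(8)]
levels = quantise(dct_2d(flat_block), step=16)
```

For the flat block, all the energy is concentrated in the first coefficient, which is what makes the subsequent variable-length coding efficient.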
The temporal redundancy is analysed by a movement-compensation operation which consists, by translation of each block of the current image, in searching for the most similar block situated in the reference image. The analysis of the temporal redundancy leads to a field of translation vectors being determined, commonly called movement vectors, as well as a prediction error which is the difference between the signal of the current image and the signal of the image predicted by movement compensation. The prediction error is then analysed according to the principle of spatial redundancy.
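The search for the most similar block can be sketched, purely as an illustration, by an exhaustive search over a small window. The similarity measure used here (sum of absolute differences), the block size, the search range and all function names are assumptions of this sketch and are not prescribed by the text above.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def best_movement_vector(cur, ref, bx, by, size=4, search=2):
    """Return ((dx, dy), cost) of the best-matching reference block in the window."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height and
                      0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue  # candidate block would fall outside the reference image
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best

# Synthetic reference image, and a current image whose content is the
# reference content displaced by 2 columns and 1 row.
ref = [[3 * x + 7 * y * y + x * y for x in range(12)] for y in range(12)]
cur = [[ref[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)] for y in range(12)]
vector, error = best_movement_vector(cur, ref, bx=4, by=4)
```

In this synthetic case the block is found exactly, so the residual cost is zero; with real images the remaining difference is the prediction error that is then coded spatially.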
MPEG coding is of predictive type. It follows that the decoding which is associated with it should be regularly reinitialised so as to protect the signal against any transmission error or any break in signal due to the decoder being switched over from one programme to another.
To this end, the MPEG2 standard provides that, periodically, the images should be coded in spatial mode, that is to say according to a mode exploiting only spatial redundancy. The images coded in spatial mode are called INTRA images or I images.
The images coded by exploiting temporal redundancy are of two types: on the one hand, the images constructed by reference to a temporally previous image on the basis of a front prediction and, on the other hand, the images constructed by reference to two temporally previous and subsequent images on the basis of a front prediction and of a back prediction.
The coded images constructed on the basis of a front prediction are called predicted images or P images and the coded images constructed on the basis of a front and of a back prediction are called bidirectional images or B images.
An I image is decoded without reference being made to images other than itself. A P image is decoded by referring to the P or I image which precedes it. A B image is decoded by relying on the I or P image which precedes it and on the I or P image which follows it.
The periodicity of the I images defines a group of images widely denoted GOP ("Group Of Pictures").
Within a single GOP, the quantity of data contained in an I image is generally greater than the quantity of data contained in a P image and the quantity of data contained in a P image is generally greater than the quantity of data contained in a B image.
At 50 Hertz, the GOP is presented as an I image followed by a sequence of B and P images which, most of the time, exhibits the following sequence
I, B, B, P, B, B, P, B, B, P, B, B.
However, the standard does not require that a GOP contain N=12 images, as is generally the case, nor that the distance M between two P images always be equal to 3. More precisely, the distance M is the number n of B images preceding or following a P image, increased by one unit, i.e. M=n+1.
The number N represents the size or length of the GOP, while the number M represents its structure.
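The roles of N and M can be illustrated by a short sketch that builds the sequence of image types for given values. The function name is hypothetical, and the sketch assumes a constant M within the GOP and the presentation order used in the sequence given above.

```python
def gop_pattern(n, m):
    """Image types of a GOP of length n with a P image (or the I image) every m images."""
    if n < 1 or m < 1:
        raise ValueError("n and m must be at least 1")
    return "".join(
        "I" if i == 0 else ("P" if i % m == 0 else "B")
        for i in range(n)
    )

usual = gop_pattern(12, 3)    # the usual 50 Hz structure
no_b = gop_pattern(12, 1)     # M=1: no B images at all
```

With N=12 and M=3 the sketch reproduces the sequence I, B, B, P, B, B, P, B, B, P, B, B; with M=1 every image after the I image is a P image.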
The invention results from the observation that it is possible to act on the M and N parameters to enhance the level of compression and/or enhance the quality of the coding.
The method of coding according to the invention is characterised in that at least one parameter is determined characterising the source images which are to be coded according to a group and in that the length and the structure of the group is made to depend on this parameter or these parameters.
In one embodiment, the parameter(s) characterising the source images is or are determined with the aid of a test coding in the course of which defined values are allocated to N, M and to the quantisation interval Q.
The test coding is carried out, for example, in open loop.
In one particularly simple embodiment, a parameter (Pcost) characterising the P images obtained during the test coding and a parameter (Bcost) characterising the B images obtained during the test coding are determined separately, these parameters characterising the P and B images being, preferably, the average costs of coding of the P and B images. The cost of coding an image is the number of bits (headers included) which is necessary for the coding.
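As a minimal sketch of this measurement, assuming the test coding delivers, for each image, its type and its coding cost in bits (the data layout and the function name are hypothetical), Pcost and Bcost can be obtained as plain averages:

```python
def average_costs(coded_images):
    """Return (Pcost, Bcost): average bits per P image and per B image.

    coded_images is an iterable of (image_type, bits) pairs from the test coding.
    A value of None is returned for a type that did not occur.
    """
    totals = {"P": 0, "B": 0}
    counts = {"P": 0, "B": 0}
    for image_type, bits in coded_images:
        if image_type in totals:      # I images are ignored here
            totals[image_type] += bits
            counts[image_type] += 1
    pcost = totals["P"] / counts["P"] if counts["P"] else None
    bcost = totals["B"] / counts["B"] if counts["B"] else None
    return pcost, bcost

pcost, bcost = average_costs([
    ("I", 400000), ("B", 90000), ("B", 110000), ("P", 200000),
])
```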
In this case, the number N can be made to depend on the parameter characterising the P images and the number M on the parameter characterising the B images.
During trials carried out in the context of the invention, on sequences of images of various types, it was noted that, for each type of sequence, an optimal number N existed providing a minimum coding cost (or throughput) for the P images and an optimal number M providing a minimum coding cost (or throughput) for the B images, these costs being obtained during the test coding. These sequences are distinguished from one another by movements of variable amplitude, different objects, different spatial definitions and different contents.
It was noted, moreover, that a practically linear relationship exists between the optimal number N and the throughput of the P images. Likewise, a practically linear relationship exists between the number M and the throughput of the B images. Hence, knowing the throughputs of the P and B images, it is easy to calculate the numbers N and M providing the best results.
In an example corresponding to the MPEG2 standard, 50 Hz, the test coding is carried out with N=12, M=3 and Q=15, the relationship between N and the throughput of the P images is approximately as follows:

$$N = \mathrm{INT}\left[\frac{389000 - \mathrm{Pcost}}{10000}\right] + 1, \quad \text{with } 12 \le N \le 30 \qquad (1)$$
and the relationship between M and the throughput, or cost, Bcost of the B images is as follows:

$$M = \mathrm{INT}\left[\frac{179000 - \mathrm{Bcost}}{20000}\right] + 1, \quad \text{with } 1 \le M \le 7. \qquad (2)$$
It is also possible to limit M to 5.
In these formulae, INT signifies the integer part.
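Formulae (1) and (2) can be sketched as follows. The function names are hypothetical; INT is implemented with Python's `int()`, which truncates towards zero, and the limits on N and M make the treatment of negative quotients irrelevant. The optional `m_max` argument corresponds to the possibility of limiting M to 5.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))

def gop_length_from_pcost(pcost):
    """Formula (1): N from the average cost of the P images of the test coding."""
    return clamp(int((389000 - pcost) / 10000) + 1, 12, 30)

def gop_structure_from_bcost(bcost, m_max=7):
    """Formula (2): M from the average cost of the B images of the test coding."""
    return clamp(int((179000 - bcost) / 20000) + 1, 1, m_max)

n = gop_length_from_pcost(200000)      # INT[18.9] + 1 = 19
m = gop_structure_from_bcost(100000)   # INT[3.95] + 1 = 4
```

Costly P images thus lead to a short GOP and costly B images to a structure with few B images between P images.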
The limitation of N to between 12 and 30 and the limitation of M to a maximum value of 7 make it possible to have a simple embodiment of the coders and to limit the programme-changing time. With the same aim, it is also possible to impose other limitations or constraints, particularly that M be constant in the GOP and/or that it be a sub-multiple of N.
In one embodiment, if the values of N and of M taken individually and together are not compatible with the constraints, the values of M and of N closest to the calculated values and which satisfy the stipulated compatibility will be chosen. In this case, the value of M will be favoured, that is to say that if a choice has to be made between several M, N pairs, the pair will be chosen for which the M value is closest to that which results from the calculation.
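One possible reading of this selection rule is sketched below, assuming the constraints are 12 ≤ N ≤ 30, 1 ≤ M ≤ 7 and M a sub-multiple of N. The function name is hypothetical, and breaking a remaining tie in favour of the smaller N is an assumption of this sketch, not something stated above.

```python
def closest_admissible_pair(n_calc, m_calc, n_min=12, n_max=30, m_max=7):
    """Return the admissible (M, N) pair closest to the calculated values.

    M is favoured: pairs are ranked first by the distance of M from m_calc,
    then by the distance of N from n_calc, then (assumption) by smaller N.
    """
    candidates = [
        (m, n)
        for m in range(1, m_max + 1)
        for n in range(n_min, n_max + 1)
        if n % m == 0                     # M must be a sub-multiple of N
    ]
    return min(
        candidates,
        key=lambda pair: (abs(pair[0] - m_calc), abs(pair[1] - n_calc), pair[1]),
    )

pair = closest_admissible_pair(n_calc=19, m_calc=4)
```

For calculated values N=19 and M=4, the value M=4 is kept and N is moved to the nearest multiple of 4, i.e. 20.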
The formula (2) above applies only if Bcost does not exceed 179000. In the opposite case, that is to say if Bcost is greater than 179000, experiment has shown that it was necessary, in this example, for M to be chosen in the following way:

$$M = 5 \cdot \mathrm{INT}\left[\frac{\mathrm{Pcost}}{\mathrm{Bcost}} - 1\right], \quad \text{with } 1 \le M \le 7. \qquad (3)$$
If the cost of a B image is higher than the cost of a P image, it is preferable for the GOP to contain no B image, that is to say M=1. This is because the P images, exhibiting a better prediction quality than the B images and being, by assumption, of lower cost, the presence of such B images would constitute a drawback in this case.
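The choice between formulae (2) and (3) can be sketched as follows. The function name is hypothetical, and formula (3) is read literally as 5 times the integer part of (Pcost/Bcost − 1), which is an assumption about the printed formula. With this reading, a B cost higher than the P cost gives M=1 automatically, in line with the remark above.

```python
def structure_from_costs(pcost, bcost, m_max=7):
    """Return M from the average P and B costs of the test coding."""
    if bcost <= 179000:
        m = int((179000 - bcost) / 20000) + 1      # formula (2)
    else:
        m = 5 * int(pcost / bcost - 1)             # formula (3), literal reading
    return max(1, min(m_max, m))                   # limits 1 <= M <= m_max

m_moderate = structure_from_costs(pcost=300000, bcost=100000)   # formula (2)
m_costly_b = structure_from_costs(pcost=400000, bcost=180000)   # formula (3)
m_no_b = structure_from_costs(pcost=185000, bcost=190000)       # B dearer than P
```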
The costs, in bits, of each P image and of each B image are determined, for example, as and when these images appear. In one embodiment, the values of M and N are selected by taking an average over all the P and B images of the test coding, the coding proper being carried out only after the test coding of N source images, N being determined by the cost of coding the P images. In this case, the parameter M may remain constant in the GOP.
In another embodiment, which allows a more rapid adaptation to the variations in content of the scenes as well as a reduction in the delay between the arrival of the source images and coding proper (and which thus allows a lower-capacity buffer memory), the coding proper is started as soon as the test coding supplies data allowing this starting. Hence, the first B image of the test coding provides an M number allowing the coding to be started and the N number is supplied by the first P image of the test coding. It is also possible to have the coding start only after the test coding of the first P image; in this case, the coding starts when a value of N and a value of M are known.
With this type of coding "on the fly", the number M, that is to say the structure, may vary within a GOP, which allows a more rapid adaptation to the variations in content of the scene.
In the coding carried out progressively, the GOP is interrupted when the number of images already coded in the current GOP is at least equal to the measured number N (measured by Pcost in the above example), or upon a change of scene.
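This stopping rule can be sketched as follows. The sketch assumes that, for each source image, the most recent measured value of N and a scene-change flag are available; the data layout and the function name are hypothetical, and using the value of N measured at the preceding image is an assumption of the sketch.

```python
def split_into_gops(measured_n, scene_change):
    """Return GOP boundaries as (start, end) index pairs, end exclusive.

    measured_n[i]   : value of N known once image i has been test-coded
    scene_change[i] : True if image i is the first image of a new scene
    """
    count = len(measured_n)
    if count == 0:
        return []
    gops = []
    start = 0
    for i in range(1, count):
        coded = i - start                      # images already in the current GOP
        if scene_change[i] or coded >= measured_n[i - 1]:
            gops.append((start, i))            # interrupt the GOP before image i
            start = i
    gops.append((start, count))                # last, possibly incomplete, GOP
    return gops

flags = [False] * 30
flags[20] = True                               # change of scene at image 20
boundaries = split_into_gops([12] * 30, flags)
```

In this example the first GOP reaches its measured length of 12, the second is cut short by the change of scene, and a new GOP starts with the new scene.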
In order to avoid significant variations in the parameters between groups which follow one another, it may prove to be worthwhile to depart from the calculated values. For example, if the calculation shows that, for a large part of the length of the GOP, for example at least 80%, M=1 would be necessary, whereas, for the rest of the GOP the calculation shows that M should be greater than 1, the value 1 will be adopted for M, despite everything, even if the calculation shows that a different value is necessary.
Likewise, if for the preceding GOP, M=1 and if, for the current GOP, the calculation shows that a value M=1 would be necessary for a significant part of the current GOP, for example at least 60%, the value 1 will also be adopted for M, even if the result of the calculation, as it results from the formula (2) above, implies a different value.
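These two smoothing rules can be sketched as follows, using the example thresholds of 80% and 60% given above. The function name is hypothetical, and returning `None` when neither rule applies (meaning that the calculated values are kept) is a convention of this sketch.

```python
def forced_structure(per_image_m, previous_gop_m=None):
    """Return 1 if M=1 should be imposed on the whole GOP, else None.

    per_image_m    : value of M given by the calculation at each image of the GOP
    previous_gop_m : value of M used for the preceding GOP, if known
    """
    total = len(per_image_m)
    if total == 0:
        return None
    ones = sum(1 for m in per_image_m if m == 1)
    if ones * 100 >= 80 * total:               # M=1 for at least 80% of the GOP
        return 1
    if previous_gop_m == 1 and ones * 100 >= 60 * total:
        return 1                               # preceding GOP used M=1, 60% suffices
    return None

mostly_one = forced_structure([1] * 8 + [3] * 2)
after_m1_gop = forced_structure([1] * 7 + [3] * 3, previous_gop_m=1)
```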
It is known that when a change of scene occurs, that is to say when a discontinuity appears in the sequence of video images, it is necessary to adapt the GOP image groups on either side of the discontinuity so that the new group, which starts with an I image, corresponds to the new scene.
In one embodiment, if the change of scene occurs in a group, the new scene constitutes the I image of a new group, the affected group being shortened so as to stop before this new scene if the change of scene occurs in the affected group, at a distance from the start at least equal to the minimum number allowable for N. The start of the affected group is used to lengthen the group which precedes it when the sum of the number of images preceding the change of scene in the affected group and of the number of images of the group which precedes it does not exceed the maximum admissible for N. In this preceding group thus modified (shortened or lengthened), it may be necessary to modify the number M previously calculated for this GOP.
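One possible reading of this embodiment is sketched below. The function name, the returned labels and the order in which the two conditions are tested are assumptions of this sketch; the case in which neither condition holds is left to other embodiments and is only reported.

```python
def handle_scene_change(previous_length, images_before_change, n_min=12, n_max=30):
    """Decide how to end the groups lying before a change of scene.

    previous_length      : length of the group preceding the affected group
    images_before_change : images of the affected group before the new scene
    Returns (action, new_previous_length, shortened_affected_length).
    """
    if images_before_change >= n_min:
        # The affected group is long enough on its own: stop it before the new scene.
        return ("shorten", previous_length, images_before_change)
    if previous_length + images_before_change <= n_max:
        # Too short on its own: attach its start to the preceding group.
        return ("lengthen_previous", previous_length + images_before_change, 0)
    return ("undecided", previous_length, images_before_change)

long_enough = handle_scene_change(previous_length=12, images_before_change=15)
attached = handle_scene_change(previous_length=12, images_before_change=5)
```

In both cases the new scene starts a new group with an I image; only the groups before the discontinuity are modified.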
In one variant, which is used for preference in the case in which the length of the affected group is less than the minimum allowable for N, when a change of scene occurs in a group, the new scene constitutes the I image of a new group, this new group having a length equal to the average of the length of the group before it was affected and the length of the group which precedes it. With this variant, it may be necessary to modify the number M previously calculated for the GOPs.
When two modifications are possible, for example when the length of the affected group is less than the minimum allowable for N, a choice may be made between these two modifications by calculating, for each modification, the distance between the (M, N) pair obtained and the (M, N) pair before modification, and by selecting the pair for which the distance is the smallest.
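The choice between two modifications can be sketched as follows. The text does not define the distance between (M, N) pairs; a Euclidean distance is assumed here purely for illustration, and the function name is hypothetical.

```python
import math

def pick_modification(pair_before, candidate_pairs):
    """Return the candidate (M, N) pair closest to the pair before modification.

    The Euclidean distance is an assumption of this sketch.
    """
    return min(
        candidate_pairs,
        key=lambda pair: math.hypot(pair[0] - pair_before[0],
                                    pair[1] - pair_before[1]),
    )

chosen = pick_modification((3, 12), [(3, 17), (2, 14)])
```

Here the pair (2, 14) lies at a distance of about 2.2 from (3, 12), against 5 for (3, 17), and is therefore retained.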
In order to determine the parameters N and M, recourse may be had to the measurement of parameters other than the throughputs. For example, use may be made, in order to determine N, of the energy of the I (INTRA) images. It is also possible to use the amplitude of the movements or the movement-compensation error, known as DFD (Displaced Frame Difference), for determining M and N.