1. Field of the Invention
The invention relates to a method and an apparatus for video coding, in which a predicted picture is produced by averaging reference pictures.
2. Description of the Related Art
Digital video data are generally compressed for storage or transmission, to significantly reduce the enormous volume of data. The compression is effected both by eliminating the signal redundancy contained in the video data and by eliminating the irrelevant signal parts that are imperceptible to the human eye. This is generally achieved by a hybrid coding method in which the picture to be coded is firstly predicted temporally and the remaining prediction error is subsequently transformed into the frequency domain, for example by a discrete cosine transformation, and is quantized there and coded by a variable length code. The motion information and the quantized spectral coefficients are finally transmitted.
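By way of illustration only, the hybrid coding steps described above can be sketched for a single square block as follows. The function names, the orthonormal scaling of the transform, and the uniform quantizer with a single step size are assumptions made for this sketch and are not taken from any standard. The variable length coding stage is omitted.

```python
import math

def prediction_error(current, predicted):
    """Residual that remains after temporal prediction (current minus predicted)."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]

def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of a square block (direct, unoptimized)."""
    n = len(block)

    def scale(k):
        # Normalization factor of the orthonormal DCT-II.
        return math.sqrt((1 if k == 0 else 2) / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs

def quantize(coeffs, step):
    """Uniform quantization: map each coefficient to the nearest integer level."""
    return [[round(c / step) for c in row] for row in coeffs]

def code_block(current, predicted, step):
    """Residual, transform, quantization; entropy coding is left out of this sketch."""
    return quantize(dct_2d(prediction_error(current, predicted)), step)
```

In this sketch, a block whose prediction misses by a constant offset yields a single non-zero level in the lowest-frequency position, which is the kind of compact representation the transform stage is intended to provide.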
The performance of the hybrid coding method depends very significantly on the quality of the temporal prediction. The better this prediction of the next picture information to be transmitted, the smaller the prediction error that remains after the prediction, and the lower the data rate that has to be subsequently expended for the coding of this error. A significant task in the compression of video data is obtaining the most exact prediction possible of the picture to be coded from the picture information that has already been transmitted beforehand.
The prediction of a picture has been effected heretofore by firstly dividing the picture into regular portions, typically square blocks with a size of 8×8 or 16×16 pixels, and subsequently determining, for each of these picture blocks, a prediction from the picture information already known in the receiver, by motion compensation. In this context, it is possible to distinguish between two basic cases of prediction:
unidirectional prediction: in this case, the motion compensation is effected exclusively on the basis of the previously transmitted picture and leads to so-called “P frames”.
bi-directional prediction: in this case, the picture is predicted by superposition of two pictures, one temporally preceding and the other temporally succeeding, which leads to so-called “B frames”. It should be noted that both reference pictures must already have been transmitted.
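The block-wise, unidirectional case described above can be sketched as follows. This is a minimal illustration under stated assumptions: motion vectors are whole-pixel displacements given as (dy, dx), they are supplied per block in a dictionary keyed by block row and block column, and reference positions outside the picture are clamped to the nearest edge pixel. None of these details, nor the function names, are prescribed by the text.

```python
def fetch_block(reference, top, left, size, motion_vector):
    """Copy a size x size block from the reference picture, displaced by the motion vector."""
    dy, dx = motion_vector
    height, width = len(reference), len(reference[0])
    block = []
    for y in range(top + dy, top + dy + size):
        yc = min(max(y, 0), height - 1)      # assumption: clamp at picture edges
        row = []
        for x in range(left + dx, left + dx + size):
            xc = min(max(x, 0), width - 1)   # assumption: clamp at picture edges
            row.append(reference[yc][xc])
        block.append(row)
    return block

def predict_picture(reference, motion_vectors, block_size):
    """Build a predicted picture block by block from one previously transmitted picture."""
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            mv = motion_vectors[(top // block_size, left // block_size)]
            block = fetch_block(reference, top, left, block_size, mv)
            for r in range(block_size):
                for c in range(block_size):
                    # Skip positions of partial blocks that fall outside the picture.
                    if top + r < height and left + c < width:
                        predicted[top + r][left + c] = block[r][c]
    return predicted
```

For the bi-directional case, the same routine would be applied once to the preceding and once to the succeeding reference picture, and the two results would then be combined.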
In the last-mentioned case, the actual prediction value must be calculated by averaging the two reference pictures. In the previously standardized methods MPEG-1 to MPEG-4 and H.263, an equal-weight averaging is always carried out for this purpose, that is to say that the two possible predictions are added and the resulting sum is then halved.
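The equal-weight averaging can be sketched as follows for two motion-compensated predictions of the same block. The function name is hypothetical, and the rounding offset of one added before halving is an assumption for integer sample values; the text itself states only that the sum is halved.

```python
def average_prediction(pred_preceding, pred_succeeding):
    """Equal-weight bi-directional prediction: add the two predictions and halve the sum."""
    if len(pred_preceding) != len(pred_succeeding):
        raise ValueError("predictions must have the same number of rows")
    averaged = []
    for row_a, row_b in zip(pred_preceding, pred_succeeding):
        if len(row_a) != len(row_b):
            raise ValueError("predictions must have the same row length")
        # Assumption: add 1 before the shift so that halves round upward.
        averaged.append([(a + b + 1) >> 1 for a, b in zip(row_a, row_b)])
    return averaged
```

Both predictions contribute with the fixed weight of one half, regardless of how well each reference picture matches the picture to be coded.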