The standard H.263 for very low bitrate video coding, described for instance in "ITU standardisation of very low bitrate video coding algorithms", K. Rijkse, Signal Processing: Image Communication, 7(1995), pp.553-565, is based on a hybrid video coding method dealing with macroblock structured pictures and using techniques such as the DCT (Discrete Cosine Transform), to reduce the spatial redundancy, motion estimation and interpicture prediction, to reduce the temporal redundancy, and finally quantization and variable length entropy coding (as also provided in the case of the MPEG-2 standard).
The maximum bitrate for this standard H.263 is about 20 kbits/s for videophone and an integer multiple of 64 kbits/s (such as 64, 128, 256, ...) for video conference. At these very low bitrates, various kinds of solutions, for instance temporal sub-sampling, are often used in order to reduce the transmitted bitrate. These solutions must however be implemented without degrading the picture quality.
A block diagram of the standard H.263 encoder is shown in FIG. 1. The input bitstream IB corresponding to the images to be coded is received by the first (positive) input of a subtracter 11. This subtracter is followed in series by an orthogonal transform device such as a DCT circuit 12, a quantizer 13 (Q), a variable length coding (VLC) circuit 14, a video multiplexer 15 (MUX), and an output buffer 16 that yields an output bitstream OB. An interpicture prediction loop, provided between the output of the quantizer 13 and the second (negative) input of the subtracter 11, comprises in series an inverse quantizer 17 (Q⁻¹), an inverse DCT circuit 18 (DCT⁻¹), an adder 19, and a prediction circuit 20. The output of the prediction circuit 20 is sent to the second input of the subtracter 11 and is also sent back to the second input of the adder 19, for the reconstitution of a complete image at the output of said adder.
The output of the adder 19 is sent to a motion estimator 21 that also receives the input bitstream IB and yields motion vectors MV. These vectors are then coded by a second VLC circuit 22 and sent to the multiplexer 15 for transmission (or storage). A decision circuit 23, provided between the output buffer 16 and the prediction circuit 20, makes it possible to choose between an intra coding mode, which concerns only the first picture of the video sequence, coded without temporal prediction, and an inter coding mode, according to which all the remaining pictures are coded using prediction.
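The signal path through the subtracter 11, the DCT circuit 12, the quantizer 13 and the prediction loop (17, 18, 19) can be illustrated by a minimal sketch. It is not the H.263 algorithm itself: it assumes a one-dimensional block of 8 samples instead of 8x8 blocks, a plain uniform quantizer with a hypothetical step `qstep`, and it leaves out motion estimation, variable length coding and buffering. The names `dct`, `idct` and `encode_block` are illustrative only.

```python
import math

N = 8  # block length; 1-D here for brevity (assumption, H.263 works on 8x8 blocks)


def _scale(k):
    """Normalisation factor that makes the transform orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct(x):
    """Orthonormal DCT-II of a length-N sequence."""
    return [
        _scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n in range(N))
        for k in range(N)
    ]


def idct(c):
    """Inverse of dct()."""
    return [
        sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k in range(N))
        for n in range(N)
    ]


def encode_block(block, prediction, qstep):
    """Return (quantized levels, reconstructed block) for one block.

    Intra mode corresponds to an all-zero prediction; inter mode uses the
    (motion compensated) previous reconstruction as prediction.
    """
    residual = [b - p for b, p in zip(block, prediction)]   # subtracter 11
    levels = [round(c / qstep) for c in dct(residual)]      # DCT 12, quantizer 13
    decoded = idct([level * qstep for level in levels])     # Q^-1 17, DCT^-1 18
    recon = [p + d for p, d in zip(prediction, decoded)]    # adder 19
    return levels, recon
```

With an all-zero prediction the block is coded in intra mode and every coefficient carries information; when the prediction already matches the block, all quantized levels are zero, which is why inter pictures generally need far fewer bits.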
As the intra pictures are coded without any reference to any previous picture, each of them needs from 4 to 10 times (depending on the scene content and on the average quantization parameter) the number of bits necessary to code the subsequent pictures in inter mode. The following table (Table 1) illustrates, for some well-known test sequences in CIF format (288 lines of 352 pixels), the difference in terms of bits between intra and inter modes:
CIF sequences     intra mode (bits)    inter mode (bits)
Miss America            35568                 3936
Claire                  37224                 3496
Renata                 149984                34736
Flower Garden          180456                63512
Foreman                 67736                13016
Teeny                   67344                38968
Interview              106320                11272
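The intra-to-inter ratio for each sequence can be computed directly from the figures of Table 1. The short sketch below does only that; the dictionary `TABLE_1` and the helper `intra_inter_ratio` are illustrative names, and the bit counts are copied from the table without any further assumption.

```python
# Bit counts copied from Table 1: (intra mode, inter mode), in bits.
TABLE_1 = {
    "Miss America": (35568, 3936),
    "Claire": (37224, 3496),
    "Renata": (149984, 34736),
    "Flower Garden": (180456, 63512),
    "Foreman": (67736, 13016),
    "Teeny": (67344, 38968),
    "Interview": (106320, 11272),
}


def intra_inter_ratio(intra_bits, inter_bits):
    """How many times more bits the intra picture needs than the inter picture."""
    return intra_bits / inter_bits


ratios = {name: intra_inter_ratio(*bits) for name, bits in TABLE_1.items()}

# Print the sequences from the largest ratio to the smallest.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:14s} {ratio:6.2f}")
```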
These numbers of bits, necessary to code the first picture in intra mode and the following picture in inter mode, show that the output buffer 16, necessary to transmit the output stream OB at constant bitrate, is heavily used during the intra coding. A buffer of sufficient capacity might be used in order to store an intra picture without any risk of overflow, but the delay of the encoder is directly proportional to the total number of bits of the first intra picture: the larger the number of bits of this picture, the longer the delay to empty the output buffer at the constant target bitrate concerned.
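The proportionality between the size of the intra picture and the delay can be made concrete with a small calculation. The sketch below assumes an initially empty buffer that is drained at exactly the constant target bitrate, and takes 1 kbit/s as 1000 bits/s; the function name `drain_delay_seconds` is hypothetical, and the Foreman figures are taken from Table 1.

```python
def drain_delay_seconds(picture_bits, bitrate_bps):
    """Time needed to empty a buffer holding picture_bits at a constant bitrate."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    return picture_bits / bitrate_bps


# Foreman sequence, Table 1 (bits).
FOREMAN_INTRA_BITS = 67736
FOREMAN_INTER_BITS = 13016

for bitrate in (20_000, 64_000):  # bits per second
    intra_delay = drain_delay_seconds(FOREMAN_INTRA_BITS, bitrate)
    inter_delay = drain_delay_seconds(FOREMAN_INTER_BITS, bitrate)
    print(f"{bitrate} bit/s: intra {intra_delay:.3f} s, inter {inter_delay:.3f} s")
```

At a given bitrate the delay scales linearly with the number of bits, so the first intra picture dominates the start-up delay of the encoder.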