Most applications which require video display work with encoded video data. After decoding, these data are often available in a format which is not compatible with the desired display format or composition format. It is thus necessary, in most cases, to perform a format conversion on the decoded video data before displaying the corresponding image or performing the image composition. This format conversion is applied to the complete image and is generally costly in processing time and memory space, since it involves successive additions and multiplications for each pixel of the image.
For example, decoding a binary video data stream conforming to the H.263 standard yields output in the 4:2:0, Y U V format. The Java graphics interface libraries (AWT) provide APIs (Application Programming Interfaces) for image formats based on the 4:4:4, R G B format. Thus, the use of an “applet” (a Java application loaded via the Internet) for such a stream requires that the images in the 4:2:0, Y U V format be converted into images in the 4:4:4, R G B format.
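The per-pixel cost of such a conversion can be illustrated by the following minimal sketch. It assumes planar input, 8-bit samples and full-range BT.601 coefficients; the function names and the choice of coefficients are illustrative assumptions and are not taken from the H.263 standard or from the AWT libraries.

```python
def clamp8(value):
    """Round a value and clamp it to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def yuv420_to_rgb444(y_plane, u_plane, v_plane):
    """Convert a planar 4:2:0 Y U V image into a 4:4:4 R G B image.

    y_plane is a height x width list of rows; u_plane and v_plane are
    (height / 2) x (width / 2), one chrominance sample per 2x2 luminance
    samples. Coefficients are full-range BT.601 (an assumption).
    """
    height = len(y_plane)
    width = len(y_plane[0])
    rgb = []
    for row in range(height):
        out_row = []
        for col in range(width):
            y = y_plane[row][col]
            # Each chrominance sample is shared by a 2x2 group of pixels.
            u = u_plane[row // 2][col // 2] - 128
            v = v_plane[row // 2][col // 2] - 128
            # Several multiplications and additions for every pixel.
            r = clamp8(y + 1.402 * v)
            g = clamp8(y - 0.344136 * u - 0.714136 * v)
            b = clamp8(y + 1.772 * u)
            out_row.append((r, g, b))
        rgb.append(out_row)
    return rgb
```

Every pixel of the image passes through the inner loop, which is why the conversion cost grows with the full image size regardless of how little of the image has actually changed.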
The term image as used hereinafter applies to any type of image (frame, biframe, etc.), regardless of the type of scanning.
The expression decoding domain will refer to anything concerning the reception of the coded data by the decoder and their decoding, and the expression display domain will refer to anything concerning the use of the decoded data for their composition and their display. The conversion process consists in switching from the decoding domain to the display domain. The decoding processes customarily use a predictive temporal mode in which images are predicted from preceding or succeeding images. This is the case, for example, in the MPEG 1, MPEG 2, MPEG 4, H.261 and H.263 standards. In these standards, an image of P type (predictive) is predicted from a preceding image of I type (intra) or of P type, and an image of B type (bi-directional) is predicted from a preceding image of I or P type and from a succeeding image of I or P type.
In one example, in respect of the coding of an image block in an image, the preceding image is reconstructed and a motion estimation is performed to determine, in this reconstructed image, the block best correlating with the image block to be coded. The reconstructed image is then motion compensated employing the motion vector corresponding to this estimation so as to provide the predicted block. The predicted block is subtracted from the current block to provide a block called residue, which is coded and transmitted.
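The coding steps just described (motion estimation, motion compensation, subtraction) can be sketched as follows. This is only an illustration under stated assumptions: an exhaustive search over a small window, the sum of absolute differences as the correlation criterion, and whole-pixel motion vectors. None of these choices, nor the function names, come from the text.

```python
def get_block(image, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two blocks of equal size."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def estimate_motion(reference, current_block, top, left, search_range):
    """Exhaustive search for the best-matching block in the reference image."""
    size = len(current_block)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            # Skip candidate blocks that fall outside the reference image.
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue
            cost = sad(get_block(reference, t, l, size), current_block)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def encode_block(reference, current, top, left, size, search_range):
    """Return the motion vector and the residue for one block."""
    current_block = get_block(current, top, left, size)
    dy, dx = estimate_motion(reference, current_block, top, left, search_range)
    # Motion compensation: fetch the predicted block from the reference image.
    predicted = get_block(reference, top + dy, left + dx, size)
    residue = [[c - p for c, p in zip(row_c, row_p)]
               for row_c, row_p in zip(current_block, predicted)]
    return (dy, dx), residue
```

When the current block is an exact displaced copy of a reference block inside the search window, the residue is entirely zero and only the motion vector carries information.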
The decoding process consists in calculating the predicted blocks by reconstructing the preceding images and in adding thereto the blocks of residue transmitted from the current image.
In the case of images of B or P type, the blocks are predicted from the preceding reference image and, for the B type, also from the succeeding reference image. These reference images are reconstructed at the decoder level and the predicted block is calculated from these images and the motion vectors transmitted in the data stream. The residue block transmitted in the data stream is decoded then added to the predicted block defined by the associated motion vector so as to provide the reconstituted image block in the image.
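The reconstruction of a block at the decoder can be sketched in the same spirit. The sketch assumes 8-bit samples, whole-pixel motion vectors and, for the B type, a rounded average of the forward and backward predictions; the averaging rule and all function names are assumptions made for illustration.

```python
def predict_block(reference, top, left, size, motion_vector):
    """Fetch the predicted block from a reconstructed reference image."""
    dy, dx = motion_vector
    return [row[left + dx:left + dx + size]
            for row in reference[top + dy:top + dy + size]]


def average_predictions(forward, backward):
    """Combine forward and backward predictions (B type), rounding up halves."""
    return [[(f + b + 1) // 2 for f, b in zip(row_f, row_b)]
            for row_f, row_b in zip(forward, backward)]


def reconstruct_block(residue, prediction):
    """Add the decoded residue to the prediction and clamp to 0..255."""
    return [[max(0, min(255, p + r)) for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, residue)]
```

For a P type block, `reconstruct_block` would be applied directly to the output of `predict_block`; for a B type block, the two predictions would first be combined with `average_predictions`.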
FIG. 1 very schematically represents the data decoding and conversion process.
The video data pertaining to the reference images are received by a temporal prediction circuit 1, which provides an adder 3 with the predicted images. The video data pertaining to the current image are received by a decoding circuit 2, which provides the adder 3 with decoded images. The data output by the adder 3, which correspond to the reconstituted image, are transmitted to a format conversion circuit 4, which converts the images so as to transmit them to a display or to an image composition circuit.
The structure to which the various data compression operations are applied, in the MPEG standard, is the macroblock. The pixels are grouped into image blocks, for example 8×8 pixels in size, four luminance blocks and the corresponding chrominance blocks constituting the macroblock. If the image format, during coding, is 4:2:0, Y, Cr, Cb, the macroblock consists of four luminance blocks and two chrominance blocks. In the predictive temporal mode, each macroblock has its own decision mode. In other words, the coding mode is decided for each macroblock. It may be a coding of intra type, for which no prediction is used, or a coding of predictive type using a backward, forward (as these terms are used in the standard) or bi-directional motion vector. A macroblock of an image of P type can be coded in intra mode while the succeeding macroblock can be coded in inter mode using motion compensation employing a reference image.
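The composition of a macroblock in the 4:2:0 format can be made concrete with a short sketch. It assumes planar storage and a default block size of 8×8 pixels, so that a macroblock covers 16×16 luminance pixels; the function name, the dictionary layout and the `block_size` parameter are hypothetical.

```python
def macroblock_blocks(y_plane, cb_plane, cr_plane, mb_row, mb_col, block_size=8):
    """Return the four luminance blocks and two chrominance blocks of one
    macroblock of a planar 4:2:0 image (chrominance planes are half size)."""
    def block(plane, top, left):
        return [row[left:left + block_size]
                for row in plane[top:top + block_size]]

    mb_size = 2 * block_size
    top, left = mb_row * mb_size, mb_col * mb_size
    # Four luminance blocks: top-left, top-right, bottom-left, bottom-right.
    luma = [block(y_plane, top + dy, left + dx)
            for dy in (0, block_size)
            for dx in (0, block_size)]
    # In 4:2:0, one chrominance block per component covers the same area.
    c_top, c_left = mb_row * block_size, mb_col * block_size
    return {
        "Y": luma,
        "Cb": block(cb_plane, c_top, c_left),
        "Cr": block(cr_plane, c_top, c_left),
    }
```

Since the coding mode is decided per macroblock, a decoder would typically attach the mode (intra, forward, backward or bi-directional) to each such group of six blocks.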
Other modes of compression, which are not necessarily standardised, are based on calculations pertaining to pixel groups which are not image blocks as they are described in the MPEG standard. The prediction modes may be based on regions obtained by segmenting the image according to homogeneity criteria.
The invention applies to these pixel groups, referred to hereinbelow as gop. These may therefore be macroblocks, image blocks or else small complex structures such as connected regions. The coding mode is decided independently for each gop, which may be coded independently or by employing preceding and/or succeeding images.
An aim of the proposed invention is to alleviate the drawbacks described previously.