The invention is based on a process for the motion-compensated prediction of moving image sequences using motion vectors which, for each image block of a current image, indicate the position of the image block used for the prediction in relation to an already transmitted reference image.
EP 0 558 922 B2 has disclosed a process for improving motion estimation in image sequences in half pel precision in accordance with the full search process. In this document, in a first process step, the search range is filtered and in a second process step, the match block is filtered with the aid of an additional digital filter which permits a raster shifting of the pixel raster by ¼ pel. This measure prevents a distortion of the motion vector field.
"MPEG-4 Video Verification Model Version 8.0", Stockholm, July 1997, MPEG 97/N1796 in ISO/IEC JTC1/SC 29/WG11 specifies an encoder and decoder for object-based encoding of moving image sequences. In an accompanying Working Draft Version 4.0, a decoder is specified under MPEG 97/N1797. In a video session (VS), rectangular images of a fixed size are no longer encoded and transmitted to the receiver; instead, so-called VIDEO OBJECTS (VO) are encoded and transmitted, which are permitted to have an arbitrary shape and size. These video objects can then be divided further into different video object layers (VOL) in order, for example, to represent different resolution stages of a video object. The image data of a particular layer in the camera image plane at a particular time is referred to as a VIDEO OBJECT PLANE (VOP). Consequently, the relation between VO and VOP is equivalent to the relation between the image sequence and image in the case of the transmission of rectangular images of a fixed size.
The motion-compensated prediction in the verification model is carried out with the aid of so-called block motion vectors which, for each 8×8 or 16×16-sized block of pixels of the current image, indicate the position of the block used for the prediction in an already transmitted reference image. The amplitude resolution of the motion vectors is thereby limited to a half pixel, wherein pixels between those of the scanning raster (half pixel position) are generated from the pixels on the scanning raster (integer pixel position) by means of a bilinear interpolation filtration (FIG. 1). In this connection, the + symbol indicates the integer pixel position and O indicates the half pixel position. The interpolated values a, b, c, and d in the half pixel position are produced by the following relations:
a=A, b=(A+B)//2, c=(A+C)//2,
d=(A+B+C+D)//4, wherein // indicates a rounded integer division.
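The relations above can be illustrated with a minimal sketch. The function name `half_pel_samples` is hypothetical, and the rounding rule is an assumption: the text states only that // is a rounded integer division, so rounding half up for non-negative pixel values is assumed here.

```python
def half_pel_samples(A, B, C, D):
    """Bilinear half-pel values for the 2x2 integer-pel neighbourhood
    A B
    C D
    Assumption: '//' in the text rounds half up (non-negative inputs)."""
    a = A                          # integer pixel position, copied unchanged
    b = (A + B + 1) >> 1           # horizontal half pixel position
    c = (A + C + 1) >> 1           # vertical half pixel position
    d = (A + B + C + D + 2) >> 2   # diagonal half pixel position
    return a, b, c, d
```

For example, `half_pel_samples(10, 20, 30, 40)` yields the four values a, b, c and d for one 2×2 neighbourhood of the scanning raster.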
DE-197 30 305.6 has proposed a process for generating an improved image signal with an improved quality of the prediction signal and consequently an improved encoding efficiency. In order to generate pixels between those of the pixel scanning raster, a larger local area is taken into consideration than in bilinear interpolation. The aliasing-reducing interpolation filtration results in an increased resolution of the motion vector and thereby in a prediction gain and an increased encoding efficiency. The FIR filter coefficients in this instance can be adapted to the signals to be encoded and can be separately transmitted for each video object, which permits an additional efficiency increase for the encoding and increases the flexibility of the process. In contrast to the embodiment according to EP 0 558 922 B1, no additional poly-phasic filter structures have to be designed for intermediary positions with ¼ pel pixel resolution in the horizontal and vertical direction.
It is also possible in this instance that with a constant data rate, the image sequence frequency of an MPEG-1 encoder can be doubled from 25 Hz to 50 Hz. In an MPEG-2 encoder, the data rate can be reduced by up to 30% while the image quality remains constant.
According to the invention the process for motion-compensated prediction of moving image sequences uses motion vectors, which, for each image block of a current image, indicate the position of the image block used for the prediction in comparison to an already transmitted reference image,
wherein an aliasing-reducing interpolation filtration with a sub-pel precision is used for determination of motion vectors, wherein more adjacent pixels are accessed for interpolation than in a bilinear interpolation, and
wherein the interpolation filtration is carried out as a function of the position of an intermediary pixel value to be interpolated so that maximally, a block of the reference image containing (M+1)×(M+1) pixels must be accessed for the prediction of an image block of M×M pixels for filtration, or in the prediction of an image block of M×M pixels, the pixels that are required for the interpolation filtration and are disposed outside an (M+1)×(M+1) image block of the reference block are generated by reflecting the pixels disposed inside the reference block to a block edge.
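The reflection alternative can be sketched in one dimension as follows. This is an illustration under stated assumptions, not the filter of the invention: the names `reflect_index` and `half_pel_row` are hypothetical, the 4-tap coefficients in the usage note are made up for the example (the text gives no coefficients), and mirroring about the edge pixel itself is assumed, since the text does not fix the exact mirroring convention. Clipping to the valid pixel range is omitted.

```python
def reflect_index(i, n):
    """Map any integer index onto 0..n-1 by mirroring about the edge pixels
    (assumed convention: the edge pixel itself is not repeated)."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period
    return i if i < n else period - i


def half_pel_row(row, taps):
    """Interpolate the half-pel value between each pair of neighbours in
    `row` with an even-length FIR filter, reading only the pixels in `row`.
    Samples the filter would need outside the row are taken by reflection."""
    n = len(row)
    half = len(taps) // 2
    total = sum(taps)              # normalisation; assumed positive
    out = []
    for x in range(n - 1):         # half-pel position between x and x+1
        acc = 0
        for k, t in enumerate(taps):
            src = x - half + 1 + k # filter support centred on x + 1/2
            acc += t * row[reflect_index(src, n)]
        out.append((acc + total // 2) // total)  # rounded division
    return out
```

With a row of M+1 = 5 reference pixels and the hypothetical taps `[-1, 5, 5, -1]`, `half_pel_row([0, 8, 16, 24, 32], [-1, 5, 5, -1])` produces M = 4 half-pel values without reading any pixel outside the 5-pixel row.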
The decisive advantage of the interpolation filtration according to the invention over the previously disclosed processes is the clearly reduced complexity of implementing the process, particularly in terms of the memory bandwidth required for accessing reference images during motion-compensated prediction. Whereas the memory bandwidth previously required was up to 3 times higher than that required for bilinear interpolation at ½ pixel amplitude resolution of the motion vectors, in the process according to the invention it is identical to that of bilinear interpolation. At the same time, the advantages of the improved process according to DE 197 30 305.6 are retained, namely a more efficient encoding of the moving image sequence.
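The bandwidth comparison can be made concrete with a short calculation. It assumes a separable N-tap filter applied without any restriction, so that N-1 extra rows and columns of the reference image are read per M×M block. The function names and the example values M = 8, N = 8 are illustrative assumptions and are not taken from the text.

```python
def reference_pixels(M, N):
    """Reference pixels read for one MxM block with an unrestricted,
    separable N-tap interpolation filter (assumed access pattern)."""
    return (M + N - 1) ** 2


def bilinear_pixels(M):
    """Reference pixels read for one MxM block with bilinear (2-tap)
    interpolation: an (M+1)x(M+1) block."""
    return (M + 1) ** 2


# Illustrative values only: an 8x8 block and a hypothetical 8-tap filter.
ratio = reference_pixels(8, 8) / bilinear_pixels(8)   # 225 / 81
```

Under these assumptions the unrestricted filter reads 225 pixels against 81 for bilinear interpolation, a factor of roughly 2.8, which is in line with the "up to 3 times" figure stated above. Restricting access to the (M+1)×(M+1) block brings the count back to 81.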
The difference between the two alternatives, namely carrying out the interpolation filtration as a function of the position of the intermediary pixel value to be interpolated so that at most a block of the reference image containing (M+1)×(M+1) pixels must be accessed, and generating the required pixels by reflecting pixels disposed inside the reference block at a block edge, lies essentially in that the hardware realization of the reflection option is simpler than that of the former option. On the other hand, the former option permits the use of specially developed asymmetrical filters.
FIR filters with N stages are used for the interpolation filtration according to the invention.
The number of pixels of the reference image required for the prediction of an image block is reduced with the process according to the invention. This reduces the complexity of the process according to DE 197 30 305.6, in particular to that of the originally used bilinear interpolation filtration at ½ pixel amplitude resolution of the motion vectors, without significantly impairing the gains that can be achieved by means of the improved process.
The difference between the options of claims 1 and 2 lies essentially in that with the option according to claim 2, the hardware realization of the process is simpler than in the option according to claim 1. On the other hand, the option according to claim 1 permits the use of specially developed asymmetrical filters.