The present invention is related to motion image compression, and in particular to a method, hardware, and software for motion compensated interframe transform coding, e.g., under the constraint of constant delay.
Motion compensated interframe coding is used in common transform coding methods, such as MPEG-1, MPEG-2, H.261, H.262, H.263, H.264/AVC (also called MPEG-4 Part 10 and JVT), SMPTE VC-1 (similar to WM9V/Windows Media 9 Video), and so forth. For motion video, consider a sequence of pictures, e.g., video frames. Interframe coding includes dividing each input picture in the sequence into blocks. The following operations are performed for each block: generating a difference picture by subtracting from a newly accepted picture a motion compensated estimate of the previous picture in the sequence, the estimate being generated from quantized transform coefficients for the previous picture in what is called herein a prediction loop. The difference picture is transformed to generate transform coefficients, and the transform coefficients are quantized and scaled to form the quantized transform coefficients. Motion vectors are determined for motion estimation, and used in the prediction loop. The quantized transform coefficients and the motion vectors are entropy coded, e.g., variable length coded, for delivery.
FIG. 1 shows such a prior-art coder 100. A difference picture is generated by subtracting from an input picture an estimate of the previous picture in the sequence from a prediction loop 111. In block 105, the difference picture is transformed by a DCT or a DCT-like transform, and the transform coefficients are quantized and scaled. It is these quantized coefficients corresponding to the difference picture that are entropy coded by a variable length coder (VLC) for delivery. These quantized coefficients also are the input to the prediction loop 111 where, in block 107, the coefficients are scaled, de-quantized, and inverse transformed to produce a reconstructed difference picture. The reconstructed difference picture is added to the previous estimated picture to produce an estimate of the present picture. In some coders, e.g., coders that conform to the H.264/AVC standard, a de-blocking filter 109 is included in the prediction loop to filter block edges resulting from the prediction and residual difference coding stages of the decoding process. The filtering is applied on block boundaries. The filter coefficients or “strength” of the filter 109 are governed by a content adaptive non-linear filtering method. The de-blocking filtered estimated image is delayed in a frame delay 113 to produce the next previous image estimate. A motion compensated predictor 115 motion compensates the previous picture estimate based on motion vectors determined by a motion estimation block 117. The result is the estimate of the previous picture in the sequence.
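The loop structure described above can be illustrated with a deliberately simplified sketch. It assumes zero motion vectors, an identity transform in place of the DCT, and a uniform scalar quantizer, and it omits the de-blocking filter and entropy coding; the function names and the `step` parameter are hypothetical and are not taken from FIG. 1. It shows only that the coder forms its prediction from the reconstructed (quantized) picture, so that a decoder given the same quantized values reproduces the same estimate.

```python
def quantize(value, step):
    """Uniform scalar quantizer (assumed): map a value to an integer level."""
    return round(value / step)


def dequantize(level, step):
    """Inverse of the assumed quantizer: map a level back to a value."""
    return level * step


def encode_sequence(pictures, step):
    """Code each picture as quantized differences from the previous estimate.

    Returns the quantized levels per picture and the coder's own
    reconstructions (the contents of the frame delay after each picture).
    """
    previous_estimate = [0.0] * len(pictures[0])  # frame delay starts empty
    coded, reconstructed = [], []
    for picture in pictures:
        # Difference picture: input minus the estimate of the previous picture.
        difference = [x - p for x, p in zip(picture, previous_estimate)]
        levels = [quantize(d, step) for d in difference]
        # Prediction loop: de-quantize and add back to the previous estimate.
        recon_diff = [dequantize(q, step) for q in levels]
        current_estimate = [p + r for p, r in zip(previous_estimate, recon_diff)]
        coded.append(levels)
        reconstructed.append(current_estimate)
        previous_estimate = current_estimate  # one-picture delay
    return coded, reconstructed


def decode_sequence(coded, step):
    """Rebuild the pictures from the quantized levels alone."""
    previous = [0.0] * len(coded[0])
    decoded = []
    for levels in coded:
        current = [p + dequantize(q, step) for p, q in zip(previous, levels)]
        decoded.append(current)
        previous = current
    return decoded
```

Because the prediction is built from de-quantized values rather than from the original input, the coder and the decoder stay in step, and the quantization error does not accumulate from picture to picture in this sketch.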
We have found that the estimate of the previous picture in the sequence made from the previous quantized and reconstructed picture may contain a significant amount of quantization noise. This is especially true for quantized and reconstructed pictures of high complexity, high motion scenes, with quantization according to relatively large quantization scales. This leads to noise embedded in the difference picture. This noise, which, like the underlying noise-free difference picture, is transformed and coded, results in wasted bandwidth.
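The effect can be seen in a small numerical sketch, assuming a static scene, an identity transform, and a uniform scalar quantizer; the function names and sample values are hypothetical. For a static scene the noise-free difference picture is zero, so whatever remains in the difference picture is the quantization noise of the previous reconstruction.

```python
def reconstruct(picture, step):
    """Quantize and de-quantize a picture with an assumed uniform step size."""
    return [round(x / step) * step for x in picture]


def difference_energy_static_scene(picture, step):
    """Energy of the difference picture when the next input equals `picture`.

    The ideal difference is zero for a static scene, so the returned energy
    is due entirely to quantization noise in the previous reconstruction.
    """
    estimate = reconstruct(picture, step)
    difference = [x - e for x, e in zip(picture, estimate)]
    return sum(d * d for d in difference)


picture = [3, 7, 13, 22, 41]
fine = difference_energy_static_scene(picture, 1)    # no quantization loss
medium = difference_energy_static_scene(picture, 2)
coarse = difference_energy_static_scene(picture, 8)
```

For these sample values the residual energy is zero with a step of 1 and grows as the step increases, which is consistent with the observation that large quantization scales embed more noise in the difference picture. The growth is not guaranteed to be monotonic for every input.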
There thus is a need in the art for a method and for code and hardware to reduce or eliminate the noise.
Furthermore, in picture sequence coding, e.g., video coding, there is a need to operate under the constraint of a constant delay, so that synchronization is easier to maintain. Furthermore, it is important that such a constant delay be relatively small, e.g., as small as approximately one or a few video lines. This is particularly important in two-way video and voice communication, e.g., videoconferencing. Thus there is a need in the art for a method and for code and hardware to reduce or eliminate the noise in the difference picture under the constraint of constant delay.
Temporal noise reduction filtering, such as temporal recursive filtering (that is, recursive filtering from picture to picture, e.g., frame to frame in the time domain), is known, and has proven to be an efficient noise reduction scheme that is used in video coding to preprocess the input moving picture sequence (the input video) prior to encoding. Such temporal filtering often is also used to post-process the output video prior to display.
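One common form of temporal recursive filtering is a first-order recursion applied independently to each pixel position. The sketch below assumes that form; the blending coefficient `alpha` and the function name are hypothetical, and the text above does not specify which recursion a given coder uses.

```python
def temporal_recursive_filter(pictures, alpha):
    """First-order recursive filter along the time axis (assumed form).

    Each output pixel is alpha * input + (1 - alpha) * previous output at the
    same position. The first picture is passed through unchanged.
    """
    filtered = []
    previous = None
    for picture in pictures:
        if previous is None:
            current = [float(x) for x in picture]
        else:
            current = [alpha * x + (1.0 - alpha) * y
                       for x, y in zip(picture, previous)]
        filtered.append(current)
        previous = current  # the recursion needs the previous filtered picture
    return filtered


# A single pixel with a true value of 100 and alternating noise.
noisy = [[100], [104], [96], [100]]
smoothed = temporal_recursive_filter(noisy, alpha=0.5)
```

A smaller `alpha` averages over more past pictures and removes more noise, at the cost of smearing moving content, which is why such filters are often made motion adaptive in practice.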
The prior art motion compensated predictive coder 100 shown in FIG. 1 includes a temporal recursive filter 103 to filter the input picture sequence. The coder 100 is usable, for example, for MPEG-2 and for H.264/AVC coding, depending on such details as the type of transform used, the block size, the quantization, whether or not the de-blocking filter is present, and other details. Unfortunately, the temporal filter 103 introduces an additional delay of one picture sequence interval, e.g., one frame interval, for one or both of the input and the output image, depending on whether such a filter is used at the input, the output, or both. This additional delay of 33 ms or 66 ms is generally acceptable for broadcast-quality TV applications. However, it is not acceptable for such applications as videoconferencing, where additional delay degrades the communication between participants.
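The delay figures follow from simple arithmetic, sketched below under the assumption of a frame rate of about 30 frames per second, which is consistent with the 33 ms figure but is not stated explicitly above. The comparison with a delay of a few video lines additionally assumes 525 lines per frame; both values and the function names are illustrative only.

```python
def frame_delay_ms(filter_stages, frame_rate_hz):
    """Added delay when each temporal filter stage costs one frame interval."""
    return filter_stages * 1000.0 / frame_rate_hz


def line_delay_ms(lines, lines_per_frame, frame_rate_hz):
    """Delay of a given number of video lines, for comparison."""
    return lines * 1000.0 / (frame_rate_hz * lines_per_frame)


one_stage = frame_delay_ms(1, 30)     # filter at the input or the output
two_stages = frame_delay_ms(2, 30)    # filter at both input and output
few_lines = line_delay_ms(3, 525, 30) # hypothetical three-line delay
```

Under these assumptions one filter stage adds roughly 33 ms and two stages roughly 66 ms, whereas a delay of a few video lines is well under one millisecond.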
Thus there is a need in the art for a method and for code and hardware to reduce or eliminate the noise in the difference picture under the constraint of constant delay, with the delay being relatively small so that, for example, two-way videoconferencing communication is not hampered.