The present invention relates to the compression of a digital video signal for such purposes as transmission over a communication channel or recording on magnetic media.
Digital video offers advantages such as versatile signal processing and error-free reproduction, but digitization of a video signal yields data in such large amounts as to quickly exhaust the capacity of communication channels or recording media. Before transmission or recording, the digitized video signal must therefore be compressed. Compression techniques based on orthogonal transforms, motion estimation, and variable-length encoding are well known.
A difficulty that confronts video compression schemes is that the degree of compression varies depending, for example, on the amount of redundancy in the signal. Since transmission or recording is carried out at a constant bit rate, it is necessary to generate encoded data at a constant bit rate over some suitable time scale. That is, it must be possible to divide the digital video signal into segments of equal length and encode each segment to a constant amount of compressed data.
One conventional method of obtaining a constant amount per segment is to provide a buffer memory in which the encoded data for each segment are stored prior to transmission or recording. As data are written in the buffer memory, the rate at which the buffer memory is filling up is monitored. If the buffer memory is filling too quickly, the quantization level of the data is reduced (the quantization is made coarser), thereby reducing the amount of encoded data generated. If the buffer memory is filling too slowly, the quantization level is raised.
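The buffer feedback described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any particular encoder: the function names, the tolerance band, the level range of 1 to 31, and the toy model in which a block produces `complexity * level` bits are all assumptions made for the example.

```python
def adjust_quantization_level(level, bits_in_buffer, bits_expected,
                              tolerance=0.1, min_level=1, max_level=31):
    """Return the quantization level to use for the next block.

    Assumption: a lower level means coarser quantization and less data.
    """
    if bits_in_buffer > bits_expected * (1 + tolerance):
        # Buffer filling too quickly: reduce the level to generate less data.
        return max(min_level, level - 1)
    if bits_in_buffer < bits_expected * (1 - tolerance):
        # Buffer filling too slowly: raise the level.
        return min(max_level, level + 1)
    return level


def encode_segment(block_complexities, segment_budget, start_level=16):
    """Simulate one segment; returns (bits written, level used per block)."""
    level = start_level
    written = 0
    levels_used = []
    n = len(block_complexities)
    for i, complexity in enumerate(block_complexities):
        written += complexity * level   # hypothetical bit-count model
        levels_used.append(level)
        expected = segment_budget * (i + 1) / n
        level = adjust_quantization_level(level, written, expected)
    return written, levels_used
```

Under these assumptions, `encode_segment([10] * 4, 400)` lowers the level after every block yet still writes 580 bits against a budget of 400, because the feedback can only react after the data have already been written to the buffer.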
One problem with this method is that it is prone to buffer overflow, which tends to occur in very active images or when the scene changes. Another problem is that varying the quantization level in this way tends to cause serious image degradation, particularly in images comprising a few high-contrast elements such as lines or edges disposed on a generally flat background. Degradation occurs because the high-contrast lines or edges generate a comparatively large amount of encoded data, requiring a reduction of the quantization level, but this in turn causes quantization noise that is highly visible and annoying on the flat background.
An alternative scheme is to predict, prior to quantization, the amount of encoded data that will be produced, and vary the quantization level accordingly. A variety of prediction schemes have been proposed, but they suffer from various problems. Prediction from the AC power level of the digital video signal, for example, leads to the same type of image degradation as described above.
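To make the AC power example concrete, the sketch below computes the AC power of a block as the mean squared deviation of its samples from the block mean (the DC level). That definition, the function name, and the 4 x 4 sample blocks are assumptions chosen for illustration; the text does not specify how a prediction scheme would map AC power to a quantization level, so no such mapping is shown.

```python
def ac_power(block):
    """Mean squared deviation of the samples from the block mean (DC level)."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


# A perfectly flat block has no AC power.
flat_block = [[128] * 4 for _ in range(4)]

# A single high-contrast vertical edge on an otherwise flat area.
edge_block = [[0, 0, 255, 255] for _ in range(4)]

print(ac_power(flat_block))   # 0.0
print(ac_power(edge_block))   # 16256.25
```

A single edge already produces a very high AC power, so a predictor driven by this measure would call for a reduced quantization level in exactly the kind of image where the resulting noise is most visible.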
The preceding problems of overflow and image degradation are two of the problems addressed by the present invention. Another problem concerns the motion estimation method of compressing a digital video signal comprising a luminance component (Y) and two chrominance components (color difference components: B-Y and R-Y). When the signal is digitized, these components are generally sampled at different rates, the 4:2:2 component ratio being one widely used standard. The resulting problem is that blocks of chrominance data cover a larger image area than blocks of luminance data, so their motion vectors are different. Providing independent motion estimation processors for the luminance and chrominance components is a computationally expensive solution to this problem, whereas using data blocks of different sizes for the luminance and chrominance components would lead to a discrepancy in encoding accuracy.
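The difference in image area can be illustrated with a short calculation. The sketch assumes 8 x 8 sample blocks, a block size chosen only for the example, and uses the fact that in 4:2:2 sampling each chrominance component is sampled at half the horizontal rate of the luminance component; the function name is hypothetical.

```python
def block_coverage(block_w, block_h, h_subsample, v_subsample):
    """Image area, in luminance pixel units, covered by a block of samples.

    h_subsample and v_subsample are the horizontal and vertical spacing
    between samples, measured in luminance pixels.
    """
    return block_w * h_subsample, block_h * v_subsample


# 4:2:2: luminance sampled at every pixel, chrominance at every second
# pixel horizontally and every line vertically.
luma_area = block_coverage(8, 8, 1, 1)     # (8, 8)
chroma_area = block_coverage(8, 8, 2, 1)   # (16, 8)

print(luma_area, chroma_area)
```

Under these assumptions a chrominance block spans twice the width of a luminance block, so the two blocks do not correspond to the same part of the image.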