1. Field of the Invention
The present invention relates to data compression.
2. Description of the Prior Art
Data compression techniques are used extensively in the data communications field in order to communicate data at bit rates that can be supported by communication channels having dynamically changing but limited bandwidths. Image data is typically compressed prior to transmission or to storage on an appropriate storage medium, and is decompressed prior to image reproduction.
In the case of still images, data compression techniques take advantage of spatial redundancy, whilst for moving images both spatial and temporal redundancy are exploited. Temporal redundancy arises in moving images because successive images in a temporal sequence, particularly images belonging to the same scene, can be very similar. The Moving Picture Experts Group (MPEG) has defined international standards for video compression encoding for entertainment and broadcast applications. The present invention is relevant to (though not at all restricted to) implementations of the MPEG-4 "Studio Profile" standard, which is directed to high-end video hardware operating at very high data rates (up to 1 Gbit/s) using low compression ratios.
Discrete Cosine Transform (DCT) quantisation is a widely used encoding technique for video data. It is used in image compression to reduce the length of the data words required to represent input image data prior to transmission or storage of that data. In the DCT quantisation process the image is segmented into regularly sized blocks of pixel values; typically each block comprises 8 horizontal pixels by 8 vertical pixels (8H×8V). In conventional data formats video data typically has three components, corresponding either to the red, green and blue (RGB) components of a colour image or to a luminance component Y along with two colour difference components Cb and Cr. A group of pixel blocks corresponding to all three RGB or YCbCr signal components is known as a macroblock (MB).
The DCT represents a transformation of an image from the spatial domain to the spatial frequency domain and effectively converts a block of pixel values into a block of transform coefficients of the same dimensions. The DCT coefficients represent spatial frequency components of the image block. Each coefficient can be thought of as a weight to be applied to an appropriate basis function, and a weighted sum of basis functions provides a complete representation of the input image. Each 8H×8V block of DCT coefficients has a single "DC" coefficient representing zero spatial frequency and 63 "AC" coefficients. The DCT coefficients of largest magnitude are typically those corresponding to the low spatial frequencies. Performing a DCT on an image does not in itself result in compression; it simply transforms the image data from the spatial domain to the spatial frequency domain. In order to achieve compression, each DCT coefficient is divided by a positive integer known as the quantisation divisor and the quotient is rounded up or down to the nearest integer. Larger quantisation divisors result in higher compression of the data at the expense of harsher quantisation, and harsher quantisation results in greater degradation in the quality of the reproduced image. Quantisation artefacts arise in the reproduced images as a consequence of the rounding up or down of the DCT coefficients. During compressed image reproduction each DCT coefficient is reconstructed by multiplying the quantised coefficient (rounded to the nearest integer), rather than the original quotient, by the quantisation divisor, which means that the original precision of the DCT coefficient is not restored. Thus quantisation is a "lossy" encoding technique.
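The lossy round trip described above can be sketched in a few lines of Python. This is an illustration of the rounding behaviour only, not an implementation of any particular codec; the coefficient value 137 and divisor 16 are arbitrary examples.

```python
def quantise(coeff, divisor):
    # Divide by the quantisation divisor and round to the nearest integer.
    return round(coeff / divisor)

def dequantise(level, divisor):
    # Reconstruction multiplies the rounded level, not the original quotient.
    return level * divisor

level = quantise(137, 16)      # 137 / 16 = 8.5625, rounded to 9
recon = dequantise(level, 16)  # 9 * 16 = 144
error = recon - 137            # the rounding error (7) cannot be recovered
```

The reconstructed value 144 differs from the original 137: the precision discarded by the rounding step is permanently lost, which is why quantisation is lossy.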
Image data compression systems typically use a series of trial compressions to determine the most appropriate quantisation divisor to achieve a predetermined output bit rate. Trial quantisations are carried out at, say, twenty possible quantisation divisors spread across the full available range of possible divisors. The two adjacent trial quantisation divisors that give projected output bit rates just above and just below the target bit rate are identified, and a refined search is carried out between these two values. Typically the quantisation divisor selected for performing the image compression will be the one that gives the least harsh quantisation yet allows the target bit rate to be achieved.
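A minimal sketch of this coarse-then-refined search is given below, assuming a divisor range of 1 to 255 and twenty coarse trials. The function `estimate_bits` is a hypothetical stand-in for a real trial compression that reports the projected bit count for a given divisor; it is assumed to decrease monotonically as the divisor increases.

```python
def select_divisor(target_bits, estimate_bits, q_min=1, q_max=255, n_trials=20):
    """Coarse trial quantisations across the divisor range, then a refined
    search between the two adjacent trials bracketing the target bit rate."""
    step = max(1, (q_max - q_min) // (n_trials - 1))
    lo, hi = q_min, q_max
    for q in range(q_min, q_max + 1, step):
        if estimate_bits(q) > target_bits:
            lo = q   # projected rate still above target: lower bracket
        else:
            hi = q   # first trial at or below target: upper bracket
            break
    # Refined search: least harsh (smallest) divisor that meets the target.
    for q in range(lo, hi + 1):
        if estimate_bits(q) <= target_bits:
            return q
    return hi
```

For example, with a toy model `estimate_bits = lambda q: 1000 // q` and a target of 100 bits, the coarse pass brackets the target between divisors 1 and 14, and the refined pass returns 10, the smallest divisor meeting the target.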
Although selecting the least harsh quantisation will result in the best possible image quality (i.e. the least noisy image) on reproduction for "source" image data that has not undergone one or more previous compression/decompression cycles, it has been established that this is not necessarily the case for "non-source" image data. An image that has been compressed and decompressed once is referred to as a 1st generation image, an image that has been subject to two previous compression/decompression cycles is known as a 2nd generation image, and so on for higher generations.
Typically the noise in the image will be systematically higher across the full range of quantisation divisors for the 2nd generation reproduced image in comparison to the noise at a corresponding quantisation divisor for the 1st generation reproduced image. This can be understood in terms of the DCT coefficient rounding errors incurred at each stage of quantisation. However, it is known that when the 2nd generation quantisation divisor is chosen to be substantially equal to that used in the 1st generation compression, the noise levels in the 2nd generation reproduced image will be substantially equal to the noise levels in the 1st generation reproduced image. Thus for non-source input image data the quantisation divisor having the smallest possible magnitude that meets a required data rate will not necessarily give the best reproduced image quality. Instead, a quantisation divisor substantially equal to that used in a previous compression/decompression cycle is likely to give the best possible reproduced image quality. Note, however, that the choice of quantisation divisor is constrained by the target bit rate associated with the particular communication channel, which may vary from generation to generation.
To ensure that the "best" quantisation step is selected for so-called multi-generation images, a backsearch process is used which starts with the quantisation value identified by the trial quantisations referred to above and performs further checks using harsher quantisation (quantisation divisors of larger magnitude). For each of these backsearch quantisation steps the data is quantised and subsequently dequantised. Alternatively, instead of dequantising, the error is calculated from the quantiser residual. The dequantised data is compared with a delayed (unquantised) version of the input image data, and the backsearch quantisation step that results in the fewest errors on this comparison is selected for the final output stage of quantisation.
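The backsearch idea can be sketched as follows, under simplifying assumptions: the candidate divisors are taken as consecutive integers starting from the allocated divisor, and the error measure is the sum of absolute residuals between the original coefficients and their quantise/dequantise reconstructions (the "quantiser residual" alternative mentioned above). Real systems use a particular divisor schedule and error metric not specified here.

```python
def backsearch(coeffs, q_alloc, n_steps=12):
    """Test q_alloc and n_steps harsher (larger) divisors; return the one
    whose quantise/dequantise round trip deviates least from the input."""
    best_q, best_err = q_alloc, None
    for q in range(q_alloc, q_alloc + n_steps + 1):
        # Residual error after quantising and dequantising each coefficient.
        err = sum(abs(round(c / q) * q - c) for c in coeffs)
        if best_err is None or err < best_err:
            best_q, best_err = q, err
    return best_q
```

If the input coefficients were already quantised in an earlier generation with divisor 8 (so they are all multiples of 8), re-quantising with divisor 8 produces zero residual, and the backsearch correctly settles on 8 even when it starts from a smaller allocated divisor such as 5.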
A problem arises with the backsearch process in the case where the Macro-Block target bit count for a 2nd generation image is less than the Macro-Block target bit count for a corresponding 1st generation image. As a consequence of the discrepancy between the target bit counts, the 2nd generation image data is quantised more harshly (using a larger quantisation divisor) than the 1st generation image data. FIG. 1 schematically illustrates this problem. The backsearch process starts with the quantisation divisor Q_SCALE=Q_ALLOC that was selected by a bit allocation/binary search process in accordance with the Macro-Block target bit count. The backsearch also involves testing a series of, say, 12 harsher quantisations QB1 to QB12 than the quantisation corresponding to Q_ALLOC. The quantisation divisor Q_FINAL, i.e. the one of those tested that results in the least noise in the reproduced image, is then determined. Thus the quantisation divisor Q1 corresponding to the 1st generation compression, which is less harsh (a smaller quantisation divisor) than Q_ALLOC, is not found by the backsearch, although Q1 is the value that is likely to give the best image quality for the reproduced 2nd generation image.
Although in principle this problem could be addressed by adjusting the starting point of the 2nd generation backsearch so that it corresponds to a smaller quantisation divisor than Q_ALLOC, making the backsearch more likely to encompass Q1, this may result in selection of a smaller quantisation divisor. This would be likely to give rise to a larger bit rate than the communication system is capable of handling. An encoding bit rate that exceeded the predetermined maximum encoding bit rate could lead to unacceptable data loss.
This invention provides a data compression apparatus comprising:
a source detection arrangement for detecting whether or not the input data is source data that has not undergone a previous compression/decompression cycle;
a data quantity generator, responsive to the source detection arrangement, for setting a desired data output quantity for the compressed data, the desired data quantity having a first value for source input data and a second, higher, value for non-source input data;
a target allocator for allocating a target data quantity to respective subsets of the input data in dependence upon the desired output data quantity, the target data quantities together providing the desired output data quantity; and
a data compression arrangement for compressing each subset of the input data in accordance with its respective target data quantity.
The invention addresses the problems outlined above by setting the desired data quantity to a lower level for data which has not been previously compressed and decompressed. The difference can be made sufficiently small (e.g. 5%) so as not to be subjectively noticeable, but gives some headroom for the subsequent generation compression stages to achieve the same degree of quantisation as that used in the first generation. This can reduce the error rate in the subsequent generations.
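The headroom idea can be illustrated with a toy calculation. The function name, parameters, and the 5% figure below follow the example in the text but are otherwise assumptions, not the claimed apparatus: the desired output quantity for source data is set slightly below the channel capacity so that a later generation, forced to reuse the first-generation quantisation divisor, still fits within the channel.

```python
def desired_output_quantity(channel_capacity_bits, is_source, headroom=0.05):
    """Toy sketch: source data is targeted slightly below capacity (the
    first, lower value); non-source data may use the full capacity (the
    second, higher value)."""
    if is_source:
        return int(channel_capacity_bits * (1.0 - headroom))
    return channel_capacity_bits
```

For a 1000-bit channel this gives a 950-bit target for source data and the full 1000 bits for non-source data, leaving 5% of headroom for subsequent generations to match the first-generation quantisation.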