Digital images are at present used in several applications, digital photography being a case in point.
In normal use, digital images generally undergo a compression and encoding procedure. This procedure, also referred to more simply as compression, reduces the amount of memory occupied and makes it possible, for example, to increase the maximum number of images that can be stored simultaneously in the memory unit of a digital still camera. Furthermore, compression shortens transmission times when the images have to be transferred to an external peripheral device or, more generally, over telecommunication networks such as the Internet.
The most common and efficient compression methods currently employed are based on transforming the images into the two-dimensional spatial frequency domain, in particular by means of the so-called discrete cosine transform (DCT). An example of this type is the system defined by the specifications of the JPEG (Joint Photographic Experts Group) international standard for the compression/encoding of images (ISO/CCITT).
Being intended as a generic and flexible compression system, this standard actually defines several compression methods, all of which can be derived from two basic methods. One of these, the so-called JPEG baseline, employs the DCT and compression of the “lossy” type, i.e. with loss of information. The present invention concerns this method and, more generally, compression methods that use the DCT or similar two-dimensional spatial transforms such as the discrete wavelet transform (DWT).
A digital image can be represented by means of a matrix of elements, known as pixels, each of which corresponds to an elementary portion of the image and comprises one or more digital values each associated with an optical component. In a monochromatic image, for example, just a single value is associated with each pixel, and in this case it is usually said that the image consists of just a single channel or plane.
In a coloured RGB image, on the other hand, associated with each pixel there are three digital values that correspond to the three components (red, green, blue) of additive chromatic synthesis. In this case the image can be decomposed into three distinct planes or channels, each of which contains the information relating to just a single chromatic component.
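The decomposition of an RGB image into planes can be illustrated with a minimal Python sketch. The function name `split_planes` and the representation of the image as rows of (red, green, blue) tuples are assumptions made purely for illustration.

```python
def split_planes(image):
    """Split an RGB image into three single-channel planes.

    `image` is assumed to be a list of rows, each row being a list of
    (red, green, blue) tuples.
    """
    planes = ([], [], [])
    for row in image:
        for channel in range(3):
            # Keep only the value of this chromatic component for each pixel.
            planes[channel].append([pixel[channel] for pixel in row])
    return planes


# A tiny 2x2 image used only as an example.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]
red, green, blue = split_planes(image)
```

Each resulting plane has the same dimensions as the image and contains one digital value per pixel, exactly as in the monochromatic case.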
A compression algorithm that employs the DCT operates separately and independently on the planes that make up the image; these planes are subdivided into sub-matrices of size 8×8 pixels, each of which is then transformed by means of the DCT.
For each sub-matrix (or sub-block) an 8×8 matrix is obtained whose elements, the so-called DCT coefficients, correspond to the amplitudes of orthogonal waveforms that define the representation of the sub-block in the two-dimensional DCT spatial frequency domain. In practice, therefore, each DCT coefficient, identified by indices (i,j), represents the amplitude of the DCT spatial frequency identified by those same indices. In the spatial frequency domain the compression algorithm reduces the information content by selectively attenuating or eliminating certain frequencies.
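The subdivision of a plane into 8×8 sub-blocks and the transformation of one sub-block can be sketched as follows. This is a direct, unoptimized evaluation of the orthonormal two-dimensional DCT (type II), written for clarity rather than speed. The names `split_into_blocks` and `dct_2d` are hypothetical, and the sketch assumes that both dimensions of the plane are multiples of 8.

```python
import math

N = 8  # sub-block size used in the text


def split_into_blocks(plane):
    """Yield the 8x8 sub-blocks of a plane whose sides are multiples of 8."""
    for top in range(0, len(plane), N):
        for left in range(0, len(plane[0]), N):
            yield [row[left:left + N] for row in plane[top:top + N]]


def dct_2d(block):
    """Return the 8x8 matrix of DCT coefficients of an 8x8 block."""
    def scale(k):
        # Normalization factor of the orthonormal DCT-II.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for i in range(N):          # vertical spatial frequency index
        for j in range(N):      # horizontal spatial frequency index
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * i * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * j * math.pi / (2 * N)))
            # 0.25 is the factor 2/N for N = 8.
            coeffs[i][j] = 0.25 * scale(i) * scale(j) * total
    return coeffs


# A uniform plane of 8 rows by 16 columns gives two identical sub-blocks.
flat_plane = [[100] * 16 for _ in range(8)]
blocks = list(split_into_blocks(flat_plane))
flat_coeffs = dct_2d(blocks[0])
```

For a uniform sub-block such as this one, only the coefficient with indices (0,0) is significantly different from zero (approximately 800 here), since the sub-block contains no spatial variation.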
The reduction of the quantity of information is obtained by dividing the DCT coefficient matrices by an 8×8 matrix of integer quantization coefficients: in practice each DCT coefficient is divided by the corresponding quantization coefficient and the result is then rounded to the nearest integer. Due to the division and rounding operations, and depending also on the actual values of the quantization coefficients, the “quantized” matrices obtained in this way generally contain many zero elements. When these matrices are encoded, as is the case, for example, in the JPEG standard, by means of run-length encoding followed by Huffman encoding, the memory occupation is reduced without any further loss of information.
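The element-by-element division and rounding can be sketched as follows. The uniform quantization matrix and the coefficient values are hypothetical and chosen only to keep the example short; they are not the matrices suggested by the JPEG standard. The rounding of exact ties away from zero is likewise an assumption, since the text specifies only rounding to the nearest integer.

```python
import math


def round_nearest(value):
    """Round to the nearest integer; exact ties go away from zero (assumed)."""
    return int(math.copysign(math.floor(abs(value) + 0.5), value))


def quantize(coeffs, q_matrix):
    """Divide each DCT coefficient by its quantization coefficient and round."""
    return [[round_nearest(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, q_matrix)]


def count_zeros(matrix):
    """Count the zero elements of a quantized matrix."""
    return sum(value == 0 for row in matrix for value in row)


# Hypothetical quantization matrix: the same step for every frequency.
q_matrix = [[16] * 8 for _ in range(8)]

# Hypothetical DCT coefficients: three non-zero values, the rest zero.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 800.0
coeffs[0][1] = -35.0
coeffs[7][7] = 5.0

quantized = quantize(coeffs, q_matrix)
zeros = count_zeros(quantized)
```

In this example the small coefficient at (7,7) is quantized to zero, so the quantized matrix contains one more zero element than the matrix of DCT coefficients.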
Quantization essentially reduces the precision of the DCT coefficients. The greater the values of the coefficients of the quantization matrix, the greater the quantity of information discarded. Since there is no way of restoring the eliminated original information, an inappropriate quantization can appreciably deteriorate the quality of the image.
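The loss of precision can be made concrete by quantizing a few coefficients and then multiplying them back by the quantization step. The following sketch uses hypothetical coefficient values and a single step for all coefficients; the function name `roundtrip_summary` is an assumption for illustration.

```python
def roundtrip_summary(coefficients, step):
    """Quantize with `step`, restore, and report (zeros, maximum error)."""
    quantized = [round(c / step) for c in coefficients]
    restored = [step * q for q in quantized]
    zeros = sum(q == 0 for q in quantized)
    max_error = max(abs(c - r) for c, r in zip(coefficients, restored))
    return zeros, max_error


# Hypothetical DCT coefficients, chosen so that no division ends in an exact tie.
coefficients = [812.0, -35.5, 12.25, 5.0, -3.75, 1.5]

summaries = {step: roundtrip_summary(coefficients, step) for step in (4, 16, 64)}
for step, (zeros, max_error) in summaries.items():
    print(f"step={step}: zeros={zeros}, max error={max_error}")
```

With these values, a larger step yields more zero elements and a larger reconstruction error, and the error of each restored coefficient never exceeds half the step.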
Optimization of the quantization matrices makes it possible to improve the performance of the compression algorithm by introducing some kind of compromise between final image quality and compression efficiency.
A characterization of the quality deterioration introduced into a digital image by a compression algorithm is provided by the so-called PSNR (Peak Signal-to-Noise Ratio), which is a measure in dB of the quantity of noise introduced by the algorithm at a given compression ratio. The compression ratio of an algorithm, on the other hand, is measured in terms of bit rate, i.e. the number of bits needed to represent a pixel in the compressed and encoded image.
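Both measures can be computed with a short sketch. It uses the commonly adopted definition of the PSNR as 10·log10(peak² / MSE), where MSE is the mean squared error between the original and the decoded plane, and it assumes 8-bit samples (peak value 255). The function names and the sample values are hypothetical.

```python
import math


def psnr(original, decoded, peak=255):
    """Return the PSNR in dB between two planes of equal size."""
    squared_error = 0
    count = 0
    for row_a, row_b in zip(original, decoded):
        for a, b in zip(row_a, row_b):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical planes: no noise introduced
    return 10 * math.log10(peak ** 2 / mse)


def bits_per_pixel(compressed_size_bytes, width, height):
    """Return the bit rate: bits of compressed data per image pixel."""
    return 8 * compressed_size_bytes / (width * height)


# Hypothetical 2x2 plane before compression and after decoding.
original = [[52, 55], [61, 59]]
decoded = [[50, 56], [60, 61]]
quality_db = psnr(original, decoded)

# Hypothetical compressed file of 1000 bytes for a 100x80 image.
rate = bits_per_pixel(1000, 100, 80)
```

A higher PSNR indicates less noise introduced by compression, while a lower bit rate indicates stronger compression.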
The JPEG standard suggests the use of quantization matrices synthesized on the basis of perceptive criteria that take due account of the sensitivity of the human eye to the DCT spatial frequencies. It has been shown that the use of these matrices gives rise to considerable artifacts when the (decoded/decompressed) images are displayed on high-resolution displays.
The prior art includes numerous attempts, using different approaches, to identify and synthesize optimal quantization matrices. The best results were obtained with adaptive or iterative procedures that operate on the basis of statistical, content-based and perceptual criteria. These methods obtain the optimization, albeit in some cases with a considerable computational effort, by supposing that the operation is being performed in an ideal context, i.e. without taking account of the actual degradation introduced into the digital image during the acquisition phase and the processing phases that precede compression. For this reason, solutions that in an ideal context would constitute the best trade-off between perceptual quality of the decoded/decompressed image and compression efficiency will produce non-optimal results when applied in a real context such as a digital still camera or an image scanner.