The present invention relates to the quantization and compression of digital image data, and more specifically to image data containing inherent signal dependent noise and color digital image data.
Digital images contain huge amounts of data, especially for high quality display and printing. Compression is useful to reduce the time for transferring data in networks and for saving digital media storage space. Images may be monochromatic or colored, and colored images may have two or more channels. The most common images are RGB images (for display) and CMYK images (for print). The common source of most images is an input device such as a scanner or digital camera. Such input devices are two-dimensional instances of the class of applications where a signal is captured by a light-sensitive device.
In video capture, digital photography and digital scanning, the image is acquired by means of an electronic sensor such as a charge-coupled device (CCD) or a complementary metal-oxide semiconductor (CMOS) device with cells sensitive to electromagnetic radiation such as visible light, infrared or ultra-violet waves. In medical or astronomical imaging, images may also be acquired by X-ray, microwave or radiowave sensitive detectors. Such images will have inherent characteristic noise that is not a part of the original signal reaching the sensor.
A typical noise characteristic, as found in particle-counting devices, such as a CCD sensor, is mainly a combination of ‘dark current’ shot noise and photon shot noise. CCD noise characteristics are described in Theuwissen, A. J. P., “Solid-State Imaging with Charge-Coupled Devices”, Kluwer (1995), sections 3.4 and 8.1, incorporated by reference herein. Dark current shot noise is constant for all light levels and is therefore significant at low light levels, whereas photon shot noise is proportional to the square root of the light level and is therefore most significant at high light levels.
In most electronic light sensing devices, such as video cameras for example, photon shot noise is largely eliminated from the output by use of a logarithmic or similar analog amplifier placed before the analog-to-digital (A-D) converter. Differences due to noise of the order of the square root of the incident signal level are thereby compressed into a nearly noiseless signal. Such non-linear compression of image signals causes little loss of visual quality because of the way the human eye interprets images. Specifically, in such cases the dynamic range of the original lighting is usually many orders of magnitude larger than the dynamic range of the projection or display device. Since the human eye is non-linear, in fact close to logarithmic, in its response to different dynamic ranges, the original signal would have to be non-linearly compressed for viewing or printing, after A-D conversion, to maintain subjective fidelity.
In modern digital cameras and scanners, such as the Leaf Volare and the EverSmart Supreme scanner, both devices available from Scitex Corporation, Herzlia, Israel, the CCD signals are converted to 14 or more bit data without non-linear, analog amplification. Thus the digitized data is directly proportional to the number of electrons captured by the sensing device. In this case there is inherent characteristic noise in the digital data. It is foreseeable that other devices such as, for example, medical or astronomical devices will also make the change to digitizing data without non-linear amplification. Many-bit files are large and require extensive storage space. It is therefore desirable to compress the data, provided there will be little or no loss of signal information.
The need to compress has also arisen for high-quality, multi-component images such as RGB images (for display) or CMYK images (for print), consisting of 8 (eight) bits per component. If the origin of the data is captured light intensities as in scanners or digital cameras, the data are transformations of the original light intensities, designed to allow display or printing of the image in a visually optimal manner. With the proliferation of such images and the increase in digital communication, a need has arisen to compress them for transmission and storage.
The most common means for compressing multi-component images is by use of the JPEG standard. The JPEG standard is described in international standard IS 10918-1 (ITU-T T.81), available from the Joint Photographic Experts Group, http://www.jpeg.org.
The lossy JPEG standard is a method that compresses images by impairing high frequency data that are not noticeable to the eye. Improvements of lossy JPEG have also been suggested which utilize subtler psychovisual criteria. A summary is provided in T. D. Tran and R. Safranek, “A Locally Adaptive Perceptual Masking Threshold Model For Image Coding,” Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Vol. IV, pp. 1882-1886, Atlanta, May 1996. These improvements are intra-component additions. No use is made of inter-component, i.e. color, information.
JPEG compression can optionally utilize a color psychovisual criterion in case of compression of an RGB image. A matrix transformation can be applied to the RGB value to yield a luminance component and two color components. Then the color components can be optionally sampled at half the rate of the luminance component in the subsequent compression processing and the quantization tables for luminance and color may be different. This is in accordance with the fact that the human eye perceives color at less resolution than luminance.
The described procedure is, however, not optimal in three senses. First, the matrix transformation does not map the RGB image to a psychovisually uniform space if the RGB data are proportional to actual light intensities. Only if the RGB values are logarithmic or power transformations of light intensities is the image of the mapping approximately psychovisual, and then only in the luminance component. Second, the lower rate of sampling of the color components completely eliminates color modulation of more than half the original frequency. Third, the transformation cannot be used for color spaces with more than three components, such as CMYK data.
In addition lossy JPEG does not have the property that once-compressed and twice-compressed images are identical. This is desirable for applications where the file needs to be compressed and decompressed several times. Also JPEG compression cannot guarantee a maximum deviation of a decompressed pixel from the original pixel.
As an alternative to JPEG, digital image data may be losslessly or nearly losslessly compressed by a variety of compression algorithms. Fast and proven effective methods, especially tuned to compress images effectively, are described in U.S. Pat. No. 5,680,129 (Weinberger, et al.) and U.S. Pat. No. 5,835,034 (Seroussi, et al.). The methods disclosed in these patents form the basis for the draft standard JPEG-LS of ISO, ISO/IEC JTC1/SC29 WG1 (JPEG/JBIG) “Information Technology - Lossless and Near-Lossless Compression of Continuous-Tone Still Images, Draft International Standard DIS14495-1 (JPEG-LS)” (1998) (hereinafter referred to as “Draft International Standard DIS14495-1”), incorporated by reference herein, and attached hereto as Addendum I.
Another lossless/near-lossless method for images is described in Wu, “Lossless Compression of Continuous-Tone Images Via Context Selection, Quantization And Modeling”, IEEE Transactions on Image Processing, Vol. 6, No. 5, May 1997.
The above methods are both differential pulse code modulation (DPCM) techniques. DPCM is a general signal compression technique based on the formulation of predictors for each signal value from nearby values. The difference between the predicted value and the true value is encoded. DPCM techniques in general, and for images in particular, are described in Jain, A. K., “Fundamentals of Image Processing”, Chapter 11, Prentice-Hall (1989), the entire work incorporated by reference herein.
DPCM techniques do not give good compression ratios in the presence of small variations in the data such as noise, since the performance of the predictors degrades. To increase the compression ratio, nearly lossless methods may be used.
Nearly lossless encoding with DPCM techniques is performed by uniform quantization of the prediction error using fewer values than required. For example, instead of using the full 511 values {−255, . . . , 0, . . . , +255} required for 8 bit data errors, 200 values with a suitable rule for encoding and decoding of the coarsely quantized differences can be used.
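By way of illustration only, uniform quantization of DPCM prediction errors may be sketched as follows. The sketch assumes a previous-sample predictor, an initial prediction of 0, and a quantization rule with step 2*delta+1 similar in spirit to the near-lossless mode of JPEG-LS; the function names and the parameter delta are hypothetical and are not taken from the cited references.

```python
def quantize_error(err, delta):
    """Map an integer prediction error to a coarse index (step 2*delta + 1)."""
    step = 2 * delta + 1
    if err >= 0:
        return (err + delta) // step
    return -((delta - err) // step)


def encode(samples, delta):
    """Return the quantized error indices and the decoder-side reconstruction."""
    step = 2 * delta + 1
    indices, recon = [], []
    prediction = 0  # assumed initial predictor value
    for s in samples:
        q = quantize_error(s - prediction, delta)
        # Clamp to the 8-bit range; this can only reduce the error.
        r = min(255, max(0, prediction + q * step))
        indices.append(q)
        recon.append(r)
        prediction = r  # predict from the reconstructed value, as the decoder must
    return indices, recon


def decode(indices, delta):
    """Rebuild the samples from the indices alone."""
    step = 2 * delta + 1
    recon, prediction = [], 0
    for q in indices:
        prediction = min(255, max(0, prediction + q * step))
        recon.append(prediction)
    return recon


samples = list(range(0, 256, 7)) + [255, 0, 128, 3]
indices, recon = encode(samples, delta=2)
max_error = max(abs(s - r) for s, r in zip(samples, recon))
alphabet = len({quantize_error(e, 2) for e in range(-255, 256)})
```

With delta equal to 2 in this sketch, the 511 possible error values collapse to 103 indices, and no reconstructed sample deviates from the original by more than delta.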
When quantizing a signal sequence, optimal quantization for a given distortion measure can be achieved if the characteristic frequency distribution of the values in a typical sequence is known. The optimal minimum mean square error quantizer is known as the Lloyd-Max quantizer. The Lloyd-Max quantizer is described in Jain (above). A common assumption for photographic images is uniform frequency over all values. For uniform signal value distribution, the Lloyd-Max quantizer leads to uniform quantization with all quantization intervals having equal size.
A general method for finding the optimal quantizer for data using a distortion measure based on the quantization error and the observed value is given in Sharma, D. J., “Design of Absolutely Optimal Quantizers for a Wide Class of Distortion Measures”, IEEE Trans. Information Theory, Vol. IT-24, No. 6, November 1978. Quantization schemes for non-Gaussian sources and sources that have undergone transformations or have been converted to predictions are discussed in Zamir, R. and Feder, M., “Information Rates of Pre/Post-Filtered Dithered Quantizers”, IEEE Trans. Information Theory, Vol. IT-42, No. 5, September 1996.
In all the above quantizing schemes, the criterion for finding the quantizer is based on the error between the quantized signal and the observed signal. These methods are also valid when the observations suffer from noise that is not dependent on the amplitude of the signal. However, for signals with inherent signal-dependent noise, a desirable criterion for quantizing should take into account the error between the quantized signal and the actual (unobserved) signal that gave rise to the observation. The above quantizers are not optimal for data with noise that is signal dependent when the criterion for quantization is a function of the difference between the actual unobserved signal and the quantized observation. In particular, use of the above quantizers for quantizing the difference between predictor and observed value in DPCM techniques is not optimal for data with signal-dependent noise using such a criterion.
Al-Shaykh, O. K. and Mersereau, R. M., “Lossy Compression of Noisy Images”, IEEE Trans. Image Processing, Vol. 7, No. 12, December 1998, have suggested a method for quantization wherein the image data is corrected by a mean square error estimate and subsequently quantized with a Lloyd-Max quantizer. This is optimal according to a certain criterion. This method does not, however, guarantee a maximum deviation. It also implies that the quantization, after estimation, is uniform for arbitrary images, unless the images are processed twice, once for statistics and a second time for the quantization. This particular solution is dictated by the choice of optimality criterion. It would be advantageous to find a different, justifiable criterion leading to a non-uniform quantization with better potential compression ratio, while still guaranteeing a maximum deviation.
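The statement that a uniform distribution leads to equal intervals may be checked with the following minimal sketch of the Lloyd-Max iteration. It assumes a continuous uniform density on the range from lo to hi, for which the centroid of each interval is simply its midpoint; the function name, the deliberately uneven starting thresholds and the iteration count are illustrative choices only.

```python
def lloyd_max_uniform(levels, iterations=200, lo=0.0, hi=1.0):
    """Lloyd-Max iteration specialised to a uniform density on [lo, hi]."""
    # Deliberately non-uniform starting thresholds.
    t = [lo + (hi - lo) * (i / levels) ** 2 for i in range(levels + 1)]
    for _ in range(iterations):
        # Centroid condition: for a uniform density the centroid is the midpoint.
        reps = [(t[i] + t[i + 1]) / 2 for i in range(levels)]
        # Nearest-neighbour condition: thresholds lie midway between representatives.
        inner = [(reps[i - 1] + reps[i]) / 2 for i in range(1, levels)]
        t = [lo] + inner + [hi]
    reps = [(t[i] + t[i + 1]) / 2 for i in range(levels)]
    return t, reps


thresholds, reps = lloyd_max_uniform(4)
lengths = [b - a for a, b in zip(thresholds, thresholds[1:])]
```

Although the starting intervals differ widely in size, the iteration settles on four intervals of equal length, which is the uniform quantizer.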
FIG. 1 shows the basic building blocks of a signal capture device 5 having a radiation-sensitive sensor 20. The radiating or reflecting source 10 is imaged through the optics 15 of the capture device onto the focal plane of the sensor 20. The incident radiation causes a voltage to be generated in the sensor that is read by the electronic circuitry 25 and converted to a digital number by the analog to digital converter (A-D converter) 30. The value provided by the A-D converter 30 is transferred to computer memory 35 where it represents the light or other radiation captured on the sensor corresponding to a specific location in the original radiating or reflecting source.
FIGS. 2A-2D show examples of different types of sensors. The photomultiplier tube 40 is found in commercial drum scanning devices such as the Heidelberg Tango scanner, available from Heidelberg Druckmaschinen AG of Heidelberg, Germany. The one-dimensional CCD array 45 is found in commercial flat bed scanning devices such as the aforementioned Scitex EverSmart Scanner. The two-dimensional CCD array 50 is found in modern digital still cameras, such as the Leaf Volare. The two-dimensional CCD array may have a mosaic pattern of color filters 55 deposited on the array to provide a basis for color images. The repeating color pattern 55 is sometimes referred to as the Bayer mosaic pattern.
As mentioned above, the two primary sources of noise in a radiation-sensing device are dark current shot noise and photon shot noise. Photon shot noise depends on the number of photons captured at a site 52 in the sensor array 50 while dark current shot noise depends on the degree of spontaneous noise generated by undesired current present at the sensing site 52, independent of the amount of radiation incident on it.
Both of these types of noise may be approximately modeled by a Poisson probability distribution. Poisson probability distributions are described in Wilks, S. S., “Mathematical Statistics”, John Wiley (1962).
For photon shot noise, the mean and variance of the observed value x at a site are the actual signal value s. For dark current noise the mean and variance of the distribution are a temperature-dependent value α which is independent of the incident signal. The total noise then causes the observed value x to have a Poisson distribution P(X=x|S=s), abbreviated P(x|s), with mean and variance both equal to s+α.
If the sensing device has circuitry which performs a non-linear transformation f(x) on the observed value x, the transformation may be known or measured by experiment and the noise distribution P(f(x)|s), its mean and its variance may be computed by elementary statistical methods such as described in Wilks, “Mathematical Statistics” (above).
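The noise model above may be checked numerically with the following minimal sketch, which evaluates the Poisson probabilities P(x|s) directly and sums them to obtain the mean and variance. The function names, the truncation point x_max and the sample values of s and α are assumptions made for illustration.

```python
import math


def poisson_pmf(x, s, alpha):
    """P(x|s) for a Poisson distribution whose rate is s + alpha."""
    lam = s + alpha
    # Evaluated in the log domain to avoid overflow in lam**x and x!.
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def mean_and_variance(s, alpha, x_max=None):
    """Sum the distribution up to x_max (chosen far into the tail by default)."""
    lam = s + alpha
    if x_max is None:
        x_max = int(lam + 20 * math.sqrt(lam) + 50)
    probs = [poisson_pmf(x, s, alpha) for x in range(x_max + 1)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var


# Hypothetical values: signal s = 100 electrons, dark current alpha = 5 electrons.
mean, var = mean_and_variance(100.0, 5.0)
noise_std = math.sqrt(var)
```

For these values both the mean and the variance come out at s+α, so the standard deviation of the noise grows as the square root of s+α.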
It would be desirable when compressing signals having inherent noise to throw away information about the noise while retaining information about the signal.
With reference to captured signals that result in images for display (RGB) or print (CMYK), the above mentioned prior-art quantizers do not take into account important psychophysical effects. They may allocate more bits to colors that the eye can barely distinguish than to colors the eye sees as significantly different. In particular, use of the above quantizers for quantizing the difference between predictor and observed value in DPCM techniques is not optimal for image data if the distortion measure includes a psychophysical criterion.
Also, the above quantizers do not take into account the transformation applied to many-bit files before representing them in display or print medium. As mentioned above, many-bit files are often linearly proportional to light intensities which may have much larger dynamic range than the display or print medium. In such cases the data will be non-linearly transformed at a subsequent stage to correctly represent original intensities in the reduced range medium. These transformations are often of the gamma type, i.e. raising the signal to a given exponent, often close to ⅓. Thus use of the above quantizers for quantizing the difference between predictor and observed value in DPCM techniques is not optimal for data that are destined to undergo non-linear transformations.
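The effect of such a transformation on a fixed quantization step may be illustrated by the following sketch. It assumes 14-bit linear data, an exponent of 1/3 and an 8-bit output range; the function names and the chosen signal levels are hypothetical.

```python
def display_value(linear, max_linear=16383, exponent=1.0 / 3.0, max_out=255.0):
    """Gamma-type mapping of a linear value to the output range (assumed form)."""
    return max_out * (linear / max_linear) ** exponent


def display_change(level, step):
    """Change in output value caused by a step of the given size at a linear level."""
    return display_value(level + step) - display_value(level)


# The same linear step of 16 counts, applied in a shadow and in a highlight.
shadow_change = display_change(100, 16)
highlight_change = display_change(10000, 16)
```

In this sketch the same linear step produces a far larger change in the displayed value in the shadows than in the highlights, which is why a uniform quantizer applied to the linear data is poorly matched to the final rendering.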
The CIE 1976 (L*, a*, b*) color space, abbreviated CIELab, is a standard color space which is approximately uniform to human visual perception. This means that spatial distance between colors in CIELab space approximates the visual sensation of the difference between them. CIELab is described in Fairchild, M. D., “Color Appearance Models”, Addison-Wesley (1998), Chapter 3.9 (entitled: Color-difference Specification), incorporated by reference herein. Tristimulus value space XYZ is a color space from which CIELab is derived. XYZ is also described in Fairchild, “Color Appearance Models” (above).
International Color Consortium (ICC) source profiles are used to describe the color of input devices such as digital cameras and scanners. ICC profiles are described in Fairchild, “Color Appearance Models” (above) and in the ICC specification “Spec ICC.1: 1998-09, File Format for Color Profiles”, http://www.color.org, hereinafter “Spec ICC.1”, incorporated by reference herein. The source profiles therein provide an approximate mapping from the RGB color space defined by the sensing device's spectral responses, in predefined lighting and subject matter setups, to a profile connection space which may be either tristimulus value space XYZ or CIELab space. ICC source profiles also provide a mapping from multi-component files such as CMYK files to XYZ or CIELab. For accurate profiles a three-dimensional lookup table is required. Since the ICC profile format permits such a table only when the profile connection space is CIELab, we assume this is the case.
ICC profiles are embedded in captured image files to provide for color consistency across color display and color printing devices. When a captured image file is to be viewed or printed, the image data are processed by first applying the source profile. The source profile maps the image data to the device-independent profile connection space, CIELab. Then, an output profile is applied that maps the CIELab data to the output color space. This may also be done by first concatenating the profiles and then applying the concatenated profile. In any case, all subsequent visible manifestations of the original captured image will have passed through the source profile mapping to CIELab. CIELab is thus the universal, standard, approximately psychovisually uniform space in which color is described according to current ICC specifications.
The present invention improves on and overcomes problems associated with the conventional art. The invention provides a method for quantizing image data with signal-dependent noise to a smaller number of values so that noise information is discarded while signal information is maintained. This is achieved by deriving a set of equations, the solution to which may be numerically computed and whose solution constitutes an optimal relative mean square error quantization for a signal with noise. Two low complexity explicit heuristic formulas for quantization are also provided.
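For reference, the standard derivation of CIELab from XYZ and the Euclidean color difference in CIELab may be sketched as follows. The D50 reference white is an assumption made here because it is the illuminant of the ICC profile connection space; the function names are illustrative.

```python
import math

# Assumed reference white (D50), with Y normalised to 100.
D50_WHITE = (96.42, 100.0, 82.49)


def _f(t):
    """CIE 1976 companding function: cube root with a linear toe near zero."""
    if t > (6 / 29) ** 3:
        return t ** (1 / 3)
    return t / (3 * (6 / 29) ** 2) + 4 / 29


def xyz_to_lab(xyz, white=D50_WHITE):
    """Convert tristimulus values XYZ to (L*, a*, b*)."""
    fx, fy, fz = (_f(c / w) for c, w in zip(xyz, white))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e(lab1, lab2):
    """CIE 1976 color difference: Euclidean distance in CIELab."""
    return math.dist(lab1, lab2)


white_lab = xyz_to_lab(D50_WHITE)
difference = delta_e((50.0, 0.0, 0.0), (50.0, 3.0, 4.0))
```

The reference white maps to L*=100, a*=0, b*=0, and two colors of equal lightness that differ by 3 in a* and 4 in b* are 5 units apart.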
The invention also provides a method for compressing image files containing signal dependent noise by means of near lossless compression so as to maintain signal information in the compressed data while discarding noise information. This is achieved by integrating a new quantization method into a standard DPCM method for image compression. The new quantization method quantizes differences between a predictor and an actual value in a signal dependent manner so that the amount of quantization is proportional to the amount of noise in the signal, thus effecting relative uniform allocation of quantization error.
The invention also provides a method for compressing multi-component color image data so that psycho-visually significant color information is maintained in the compressed data while other color information is discarded and a method for compressing multi-component color image data which is linearly proportional to high dynamic range light intensities so that color information significant for display and printing is maintained in the compressed data while other color information is discarded. These methods are achieved by providing an explicit measure of deviation contributed by quantization of color differences, this measure expressed in terms of L, a and b values but computable in terms of the source data.
There is also provided a method for compression of multi-component color image files containing embedded ICC profiles in such a way as to distribute distortion of the visual appearance of such data in predefined proportions across the source image components. This is achieved by integrating a new quantization method into a standard DPCM method for image compression. The new quantization method quantizes color differences between a predictor color and an actual color by means of the embedded source profile in a manner such that the error is psychovisually limited and distributed between the source components in predefined proportions.
Additionally, there are provided methods of compression and decompression that do not degrade the image by successive compression-decompression cycles but yield identical data for each cycle following the first cycle and methods of compression and decompression that guarantee the existence of a maximum deviation from the original. These compression and decompression methods are achieved by implementation of compression using the above quantization methods in embodiments of a generic DPCM nature. The DPCM technique ensures that second and further cycles of compression and decompression provide identical decompressed signals. Use of near lossless quantization ensures that the amount of quantization never exceeds a predetermined amount at any location.
The present invention is directed to a method for quantizing a signal having a signal dependent noise distribution. This method includes the step of establishing a plurality of quantization levels, each of these quantization levels comprising left, right and representative values, and an interval length defined by the difference between said right and left values, such that at least one interval length is different from at least one other interval length. The left, right and representative values for each of said quantization levels are then adjusted such that for each interval, the interval length is correlated with the signal dependent noise distribution at the representative value.
There is also disclosed a method for quantizing a signal having a signal dependent noise distribution. This method involves the steps of establishing a plurality of quantization levels, each of these quantization levels comprising left, right and representative values, and an interval length defined by the difference between the right and left values. This is followed by computing the left, right and representative values for each of the quantization levels such that each pair of successive intervals is in accordance with the relation:
(T[k+1]−T[k])/(T[k]−T[k−1]) = g[R[k]]/g[R[k−1]]
where:
g[s] is the standard deviation of the noise distribution for signal amplitude s;
R[i], i=1, . . . , L are said representative values; and
T[i], T[i+1], i=1, . . . , L are said left and right values corresponding to said representative value R[i].
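One way of constructing intervals that satisfy this relation may be sketched as follows. The sketch makes several assumptions that are not part of the relation itself: the noise standard deviation is taken as g[s] = sqrt(s+α) from the Poisson model, each representative value is placed at the midpoint of its interval, every interval length is set to a constant c times g at its representative value, and indices are zero-based. The names build_intervals, c and alpha are hypothetical.

```python
import math


def noise_std(s, alpha=4.0):
    """Assumed noise model g[s] = sqrt(s + alpha)."""
    return math.sqrt(s + alpha)


def build_intervals(s_max, c=2.0, g=noise_std, iterations=50):
    """Return thresholds T and representatives R with length[k] = c * g(R[k])."""
    thresholds, reps = [0.0], []
    while thresholds[-1] < s_max:
        left = thresholds[-1]
        length = c * g(left)  # initial guess
        # Fixed-point iteration, since the midpoint depends on the length itself.
        for _ in range(iterations):
            length = c * g(left + length / 2)
        reps.append(left + length / 2)
        thresholds.append(left + length)
    return thresholds, reps


T, R = build_intervals(1000.0)
# Largest mismatch between the two sides of the relation over all interval pairs.
worst = max(
    abs((T[k + 1] - T[k]) / (T[k] - T[k - 1]) - noise_std(R[k]) / noise_std(R[k - 1]))
    for k in range(1, len(R))
)
```

Because every interval length is proportional to the noise standard deviation at its own representative value, the ratio of successive lengths equals the ratio of the corresponding g values, and the intervals widen as the signal level rises.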
Also disclosed is a method for compressing a signal sequence having a signal dependent noise distribution, that includes the steps of establishing a plurality of quantization levels, each of the quantization levels comprising left, right and representative values, and an interval length defined by the difference between the right and left values, such that at least one interval length is different from at least one other interval length. The left, right and representative values for each of the quantization levels are then adjusted, such that for each of the intervals, the interval length is correlated with the signal dependent noise distribution at the representative value. The representative, left and right values for each interval are then encoded into a bit stream by means of an entropy encoder. For each obtained signal value S, a predicted signal value SP is formed by means of predicted signal values preceding the signal value S in the sequence. The difference d between the predicted signal value SP and the signal value S is then computed. The difference d is divided by the length ILP of the interval containing the predicted signal value SP to obtain a quantized difference count QDC. The difference d is replaced with the quantized difference count QDC, and the quantized difference count QDC is encoded into the bit stream by means of an entropy encoder. The resultant bit stream is then saved.
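The prediction and quantization steps of this method may be sketched as follows, with the entropy encoder omitted. The sketch assumes that the predicted value SP is simply the previously reconstructed value, that the first prediction is 0, that d is taken as S minus SP, and that the interval length ILP is an integer close to c*sqrt(SP+α); these choices, and the names used, are illustrative only.

```python
import math


def interval_length(sp, c=1.0, alpha=4.0):
    """Assumed integer interval length ILP, growing with the noise level at sp."""
    return max(1, round(c * math.sqrt(max(sp, 0) + alpha)))


def compress(signal):
    """Replace each difference d by its quantized difference count QDC."""
    counts, sp = [], 0  # assumed initial prediction
    for s in signal:
        ilp = interval_length(sp)
        d = s - sp
        qdc = round(d / ilp)
        counts.append(qdc)
        sp = sp + qdc * ilp  # the value the decoder will reconstruct
    return counts


def decompress(counts):
    """Rebuild the signal; the decoder derives every ILP from its own output."""
    signal, sp = [], 0
    for qdc in counts:
        sp = sp + qdc * interval_length(sp)
        signal.append(sp)
    return signal


signal = [100, 103, 98, 400, 405, 4000, 3990, 12, 0, 16000]
counts = compress(signal)
recon = decompress(counts)
```

In this sketch each reconstructed value lies within half an interval length of the original, and compressing the reconstructed sequence again yields the same counts, so repeated compression and decompression cycles do not degrade the data further.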
There is also disclosed a method for quantizing the differences D between components of a first tuplet of color component values T1 and a second tuplet of color component values T2 from a first multi-component color space MCS1, using a mapping from the space MCS1 to a second multi-component color space MCS2 and a measure of distance DM2 in the space MCS2, the distance measure having a tuplet of gradients DM2DC with respect to the component directions in space MCS2. This comprises the steps of selecting a distance threshold DELTA2 in the space MCS2 and selecting a tuplet of weighting coefficients W for each component of the space MCS1. The tuplet T1 is mapped to the space MCS2 with the mapping from the space MCS1 to the space MCS2 to obtain tuplet T1_in_M2. For each component in the space MCS2, the tuplet of gradients DM2DC of said measure of distance DM2 in space MCS2 with respect to each component direction in MCS2 at the tuplet T1_in_M2 is evaluated. For each component in the space MCS1, the tuplet of gradients DT2DT1 of the mapping of space MCS1 with respect to each component direction in MCS2 at the tuplet T1 is evaluated. The scalar product SP2 of said tuplets DT2DT1 with said gradients DM2DC is computed. A tuplet QL of quantization lengths for the component directions in MCS1 is then established such that the ratios R between all pairs of lengths in the tuplet QL are equal to the ratios between the corresponding pairs of weighting coefficients in said tuplet W, and the scalar product SP1 of the product SP2 with the tuplet QL is computed. The tuplet QL is adjusted to obtain adjusted tuplet QL′ such that the scalar product SP1 is equal to DELTA2 and the ratios R are unchanged. Each component of the differences D is divided by the corresponding component of the adjusted tuplet QL′ to obtain nearest integral values INT. The integral values INT are then multiplied with corresponding components of the adjusted tuplet QL′ to obtain quantized differences QD, and differences D are replaced by the quantized differences QD.
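The sequence of steps above may be sketched as follows, under strong simplifying assumptions. The mapping from MCS1 to MCS2 is a hypothetical per-channel cube-root function standing in for an embedded source profile, its gradients DT2DT1 are estimated by central differences, the gradients DM2DC are taken as unit magnitudes (as for a sum-of-absolute-differences distance), and SP2 is formed from absolute values so that it acts as a per-component sensitivity. All function names are illustrative.

```python
def to_space2(t):
    """Hypothetical stand-in for the MCS1 -> MCS2 mapping (positive inputs only)."""
    return tuple(100.0 * (c / 255.0) ** (1.0 / 3.0) for c in t)


def jacobian(mapping, t, h=1e-3):
    """Central differences: jac[i][j] = d(MCS2 component i) / d(MCS1 component j)."""
    columns = []
    for j in range(len(t)):
        up, down = list(t), list(t)
        up[j] += h
        down[j] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(mapping(up), mapping(down))])
    return [list(row) for row in zip(*columns)]


def sensitivities(t1, dist_grad, mapping=to_space2):
    """SP2[j]: first-order MCS2 distance change per unit step along MCS1 component j."""
    jac = jacobian(mapping, t1)
    return [sum(abs(g * row[j]) for g, row in zip(dist_grad, jac)) for j in range(len(t1))]


def quantization_lengths(t1, weights, delta2, dist_grad=(1.0, 1.0, 1.0)):
    """Scale QL = W so that SP1 = SP2 . QL' equals DELTA2, keeping the ratios of W."""
    sp2 = sensitivities(t1, dist_grad)
    sp1 = sum(s * w for s, w in zip(sp2, weights))  # SP1 for the unadjusted QL
    scale = delta2 / sp1
    return [scale * w for w in weights]


def quantize_differences(d, ql):
    """Return the integral values INT and the quantized differences QD."""
    ints = [round(di / qi) for di, qi in zip(d, ql)]
    return ints, [n * qi for n, qi in zip(ints, ql)]


t1 = (200.0, 120.0, 40.0)            # hypothetical predictor color in MCS1
ql = quantization_lengths(t1, (1.0, 1.0, 2.0), delta2=1.0)
ints, qd = quantize_differences((3.7, -1.2, 0.4), ql)
```

In this sketch the third component receives twice the quantization length of the first two, as the weights request, and the lengths are scaled so that the first-order distance in MCS2 caused by one full step in every component equals DELTA2.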
Finally, there is disclosed a method for compressing a color signal sequence in a first multi-component color space MCS1, having a mapping to a second multi-component color space MCS2 and a measure of distance DM2 in said space MCS2. This method includes the steps of selecting a distance threshold DELTA2 in the space MCS2 and selecting a tuplet of weighting coefficients W for each component of the space MCS1. DELTA2 and W are then encoded into a bit stream by means of an entropy encoder. For each actual multi-component signal tuplet T in the color signal sequence, a predicted signal tuplet PST is formed by means of already coded signal tuplets preceding the actual signal tuplet T in the sequence. The differences D between the predicted signal tuplet PST and the actual signal tuplet T are then computed, and the tuplet PST is mapped to the space MCS2 with the mapping from the space MCS1 to the space MCS2 to obtain tuplet PST_in_M2. For each component in the space MCS2, the tuplet of gradients DM2DC of the measure of distance DM2 in space MCS2 with respect to each component direction in MCS2 at the tuplet PST_in_M2 is evaluated, and for each component in the space MCS1, the tuplet of gradients DT2DT1 of the mapping of space MCS1 with respect to each component direction in MCS2 at the tuplet PST is evaluated. The scalar product SP2 of said tuplets DT2DT1 with said gradients DM2DC is then computed, and a tuplet QL of quantization lengths for the component directions in MCS1 is established such that the ratios R between all pairs of lengths in said tuplet QL are equal to the ratios between the corresponding pairs of weighting coefficients in the tuplet W. The scalar product SP1 of said product SP2 with said tuplet QL is computed and the tuplet QL is adjusted to obtain adjusted tuplet QL′ such that the scalar product SP1 is equal to DELTA2 and the ratios R are unchanged. Each component of the differences D is divided by the corresponding component of the adjusted tuplet QL′ to obtain nearest integral values INT, and the integral values INT are encoded into a bit stream by means of an entropy encoder. The resultant bit stream is saved.