This invention relates to transform coding of digital data, specifically to real domain processing of transform data. More particularly, this invention relates to reduced-error digital processing of inverse transformed data.
Transform coding is the name given to a wide family of techniques for data coding, in which each block of data to be coded is transformed by some mathematical function prior to further processing. A block of data may be a part of a data object being coded, or may be the entire object. The data generally represent some phenomenon, which may be for example a spectral or spectrum analysis, an image, an audio clip, a video clip, etc. The transform function is usually chosen to reflect some quality of the phenomenon being coded; for example, in coding of audio, still images and motion pictures, the Fourier transform or Discrete Cosine Transform (DCT) can be used to analyze the data into frequency terms or coefficients. Given the phenomenon being coded, there is generally a concentration of the information into a few frequency coefficients. Therefore, the transformed data can often be more economically encoded or compressed than the original data. This means that transform coding can be used to compress certain types of data to minimize storage space or transmission time over a communication link.
An example of transform coding in use is found in the Joint Photographic Experts Group (JPEG) international standard for still image compression, as defined by ITU-T Rec. T.81 (1992) | ISO/IEC 10918-1:1994, Information technology - Digital compression and coding of continuous-tone still images, Part 1: Requirements and Guidelines. Another example is the Moving Pictures Experts Group (MPEG) international standard for motion picture compression, defined by ISO/IEC 11172:1993, Information Technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbits/s. This MPEG-1 standard defines systems for both video compression (Part 2 of the standard) and audio compression (Part 3). A more recent MPEG video standard (MPEG-2) is defined by ITU-T Rec. H.262 | ISO/IEC 13818-2:1996, Information Technology - Generic coding of moving pictures and associated audio - Part 2: video. A newer audio standard is ISO/IEC 13818-3:1996, Information Technology - Generic coding of moving pictures and associated audio - Part 3: audio. All three international image data compression standards use the DCT on 8×8 blocks of samples to achieve image compression. DCT compression of images is used herein to give illustrations of the general concepts put forward below; a complete explanation can be found in Chapter 4, "The Discrete Cosine Transform (DCT)", in W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard, Van Nostrand Reinhold: New York, (1993).
Wavelet coding is another form of transform coding. Special localized basis functions allow wavelet coding to preserve edges and small details. For compression the transformed data is usually quantized. Wavelet coding is used for fingerprint identification by the FBI. Wavelet coding is a subset of the more general subband coding technique. Subband coding uses filter banks to decompose the data into particular bands. Compression is achieved by quantizing the lower frequency bands more finely than the higher frequency bands while sampling the lower frequency bands more coarsely than the higher frequency bands. A summary of wavelet, DCT, and other transform coding is given in Chapter 5, "Compression Algorithms for Diffuse Data", in Roy Hoffman, Data Compression in Digital Systems, Chapman and Hall: New York, (1997).
In any technology and for any phenomenon represented by digital data, the data before a transformation is performed are referred to as being "in the real domain". After a transformation is performed, the new data are often called "transform data" or "transform coefficients", and referred to as being "in the transform domain". The function used to take data from the real domain to the transform domain is called the "forward transform". The mathematical inverse of the forward transform, which takes data from the transform domain to the real domain, is called the respective "inverse transform".
In general, the forward transform will produce real-valued data, not necessarily integers. To achieve data compression, the transform coefficients are converted to integers by the process of quantization. Suppose that ($\lambda_i$) is a set of real-valued transform coefficients resulting from the forward transform of one unit of data. Note that one unit of data may be a one-dimensional or two-dimensional block of data samples or even the entire data. The "quantization values" ($q_i$) are parameters to the encoding process. The "quantized transform coefficients" or "transform-coded data" are the sequence of values ($a_i$) defined by the quantization function Q:

$$a_i = Q(\lambda_i) = \left\lfloor \frac{\lambda_i}{q_i} + 0.5 \right\rfloor \qquad (1)$$
where $\lfloor x \rfloor$ means, as usual, the greatest integer less than or equal to x. The resulting integers are then passed on for possible further encoding or compression before being stored or transmitted. To decode the data, the quantized coefficients are multiplied by the quantization values to give new "dequantized coefficients" ($\lambda_i'$) given by

$$\lambda_i' = q_i a_i. \qquad (2)$$
The process of quantization followed by dequantization (also called inverse quantization) can thus be described as "rounding to the nearest multiple of $q_i$". The quantization values are chosen so that the loss of information in the quantization step is within some specified bound. For example, for audio or image data, one quantization level is usually the smallest change in data that can be perceived. It is quantization that allows transform coding to achieve good data compression ratios. A good choice of transform allows quantization values to be chosen which will significantly cut down the amount of data to be encoded. For example, the DCT is chosen for image compression because the frequency components which result produce almost independent responses from the human visual system. This means that the coefficients relating to those components to which the visual system is less sensitive, namely the high-frequency components, may be quantized using large quantization values without perceptible loss of image quality. Coefficients relating to components to which the visual system is more sensitive, namely the low-frequency components, are quantized using smaller quantization values.
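Equations (1) and (2) can be illustrated with a minimal Python sketch. The function names and the sample coefficient and quantization values below are assumptions chosen only for illustration; they are not taken from any standard.

```python
import math

def quantize(coeff, q):
    """Equation (1): a_i = floor(lambda_i / q_i + 0.5)."""
    return math.floor(coeff / q + 0.5)

def dequantize(level, q):
    """Equation (2): lambda_i' = q_i * a_i."""
    return q * level

# Hypothetical transform coefficients and quantization values.
coeffs = [-415.38, 30.19, -4.65, 2.2]
qvals = [16, 11, 10, 16]

quantized = [quantize(c, q) for c, q in zip(coeffs, qvals)]
dequantized = [dequantize(a, q) for a, q in zip(quantized, qvals)]

print(quantized)    # [-26, 3, 0, 0]
print(dequantized)  # [-416, 33, 0, 0]
```

Each dequantized value is the multiple of its quantization value nearest to the original coefficient, so the error per coefficient is at most half the quantization value.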
The inverse transform also generally produces non-integer data. Usually the decoded data are required to be in integer form. For example, systems for the playback of audio data or the display of image data generally accept input in the form of integers. For this reason, a transform decoder generally includes a step that converts the non-integer data from the inverse transform to integer data, either by truncation or by rounding to the nearest integer. There is also often a limit on the range of the integer data output from the decoding process in order that the data may be stored in a given number of bits. For this reason the decoder also often includes a "clipping" stage that ensures that the output data are in an acceptable range. If the acceptable range is [a,b], then all values less than a are changed to a, and all values greater than b are changed to b.
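The conversion and clipping stages described above might look as follows in Python. This is a sketch under stated assumptions: the helper names, the 8-bit range [0, 255], and the sample inverse-transform outputs are hypothetical.

```python
import math

def to_integer(x, truncate=False):
    """Convert a high-precision value to an integer by truncation or rounding."""
    if truncate:
        return math.trunc(x)        # drop the fractional part
    return math.floor(x + 0.5)      # round to the nearest integer

def clip(value, a, b):
    """Force value into the acceptable range [a, b]."""
    return max(a, min(b, value))

# Hypothetical inverse-transform outputs for an 8-bit range.
idct_output = [-3.7, 12.4, 127.5, 255.6, 260.2]
decoded = [clip(to_integer(x), 0, 255) for x in idct_output]

print(decoded)  # [0, 12, 128, 255, 255]
```

The first and last two samples show clipping: information about how far the value lay outside the range is discarded.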
These rounding and clipping processes are often considered an integral part of the decoder, and they are the cause of inaccuracies in decoded data, in particular when decoded data are re-encoded. For example, the JPEG standard (Part 1) specifies that a source image sample is defined as an integer with precision P bits, with any value in the range 0 to $2^P - 1$. The decoder is expected to reconstruct the output from the inverse discrete cosine transform (IDCT) to the specified precision. For the baseline JPEG coding P is defined to be 8; for other DCT-based coding P can be 8 or 12. The MPEG-2 video standard states in Annex A (Discrete cosine transform) "The input to the forward transform and the output from the inverse transform is represented with 9 bits."
For JPEG, both the encoder source image test data and the decoder reference test data used for compliance testing are 8 bit/sample integers. Even though rounding to integers is typical, some programming languages convert from floating point to integers by truncation. Software implementations that accept this conversion by truncation introduce larger errors into the real-domain integer output from the inverse transform.
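The difference between the two conversions can be seen in a small Python sketch. The sample grid below is an assumption made only to exercise both conversions over a range of fractional parts.

```python
import math

# Hypothetical high-precision samples: 0.0, 0.1, ..., 255.0.
samples = [k / 10 for k in range(0, 2551)]

# Worst-case conversion error for rounding to nearest and for truncation.
round_err = max(abs(x - math.floor(x + 0.5)) for x in samples)
trunc_err = max(abs(x - math.trunc(x)) for x in samples)

print(round_err)  # 0.5
print(trunc_err)  # close to 0.9 on this grid; approaches 1 in general
```

Rounding bounds the error by one half, while truncation allows an error of almost one full integer step.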
The term "high-precision" is used herein to refer to numerical values which are stored to a precision more accurate than the precision used when storing the values as integers. Examples of high-precision numbers are floating-point or fixed-point representations of numbers.
In light of the problems described above regarding inaccuracies caused by digital processing techniques and by such things as rounding and clipping after the inverse transform of transform data, one aspect of this invention provides a method for processing transform data in the real domain. This method reduces the undesired errors in the data produced by such things as rounding to integers and clipping to an allowed range after the inverse transform. In an embodiment, this method includes: performing the inverse transform of the transform data such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; subtracting the converted and clipped integers from the high-precision numbers forming high-precision differences; manipulating these converted and clipped integers; and adding the manipulated integers to the high-precision differences to create manipulated high-precision numbers after the processing stage is complete.
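The steps of this embodiment can be sketched in Python as follows. The helper names, the range [0, 255], the sample data, and the particular manipulation (overwriting a single sample) are all assumptions made for illustration, not details given in the text.

```python
import math

def convert_and_clip(x, lo=0, hi=255):
    """Round to the nearest integer, then clip to [lo, hi]."""
    return max(lo, min(hi, math.floor(x + 0.5)))

def split(high_precision, lo=0, hi=255):
    """Return the converted/clipped integers and the high-precision differences."""
    integers = [convert_and_clip(x, lo, hi) for x in high_precision]
    differences = [x - n for x, n in zip(high_precision, integers)]
    return integers, differences

def recombine(manipulated_integers, differences):
    """Add the manipulated integers back to the stored differences."""
    return [n + d for n, d in zip(manipulated_integers, differences)]

# Hypothetical high-precision output of the inverse transform.
real_domain = [-2.3, 17.4, 128.6, 257.9]
integers, differences = split(real_domain)

# Hypothetical manipulation: only the second sample is edited.
manipulated = list(integers)
manipulated[1] = 40

result = recombine(manipulated, differences)
```

In this sketch the untouched samples recover their original high-precision values, including the out-of-range values that clipping would otherwise have discarded, while the edited sample carries its new integer value plus the original fractional difference.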
It is another aspect of this invention to provide a method for processing transform-coded data in the real domain which reduces the undesired errors in the data produced by the converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, the method includes: performing the inverse quantization of the transform-coded data; performing the inverse transform of the transform data thus produced, such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; subtracting the converted and clipped integers from the high-precision numbers forming high-precision differences; manipulating these converted and clipped integers; and adding the manipulated integers to the high-precision differences to create manipulated high-precision numbers after the processing stage is complete.
Still another aspect of the present invention is to provide a method for processing transform-coded data in the real domain to produce new transform-coded data, which reduces the error produced by converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, this method includes: performing the inverse quantization of the transform-coded data; performing the inverse transform of the transform data thus produced, such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; subtracting the converted and clipped integers from the high-precision numbers forming high-precision differences; processing these converted and clipped integers; adding the processed integers to the high-precision differences to form processed high-precision numbers after the processing stage is complete; performing the forward transform on the processed high-precision numbers; and performing quantization on the new transform data. If the errors in the forward and inverse transforms and in the processing are sufficiently small, there will be no undesirable errors produced in the new quantized transform-domain data.
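A minimal Python sketch of this decode, process, and re-encode cycle is given below. It assumes a one-dimensional orthonormal 8-point DCT in place of the 8×8 block transform, an identity processing step, and a hypothetical block containing only a large DC level; all function names are illustrative.

```python
import math

N = 8

def scale(u):
    """Orthonormal DCT scaling factor for frequency index u."""
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

def fdct(samples):
    """Forward transform: orthonormal 8-point DCT-II."""
    return [scale(u) * sum(s * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                           for x, s in enumerate(samples))
            for u in range(N)]

def idct(coeffs):
    """Inverse transform: inverse of fdct."""
    return [sum(scale(u) * c * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                for u, c in enumerate(coeffs))
            for x in range(N)]

def quantize(coeffs, qvals):
    return [math.floor(c / q + 0.5) for c, q in zip(coeffs, qvals)]

def dequantize(levels, qvals):
    return [a * q for a, q in zip(levels, qvals)]

def convert_and_clip(x, lo=0, hi=255):
    return max(lo, min(hi, math.floor(x + 0.5)))

def reencode(levels, qvals, process):
    """Re-encode while carrying the high-precision differences along."""
    real = idct(dequantize(levels, qvals))
    ints = [convert_and_clip(x) for x in real]
    diffs = [x - n for x, n in zip(real, ints)]
    restored = [n + d for n, d in zip(process(ints), diffs)]
    return quantize(fdct(restored), qvals)

def naive_reencode(levels, qvals, process):
    """Re-encode from the rounded and clipped integers alone, for contrast."""
    real = idct(dequantize(levels, qvals))
    ints = process([convert_and_clip(x) for x in real])
    return quantize(fdct(ints), qvals)

# Hypothetical block whose samples decode to about 282.8, above the 8-bit range.
levels = [100, 0, 0, 0, 0, 0, 0, 0]
qvals = [8] * 8
identity = lambda ints: ints

print(reencode(levels, qvals, identity))        # [100, 0, 0, 0, 0, 0, 0, 0]
print(naive_reencode(levels, qvals, identity))  # [90, 0, 0, 0, 0, 0, 0, 0]
```

In this example every sample is clipped to 255, so the naive path loses the DC level, whereas the path that adds the differences back reproduces the original quantized data.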
Still another aspect of the present invention is to provide a method for processing the high-precision differences in the real domain to produce new high-precision differences, which reduces the error produced by converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, this method includes: performing the inverse transform of the transform data such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; subtracting the converted and clipped integers from the high-precision numbers forming high-precision differences; manipulating these converted and clipped integers; manipulating these high-precision differences; and adding the manipulated integers to the manipulated high-precision differences to create manipulated high-precision numbers after the processing stage is complete.
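One possible reading of manipulating the differences alongside the integers is a manipulation that moves samples, where each difference must follow its integer. The Python sketch below assumes a mirror (reversal) operation and hypothetical sample values; the text does not prescribe any particular manipulation.

```python
import math

def convert_and_clip(x, lo=0, hi=255):
    """Round to the nearest integer, then clip to [lo, hi]."""
    return max(lo, min(hi, math.floor(x + 0.5)))

# Hypothetical high-precision output of the inverse transform.
real_domain = [-2.3, 17.4, 128.6, 257.9]
integers = [convert_and_clip(x) for x in real_domain]
differences = [x - n for x, n in zip(real_domain, integers)]

# The same reordering is applied to the integers and to the differences.
mirrored_integers = integers[::-1]
mirrored_differences = differences[::-1]

# Recombining yields the mirrored high-precision data.
result = [n + d for n, d in zip(mirrored_integers, mirrored_differences)]
```

Because both sequences are reordered identically, the result equals the mirrored original high-precision data up to floating-point error.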
It is another aspect of this invention to provide a method for processing transform-coded data in the real domain which reduces the undesired errors in the data produced by the converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, the method includes: performing the inverse quantization of the transform-coded data; performing the inverse transform of the transform data thus produced, such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; subtracting the converted and clipped integers from the high-precision numbers forming high-precision differences; manipulating these converted and clipped integers; manipulating these high-precision differences; and adding the manipulated integers to the manipulated high-precision differences to create manipulated high-precision numbers after the processing stage is complete.
Still another aspect of the present invention is to provide a method for processing transform-coded data in the real domain to produce new transform-coded data, which reduces the error produced by converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, this method includes: performing the inverse quantization of the transform-coded data; performing the inverse transform of the transform data thus produced, such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; subtracting the converted and clipped integers from the high-precision numbers forming high-precision differences; processing these converted and clipped integers; manipulating these high-precision differences; adding the processed integers to the manipulated high-precision differences to form processed high-precision numbers after the processing stage is complete; performing the forward transform on the processed high-precision numbers; and performing quantization on the new transform data. If the errors in the forward and inverse transforms and in the processing are sufficiently small, there will be no undesirable errors produced in the new quantized transform-domain data.
Still another aspect of the present invention is to provide a method for selecting between the initial high-precision numbers in the real domain and the processed converted integer data, which reduces the error produced by converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, this method includes: performing the inverse transform of the transform data such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; manipulating these converted and clipped integers; and selecting between the manipulated integers and the high-precision numbers to create manipulated high-precision numbers after the processing stage is complete.
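The selection step might be sketched in Python as below. The selection rule used here (keep the initial high-precision number wherever the manipulation left the integer unchanged, otherwise take the manipulated integer) is an assumption; the text does not state the rule. The function name and sample data are likewise hypothetical.

```python
import math

def convert_and_clip(x, lo=0, hi=255):
    """Round to the nearest integer, then clip to [lo, hi]."""
    return max(lo, min(hi, math.floor(x + 0.5)))

def select(high_precision, integers, manipulated):
    """Assumed rule: unchanged samples keep their high-precision value."""
    return [x if m == n else float(m)
            for x, n, m in zip(high_precision, integers, manipulated)]

# Hypothetical high-precision output of the inverse transform.
real_domain = [-2.3, 17.4, 128.6, 257.9]
integers = [convert_and_clip(x) for x in real_domain]

# Hypothetical manipulation: only the second sample is edited.
manipulated = list(integers)
manipulated[1] = 40

result = select(real_domain, integers, manipulated)
print(result)  # [-2.3, 40.0, 128.6, 257.9]
```

Unlike the difference-based approach, this variant needs no subtraction or addition: untouched samples pass through exactly, and edited samples take the new integer value.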
Still another aspect of the present invention is to provide a method for selecting between the processed high-precision numbers in the real domain and the processed converted integer data, which reduces the error produced by converting to integers and clipping to an allowed range after the inverse transform. In an embodiment, this method includes: performing the inverse transform of the transform data such that the real-domain data produced are in the form of high-precision numbers; converting the high-precision numbers to integers and clipping to an allowed range; manipulating these converted and clipped integers; manipulating the high-precision numbers to form manipulated high-precision numbers; and selecting between the manipulated integers and the manipulated high-precision numbers to create manipulated high-precision numbers after the processing stage is complete.
There is no requirement that the input data to the methods described herein need come from a single data source. Thus, this invention is not restricted to the real-domain processing of data from a single source, but also applies to real-domain processing of data from multiple sources, such as the merging of images or audio data.
The quantization described in the background is the linear quantization used in international image data compression standards such as JPEG and MPEG. There is no requirement that the quantization be linear. Any mapping that reduces the number of transform data levels in a deterministic way can be used with this invention. The quantization step has been described mathematically with a division in Equation (1). Actual embodiments may use a lookup table or a sequence of comparisons to achieve similar results.
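A non-linear quantizer implemented as a sequence of comparisons against a table could be sketched in Python as follows. The decision thresholds, level indices, and reconstruction values are hypothetical and chosen only to show the mechanism.

```python
import bisect

# Hypothetical decision thresholds (sorted) separating seven quantization levels.
thresholds = [-40.0, -10.0, -2.0, 2.0, 10.0, 40.0]
levels = [-3, -2, -1, 0, 1, 2, 3]

# Hypothetical reconstruction value for each level, used on decoding.
reconstruction = [-64.0, -20.0, -5.0, 0.0, 5.0, 20.0, 64.0]

def quantize_nonlinear(coeff):
    """Find the interval containing coeff by binary search over the thresholds."""
    return levels[bisect.bisect_right(thresholds, coeff)]

def dequantize_nonlinear(level):
    """Look up the reconstruction value for a quantized level."""
    return reconstruction[levels.index(level)]

print(quantize_nonlinear(0.5))     # 0
print(quantize_nonlinear(12.0))    # 2
print(quantize_nonlinear(-100.0))  # -3
print(dequantize_nonlinear(2))     # 20.0
```

The mapping is deterministic and reduces the number of transform data levels, which is the only property the text requires of the quantization.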
It is a further aspect of the invention to provide apparatus, a computer product and an article of manufacture comprising a computer usable medium having computer readable program code means embodied therein for causing a computer to perform the methods of the present invention.