1. Field of the Invention
The present invention relates to systems and methods for increasing compression of data streams containing image data, and in particular to systems and methods that increase compression by predicting values for certain low frequency coefficients of pixel block transforms.
2. Description of the Related Art
Many important image compression methods process images as independent blocks of pixels. For example, such families of compression standards as JPEG, MPEG, H.320, and so forth, specify a step involving discrete cosine transformation (“DCT”) of independent, non-overlapping 8×8 blocks of pixels in the source image followed by quantization of the resulting transform coefficients. See, e.g., Jack, 1996, Video Demystified, HighText Interactive Inc., San Diego, Calif. The quantized transform coefficients are transmitted from a transmitter-encoder to a receiver-decoder.
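By way of illustration only, the block transformation and quantization step described above may be sketched as follows. The function names `dct2` and `quantize`, the orthonormal DCT scaling, and the single uniform quantization step of 16 are assumptions made for this sketch; the cited standards specify their own scaling and per-coefficient quantization tables.

```python
import math

N = 8  # block size used by the JPEG/MPEG-style coders discussed above


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            s = 0.0
            for r in range(n):
                for c in range(n):
                    s += (block[r][c]
                          * math.cos((2 * r + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * c + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def quantize(coeffs, step):
    """Uniform quantization (assumed here; not a standard's table)."""
    return [[round(x / step) for x in row] for row in coeffs]


# A perfectly flat block: all of its energy lands in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct2(flat)
levels = quantize(coeffs, step=16)
```

For the flat block, only the zero frequency coefficient `coeffs[0][0]` is non-zero, which is the sense in which the transform captures regularities within a single block.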
Such transformation and quantization together achieve compression by exploiting the significant regularities and correlations that typically occur between the values of pixels in 8×8 blocks. However, such methods ignore any regularities and correlations that may occur between pixels in different pixel blocks which are treated as independent by these methods.
Certain work has been reported which attempts to recognize image regularities at scales of a pixel block or larger. Exemplary of these are Pennebaker et al., 1993, JPEG Still Image Compression, Van Nostrand Reinhold, chap. 16, which discloses fitting quadratic surfaces to the average values of pixels (equivalent to the “DC”, or lowest order, transform coefficient) in adjacent blocks, a computationally complex process; Lakhani, 1996, “Improved Image Reproduction from DC Components”, Opt. Eng. 35:3449-3452, which discloses equations for predicting low frequency transform coefficients from DC coefficients that are improved from those in the JPEG standard; and Jeon et al., 1995, “Blocking Artifacts Reduction in Image Coding Based on Minimum Block Boundary Discontinuity”, Proc SPIE 2501:189-209, which discloses a complex and computationally expensive iterative method for interpolating pixels in order to zero block boundary discontinuities.
This reported work suffers from one or more problems, such as not being directed to maximally improving image compression, ignoring or at best inadequately treating regularities that may exist at scales in an image greater than a pixel block, requiring excessive computational resources, and so forth. What is needed, therefore, is a computationally efficient method and system directed primarily to achieving increased data compression by exploiting additional regularities and correlations in images not exploited by known compression methods and standards.
Citation of a reference herein, or throughout this specification, is not to be construed as an admission that such reference is prior art to the Applicant's invention of the invention subsequently claimed.
The objects of the present invention are to provide improvements generally applicable to certain types of encoders and decoders for image-containing data of all types which overcome the above identified problems in the current art.
Encoder/decoder pairs to which the present invention is applicable are those that, during image compression, transform the image from the spatial domain, where the image is represented as a spatial array of pixels, to a transform domain, where the image is represented as coefficients of the basis functions used in the transform method, followed by quantization of the resulting transform coefficients. During decompression, the decoder reverses these steps. In particular, relevant encoder/decoder pairs divide the spatial domain image into a plurality of non-overlapping sub-blocks of pixels and perform the transformation/inverse transformation independently on each sub-block in the image.
In the relevant types of encoder/decoder pairs, the improvement of the present invention includes, in the encoder, an additional step which predicts certain low-order, or low-frequency (“LF”), transform coefficients. The predicted LF coefficients are then subtracted from the actual LF transform coefficients to form LF difference coefficients, which are quantized and transmitted to the decoder. In the decoder, these steps are reversed, namely, the LF coefficients are again predicted, and the predicted coefficients are added to the transmitted quantized LF difference coefficients to arrive at the original LF coefficients, up to quantization errors.
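The subtract, quantize, and add-back steps just described may be sketched as follows, purely for illustration. The function names `encode_lf` and `decode_lf`, the sample coefficient values, and the uniform quantization step are assumptions of this sketch and are not taken from the specification. The sketch also assumes that the encoder and the decoder form identical predictions.

```python
def encode_lf(actual, predicted, step):
    """Encoder side: quantize the LF difference coefficients."""
    return [round((a - p) / step) for a, p in zip(actual, predicted)]


def decode_lf(levels, predicted, step):
    """Decoder side: dequantize the differences and add the same prediction."""
    return [p + q * step for q, p in zip(levels, predicted)]


# Hypothetical values for three LF coefficients of one block.
actual = [41.0, -17.5, 6.2]
predicted = [38.0, -20.0, 5.0]
step = 4.0

levels = encode_lf(actual, predicted, step)      # transmitted to the decoder
restored = decode_lf(levels, predicted, step)    # recovered LF coefficients
```

With uniform rounding as assumed here, each restored coefficient differs from the original by at most half a quantization step. The benefit arises when the prediction is good, since the differences are then small and quantize to few non-zero levels.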
The LF coefficient prediction according to the present invention is based on the fact that image-data compression, in addition to that already realized by sub-block transformation and quantization, can be achieved by capturing regularities and correlations between pixel values present in adjacent sub-blocks (“inter-block regularities”). These inter-block regularities can be advantageously exploited by predicting LF transform coefficients to be those that are necessary to smooth differences between adjacent sub-blocks so that the image is smooth at sub-block boundaries. In a preferred embodiment, the differences between adjacent sub-blocks are represented by differences in the average intensities of the adjacent sub-blocks.
In detail, these objects are achieved by the following embodiments of the present invention. In a first embodiment, the present invention includes a method for compressing an image presented as image data in the form of a pixel array comprising: transforming a plurality of pixel blocks to transform coefficients in a frequency domain, wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers the pixel array, and wherein the transform coefficients represent each pixel block and include a zero frequency transform coefficient, one or more selected low frequency transform coefficients, and remaining transform coefficients, predicting for each pixel block the selected low frequency transform coefficients from a linear combination of the zero frequency transform coefficient of the pixel block and of the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, subtracting for each pixel block the predicted selected low frequency transform coefficients from the selected low frequency transform coefficients to form difference transform coefficients, quantizing for each pixel block the difference coefficients and the remaining transform coefficients, and representing the image by compressed image data comprising the zero frequency transform coefficient, the quantized difference coefficients, and the quantized remaining coefficients for each of the plurality of pixel blocks.
In a first aspect of the first embodiment, the step of predicting for each pixel block further comprises: determining an interpolating pixel array having interpolating pixel values linearly interpolating differences between the zero frequency transform coefficient of the pixel block and the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, transforming the interpolating pixel array to transform coefficients in the frequency domain, and selecting the predicted selected low frequency transform coefficients as the corresponding transform coefficients of the interpolating pixel array.
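One possible reading of this aspect is sketched below, by way of example only. The particular interpolation rule (a linear ramp from the block center toward each orthogonally adjacent block, with horizontal and vertical contributions summed), the function names, the default choice of selected coefficients, and the orthonormal DCT scaling are all assumptions of this sketch; this passage does not fix the interpolation weights.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for r in range(n):
                for c in range(n):
                    s += (block[r][c]
                          * math.cos((2 * r + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * c + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def interpolating_array(dc, dc_left, dc_right, dc_up, dc_down, n=8):
    """Pixel deviations that ramp linearly toward each neighbor's mean.

    With the orthonormal DCT assumed here, a block's mean equals DC / n.
    """
    d_left, d_right = (dc_left - dc) / n, (dc_right - dc) / n
    d_up, d_down = (dc_up - dc) / n, (dc_down - dc) / n
    arr = [[0.0] * n for _ in range(n)]
    for r in range(n):
        y = (r + 0.5) / n - 0.5            # below 0 in the upper half
        vert = -y * d_up if y < 0 else y * d_down
        for c in range(n):
            x = (c + 0.5) / n - 0.5        # below 0 in the left half
            horiz = -x * d_left if x < 0 else x * d_right
            arr[r][c] = horiz + vert       # dimensions treated independently
    return arr


def predict_lf(dc, dc_left, dc_right, dc_up, dc_down,
               selected=((0, 1), (1, 0))):
    """Transform the interpolating array and keep the selected coefficients."""
    coeffs = dct2(interpolating_array(dc, dc_left, dc_right, dc_up, dc_down))
    return {(u, v): coeffs[u][v] for (u, v) in selected}


# Brightness rising from left to right, no vertical change.
ramp = predict_lf(800.0, 720.0, 880.0, 800.0, 800.0)
flat = predict_lf(800.0, 800.0, 800.0, 800.0, 800.0)
```

Because only zero frequency coefficients enter the prediction, a decoder holding the same zero frequency coefficients can repeat the computation exactly.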
In a second aspect of the first embodiment, the step of selecting selects the predicted selected low frequency transform coefficients as those transform coefficients present in an upper left square sub-array of size three-by-three of the transformed interpolating pixel array, excluding the zero frequency transform coefficient of the transformed interpolating pixel array. In a third aspect of the first embodiment, the pixel values of the interpolating pixel array are weighted sums of differences between the zero frequency transform coefficient of the pixel block and the zero frequency transform coefficient of each pixel block orthogonally adjacent to the pixel block, and the pixel values are linearly interpolated in a dimension-independent manner.
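For illustration, the selection of the upper left three-by-three sub-array, excluding the zero frequency coefficient, may be expressed as a set of index pairs. The names `LF_INDICES` and `split_coefficients`, and the representation of a block's coefficients as a list of rows, are assumptions of this sketch.

```python
# Index pairs (row, column) of the upper-left 3 x 3 sub-array, minus (0, 0).
LF_INDICES = [(u, v) for u in range(3) for v in range(3) if (u, v) != (0, 0)]


def split_coefficients(coeffs):
    """Split one block's coefficients into DC, selected LF, and the rest."""
    dc = coeffs[0][0]
    lf = {(u, v): coeffs[u][v] for (u, v) in LF_INDICES}
    rest = {
        (u, v): coeffs[u][v]
        for u in range(len(coeffs))
        for v in range(len(coeffs[0]))
        if (u, v) != (0, 0) and (u, v) not in lf
    }
    return dc, lf, rest


# Dummy 8 x 8 coefficient array whose entries encode their own position.
example = [[10 * u + v for v in range(8)] for u in range(8)]
dc, lf, rest = split_coefficients(example)
```

For an 8×8 block this yields one zero frequency coefficient, eight selected low frequency coefficients, and fifty-five remaining coefficients.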
In a second embodiment, the present invention includes a computer readable medium encoded with program instructions for causing one or more processors to perform the methods and aspects of the methods of the first embodiment.
In a third embodiment, the present invention includes a method for compressing an image presented as image data in the form of a pixel array comprising: transforming a plurality of pixel blocks to transform coefficients in a frequency domain, wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers the pixel array, and wherein the transform coefficients represent each pixel block and include a zero frequency transform coefficient, one or more selected low frequency transform coefficients, and remaining transform coefficients, predicting for each pixel block the selected low frequency transform coefficients from a linear combination of differences between pixels along the edges of the block and pixels along edges of pixel blocks orthogonally adjacent to the pixel block, subtracting for each pixel block the predicted selected low frequency transform coefficients from the selected low frequency transform coefficients to form difference transform coefficients, quantizing for each pixel block the difference coefficients and the remaining transform coefficients, and representing the image by compressed image data comprising the zero frequency transform coefficient, the quantized difference coefficients, and the quantized remaining coefficients for each of the plurality of pixel blocks.
In a first aspect of the third embodiment, the linear combination of differences comprises a linear combination of averages of all pixels along the edges of the block and averages of all pixels along edges of pixel blocks orthogonally adjacent to the pixel block.
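The edge-average differences of this aspect may be sketched as follows, by way of example only. The function names, the choice of the single row or column of pixels facing each shared boundary, and the weights passed to `predict_from_edges` are hypothetical; this passage does not specify the weights of the linear combination.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def edge_differences(block, left, right, up, down):
    """Neighbor edge average minus the block's facing edge average."""
    return {
        "left": mean(row[-1] for row in left) - mean(row[0] for row in block),
        "right": mean(row[0] for row in right) - mean(row[-1] for row in block),
        "up": mean(up[-1]) - mean(block[0]),
        "down": mean(down[0]) - mean(block[-1]),
    }


def predict_from_edges(diffs, weights):
    """One predicted coefficient as a linear combination of edge differences."""
    return sum(weights[k] * diffs[k] for k in diffs)


def constant_block(value, n=8):
    return [[value] * n for _ in range(n)]


diffs = edge_differences(
    constant_block(10), left=constant_block(6), right=constant_block(14),
    up=constant_block(10), down=constant_block(10),
)
# Hypothetical weights for a horizontal low frequency coefficient.
weights = {"left": -0.5, "right": 0.5, "up": 0.0, "down": 0.0}
prediction = predict_from_edges(diffs, weights)
```

In this sketch a separate set of weights would be supplied for each selected low frequency coefficient.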
In a fourth embodiment, the present invention includes a method for reconstructing an image presented in the form of compressed image data for a plurality of pixel blocks comprising: retrieving the compressed image data, wherein the compressed image data comprise for each pixel block a zero frequency transform coefficient, quantized difference transform coefficients, and quantized remaining transform coefficients, and wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers a pixel array representing the image, dequantizing the quantized difference transform coefficients and the quantized remaining transform coefficients to provide difference transform coefficients and remaining transform coefficients, respectively, predicting for each pixel block selected low frequency transform coefficients from a linear combination of the zero frequency transform coefficient of the pixel block and of the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, adding for each pixel block the predicted selected low frequency transform coefficients to the difference transform coefficients to form selected low frequency transform coefficients, inverse transforming for each pixel block the zero frequency transform coefficient, the selected low frequency transform coefficients, and the remaining transform coefficients, in order that the plurality of pixel blocks is reconstructed, and reconstructing the pixel array from the plurality of reconstructed pixel blocks.
In a first aspect of the fourth embodiment, the step of predicting for each pixel block further comprises: determining an interpolating pixel array having interpolating pixel values linearly interpolating differences between the zero frequency transform coefficient of the pixel block and the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, transforming the interpolating pixel array to transform coefficients in the frequency domain, and selecting the predicted selected low frequency transform coefficients as the corresponding transform coefficients of the interpolating pixel array.
In a fifth embodiment, the present invention includes a computer readable medium encoded with program instructions for causing one or more processors to perform the methods and the aspects of the methods of the fourth embodiment.
In a sixth embodiment, the present invention includes a system for compressing an image presented as image data in the form of a pixel array comprising: means for transforming a plurality of pixel blocks to transform coefficients in a frequency domain, wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers the pixel array, and wherein the transform coefficients represent each pixel block and include a zero frequency transform coefficient, one or more selected low frequency transform coefficients, and remaining transform coefficients, means for predicting for each pixel block the selected low frequency transform coefficients from a linear combination of the zero frequency transform coefficient of the pixel block and of the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, means for subtracting for each pixel block the predicted selected low frequency transform coefficients from the selected low frequency transform coefficients to form difference transform coefficients, means for quantizing for each pixel block the difference coefficients and the remaining transform coefficients, and means for representing the image by compressed image data comprising the zero frequency transform coefficient, the quantized difference coefficients, and the quantized remaining coefficients for each of the plurality of pixel blocks.
In a seventh embodiment, the present invention includes a system for reconstructing an image presented in the form of compressed image data for a plurality of pixel blocks comprising: means for retrieving the compressed image data, wherein the compressed image data comprise for each pixel block a zero frequency transform coefficient, quantized difference transform coefficients, and quantized remaining transform coefficients, and wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers a pixel array representing the image, means for dequantizing the quantized difference transform coefficients and the quantized remaining transform coefficients to provide difference transform coefficients and remaining transform coefficients, respectively, means for predicting for each pixel block selected low frequency transform coefficients from a linear combination of the zero frequency transform coefficient of the pixel block and of the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, means for adding for each pixel block the predicted selected low frequency transform coefficients to the difference transform coefficients to form selected low frequency transform coefficients, means for inverse transforming for each pixel block the zero frequency transform coefficient, the selected low frequency transform coefficients, and the remaining transform coefficients, in order that the plurality of pixel blocks is reconstructed, and means for reconstructing the pixel array from the plurality of reconstructed pixel blocks.
In an eighth embodiment, the present invention includes a system for compressing an image presented as image data in the form of pixel array data comprising: one or more processors for executing program instructions, and one or more memory units for storing the pixel array to be processed and program instructions, wherein said program instructions cause said one or more processors to transform a plurality of pixel blocks to transform coefficients in a frequency domain, wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers the pixel array, and wherein the transform coefficients represent each pixel block and include a zero frequency transform coefficient, one or more selected low frequency transform coefficients, and remaining transform coefficients, to predict for each pixel block the selected low frequency transform coefficients from a linear combination of the zero frequency transform coefficient of the pixel block and of the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, to subtract for each pixel block the predicted selected low frequency transform coefficients from the selected low frequency transform coefficients to form difference transform coefficients, to quantize for each pixel block the difference coefficients and the remaining transform coefficients, and to represent the image by compressed image data comprising the zero frequency transform coefficient, the quantized difference coefficients, and the quantized remaining coefficients for each of the plurality of pixel blocks.
In a ninth embodiment, the present invention includes a system for reconstructing an image presented in the form of compressed image data for a plurality of pixel blocks comprising: one or more processors for executing program instructions, and one or more memory units for storing the compressed image data to be processed and program instructions, wherein said program instructions cause said one or more processors to retrieve the compressed image data, wherein the compressed image data comprise for each pixel block a zero frequency transform coefficient, quantized difference transform coefficients, and quantized remaining transform coefficients, and wherein the pixel blocks are rectangular, non-overlapping, and the plurality of pixel blocks covers a pixel array representing the image, to dequantize the quantized difference transform coefficients and the quantized remaining transform coefficients to provide difference transform coefficients and remaining transform coefficients, respectively, to predict for each pixel block selected low frequency transform coefficients from a linear combination of the zero frequency transform coefficient of the pixel block and of the zero frequency transform coefficients of pixel blocks orthogonally adjacent to the pixel block, to add for each pixel block the predicted selected low frequency transform coefficients to the difference transform coefficients to form selected low frequency transform coefficients, to inverse transform for each pixel block the zero frequency transform coefficient, the selected low frequency transform coefficients, and the remaining transform coefficients, in order that the plurality of pixel blocks is reconstructed, and to reconstruct the pixel array from the plurality of reconstructed pixel blocks.