The present invention relates to image compression systems, and, more particularly, to methods and systems for quantizing coefficients of an image.
The field of data compression seeks to develop improved methods of compression. A variety of compression methods exist. In general, a compressor, or encoder, compresses raw data in an input stream and creates an output stream having compressed data. A decompressor, or decoder, converts the compressed data back to its original form. Certain compression methods are lossy. These methods, used for image compression, achieve better compression by losing some information from the input stream. When the compressed stream is decompressed, the converted result is not identical to the original data stream. If the loss of data is small, the perceptual difference of a compressed image is negligible.
Graphical images are present in many computer, internet and communication applications, but the image data size tends to be large. The image exists for human perception. When the image is compressed, however, it is not uncommon to lose image features to which the human eye is not sensitive. In such cases, the eye cannot see any image degradation.
Because human visual perception is sensitive to some frequencies but not to others, known image compression methods seek to remove those frequencies to which human visual perception is less sensitive. The discrete cosine transform ("DCT") is a widely used transform operation in image compression. DCT is the preferred transform for lossy image compression methods. A 2-dimensional DCT is applied to 8 by 8 block arrays of pixels in an image. The transformed coefficients generated by the DCT operation are quantized to provide the actual compression. Most DCT coefficients are small and become zero after quantization.
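By way of illustration only, the 2-dimensional DCT of a single 8 by 8 pixel block may be sketched as follows. The sketch assumes the orthonormal DCT-II form and uses NumPy; the function names dct_matrix, dct2 and idct2 are hypothetical and are not taken from any described embodiment.

```python
import numpy as np

def dct_matrix(n=8):
    """Return the n x n orthonormal DCT-II basis matrix (assumed normalization)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index, one per row
    x = np.arange(n).reshape(1, -1)  # sample index, one per column
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # the DC row uses a different scale factor
    return c

def dct2(block):
    """Apply the DCT along the columns and then the rows of a square block."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def idct2(coeffs):
    """Invert dct2; the basis matrix is orthogonal, so its transpose is its inverse."""
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c

if __name__ == "__main__":
    flat = np.full((8, 8), 10.0)     # a block with no spatial detail
    coeffs = dct2(flat)
    print(np.round(coeffs, 6))       # only the upper left (DC) entry is nonzero
```

A block with no spatial detail concentrates all of its energy in the single upper left coefficient, which is why smooth image regions yield many coefficients that are zero or near zero.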
Human visual perception is less sensitive to the high frequency components of an image represented by the higher DCT coefficients. After each 8×8 block array of DCT coefficients is calculated, it is quantized. This is the step wherein information loss occurs. Each number in the DCT coefficient matrix is divided by the corresponding number from a particular quantization matrix. The quantized result is rounded to the nearest integer. A large quantization factor usually is applied to higher frequency components within the DCT coefficient matrix. Thus, a known quantization matrix has higher quantization factors for higher frequency DCT coefficients.
There is a trade-off between image quality and the degree of quantization. A large quantization step size may produce unacceptably large image distortion. Finer quantization, however, leads to lower compression ratios. Thus, known image compression methods seek to quantize DCT coefficients in an efficient manner to provide a compressed image with little image distortion perceptible to the human eye. Because of human eyesight's natural high frequency roll-off, high frequencies play a less important role in known image compression methods than low frequencies.
A quantization matrix is an 8×8 matrix of step sizes, sometimes called quantums, one element for each DCT coefficient. A quantization matrix usually is symmetric. Step sizes may be small in the upper left corner of the matrix, which correlates to low frequencies. Step sizes may be large in the lower right corner of the matrix, which represents high frequencies. A step size of 1 is the most precise. A quantizer divides the DCT coefficient by its corresponding quantum, then rounds to the nearest integer. Large quantums drive small coefficients down to zero. This results in many high frequency coefficients becoming zero, and, therefore, an image is easier to code. The low frequency coefficients undergo only minor adjustments. Many zeros among the high frequency coefficients result in an efficient compression. Thus, known image compression methods may use a higher quantum for the high frequency coefficients with little noticeable image deterioration.
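The divide-and-round operation described above may be sketched as follows. This is a minimal illustration only: the step sizes come from a hypothetical linear ramp (make_step_sizes, with assumed parameters base and slope) chosen merely to be symmetric, small in the upper left corner and large in the lower right corner. It is not a default table from any standard, nor a quantization matrix generated by the invention.

```python
import numpy as np

def make_step_sizes(n=8, base=1.0, slope=2.0):
    """Hypothetical symmetric step-size matrix that grows with frequency index."""
    u = np.arange(n).reshape(-1, 1)  # row (vertical frequency) index
    v = np.arange(n).reshape(1, -1)  # column (horizontal frequency) index
    return base + slope * (u + v)

def quantize(coeffs, q):
    """Divide each coefficient by its quantum and round to the nearest integer.

    np.rint rounds exact halves to the nearest even integer.
    """
    return np.rint(coeffs / q).astype(np.int64)

def dequantize(levels, q):
    """Rescale the integer levels; the rounding error is not recoverable."""
    return levels * q

if __name__ == "__main__":
    q = make_step_sizes()
    coeffs = np.full((8, 8), 10.0)   # same magnitude at every frequency
    levels = quantize(coeffs, q)
    print(levels)                    # large quantums drive entries to zero
    print(np.abs(dequantize(levels, q) - coeffs).max())
```

With equal-magnitude inputs, the entries toward the lower right corner quantize to zero while the upper left entry is preserved exactly, which mirrors the behavior described for high and low frequency coefficients.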
Quantization matrices, or tables, may be generated by default or computed. Further, it may be desirable to quantize the DCT coefficients by some metric that optimizes image compression by losing coefficients to which the human visual system is not sensitive. Known methods compare previous image coefficients with an instant image coefficient to locate similar blocks and to find a best match. Differences between the previous image and the instant image are determined and then transformed into DCT coefficients. Other known methods use tiling or flipping of the DCT coefficients in order to generate larger size DCT coefficient images. Other known methods perform a preliminary investigation of the actual response of the human visual system to different DCT coefficients to determine a model metric for the quantization matrix partitioning.
Human visual systems, however, are not sensitive to particular frequencies or spatial differences within all images. In other words, a human visual system may be less sensitive to a particular DCT coefficient in one context and more sensitive in another context. These human visual system metrics may not apply in all instances of image compression. Thus, different quantization matrices may be desired to compress a plurality of images. Current methods of image compression are time-consuming and inefficient for determining and optimizing quantization matrices.
From the foregoing, it may be appreciated that a need has arisen for a system and method for generating image compression quantization matrices within an image compression system. In accordance with one embodiment of the present invention, a system and method for generating image compression quantization matrices is provided that substantially eliminates or reduces the disadvantages and problems associated with conventional image compression systems.
In accordance with one embodiment of the present invention, a method for generating quantization matrices is provided. The method includes the step of retrieving a model metric. The method also includes the step of determining a quantization coefficient matrix. The method also includes partitioning the quantization coefficient matrix with the model metric to generate a quantization matrix.
In accordance with another embodiment of the present invention, a method for compressing an image having pixel blocks is provided. The method includes the step of capturing the image having the blocks. The method also includes the step of transforming the blocks to a discrete cosine transform block array. The method also includes retrieving a model metric. The method also includes determining a quantization coefficient matrix. The method also includes partitioning the quantization coefficient matrix with the model metric to generate a quantization matrix. The method also includes quantizing the discrete cosine transform block array with the quantization matrix. The method also includes encoding the quantized transforms.
In accordance with another embodiment of the present invention, a system for generating image compression quantization matrices is provided. The compression system includes an image comprised of pixel blocks. The system also includes a discrete cosine transform array comprised of the discrete cosine transforms of the pixel blocks. This system includes a quantization coefficient matrix. This system also includes an image distortion model metric. This system also includes a quantizer that partitions the quantization coefficient matrix with the model metric and generates a quantization matrix. The quantizer also quantizes the discrete cosine transform array with the quantization matrix. The system also includes an encoder that encodes the quantized transforms. The system may include a decoder that decodes a received signal from the encoder and uncompresses the quantized transformed array.
Thus, the present invention obtains phase-reliable discrete cosine transform basis functions and uses them to partition a quantization matrix using a channel image distortion metric. The coefficients in the quantization coefficient matrix correlate to frequency bands and orientations in the channel image distortion metric. The partition is then used for vision optimized encoding.
The channel image distortion model metric produces image distortion metric values for vertical and horizontal orientations. For each orientation, image distortion metric values are generated for three frequency bands: low, mid, and high.
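Purely as a hypothetical illustration of grouping 8×8 coefficient positions by orientation and frequency band, a sketch follows. The assignment rules here are assumptions made for the example and are not described in this specification: a position is labeled "horizontal" when its column index is at least its row index and "vertical" otherwise, the band is taken from the larger of the two indices, and the band boundaries (band_edges) and the step_for lookup of per-group step sizes are invented placeholders.

```python
import numpy as np

BAND_NAMES = ("low", "mid", "high")

def partition_labels(n=8, band_edges=(3, 6)):
    """Label each coefficient position with an orientation and a band index.

    Both rules below are assumptions made for illustration only.
    """
    u = np.arange(n).reshape(-1, 1)  # row index (vertical frequency)
    v = np.arange(n).reshape(1, -1)  # column index (horizontal frequency)
    orientation = np.where(v >= u, "horizontal", "vertical")
    radius = np.maximum(u, v)        # assumed measure of "how high" a frequency is
    band = np.digitize(radius, band_edges)  # 0 = low, 1 = mid, 2 = high
    return orientation, band

def build_quantization_matrix(orientation, band, step_for):
    """Fill a matrix with one step size per (orientation, band) group."""
    q = np.empty(band.shape, dtype=float)
    for (i, j), b in np.ndenumerate(band):
        q[i, j] = step_for[(str(orientation[i, j]), BAND_NAMES[b])]
    return q

if __name__ == "__main__":
    orientation, band = partition_labels()
    # Placeholder step sizes; real values would come from a distortion metric.
    step_for = {
        ("horizontal", "low"): 2.0, ("horizontal", "mid"): 8.0,
        ("horizontal", "high"): 24.0, ("vertical", "low"): 2.0,
        ("vertical", "mid"): 10.0, ("vertical", "high"): 30.0,
    }
    print(build_quantization_matrix(orientation, band, step_for))
```

The sketch shows only how six groups (two orientations by three bands) could map onto the positions of a matrix; it does not compute any distortion metric values.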
A technical advantage of the present invention is that it treats the function of transforming an image for encoding as a continuous function. Thus, video as well as still images may be compressed. Another technical advantage of the present invention is that expansion to a larger image is optimized both conceptually and in implementation. Another technical advantage of the present invention is that the compression system does not rely on a human visual system. Another technical advantage of the present invention is that the compression procedure is automated by using the image distortion model metric followed by a partitioning step. Another technical advantage of the present invention is that the compression and encoding of an image has increased reliability and consistency.