1. Field of the Invention
The invention is directed to a method of decompressing images compressed in accordance with the Joint Photographic Expert Group (JPEG) proposed adaptive discrete cosine transform (ADCT) standard. More particularly, the invention is directed to a method of reducing decompression artifacts in document-type images resulting from decompression of standard JPEG ADCT compressed images.
2. Description of Related Art
Data compression is required in data handling processes when too much data is present for practical applications. Commonly, data compression is used in communication links, where the time to transmit is long or where bandwidth is limited. Another use for data compression is in data storage to substantially reduce the amount of media space required to store data. Data compression is also used in digital copiers, which require intermediate storage for copier functions such as collation and reprinting. Generally, scanned images, i.e., electronic representations of hard copy documents, require large amounts of data and thus are desirable candidates for data compression.
A number of different compression techniques exist, and many of these are proprietary to individual users. However, standards among the various techniques are desirable to enable communication between data handling devices. With the advent of multimedia communication, formerly dissimilar devices are and will be required to communicate. For example, it is desirable to enable facsimile machines to directly communicate with printers. Presently, compression standards are generally distinct for different data handling devices such as these. Thus, common data compression standards are needed.
Three major schemes for image compression are currently being studied by international standardization groups. A first scheme, for facsimile-type image transmission that is primarily binary, is under study by the JBIG (Joint Binary Image Group) committee. A second scheme is being developed for television and film by the MPEG (Motion Pictures Expert Group). For non-moving images, i.e., still images that are more general than those covered by JBIG, the JPEG (Joint Photographic Expert Group) is developing a device-independent compression standard, which uses an adaptive discrete cosine transform (ADCT) scheme.
ADCT is described for example by W. H. Chen and C. H. Smith in "Adaptive Coding of Monochrome and Color Images", IEEE Trans. Comm., Vol. COM-25, pp. 1285-1292, November 1977. ADCT is the method disseminated by the JPEG committee and is a lossy system that reduces data redundancies based on pixel to pixel correlations. Generally, an image does not change significantly on a pixel to pixel basis. Further, it is presumed that an image has a "natural spatial correlation." In natural scenes, correlation is generalized but not exact because noise makes each pixel differ somewhat from neighboring pixels.
Typically, as shown in FIG. 1, the process of data compression utilizes a tile memory 10 storing an M×M tile of the image. For illustrative purposes, square tiles based on the JPEG recommendations are used in this description, but the inventive method can be performed with any form of tiling. From the portion of the image stored in tile memory 10, a frequency space representation of the image is formed at a transformer 12 using the discrete cosine transform (DCT). For implementation, hardware such as the C-Cube Microsystems CL550A JPEG image compression processor is used, which operates in either the compression or the decompression mode according to the proposed JPEG standard. A divisor/quantization device 14 obtains a quantized DCT value by dividing each DCT value by a distinct value from a set of values known as a Q-Table, stored in a Q-Table memory 16. The integer portion of the divided value is returned as the quantized DCT value. A Huffman encoder 20 then statistically encodes the quantized DCT values to generate a compressed image that is output for storage, transmission, or the like.
The current ADCT compression method divides an image into M×M non-overlapping pixel blocks, where M=8. The selection of M=8 is a compromise. A larger block permits a higher obtainable compression ratio, but a larger block is also more likely to contain noncorrelated pixels, which reduces the compression ratio. If the block is smaller, greater pixel correlation within the block is possible but with less overall data compression. Within a document image particularly, edges of the image are more likely to be encountered within an 8×8 block than in a scene forming a natural image. Thus, for document images, the assumption of a natural spatial correlation fails to some extent. Therefore, although the assumptions of the ADCT proposal work well for photographs containing continuous tones and many levels of gray pixels, these assumptions often work poorly for the reproduction of document images, which have significant high frequency components and many high contrast edges.
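The tiling step described above can be sketched as follows. This is a minimal illustration only, assuming the image is held as a list of rows whose height and width are exact multiples of M; the function name tile_image and the handling of other image sizes are assumptions, not part of the described method.

```python
def tile_image(image, m=8):
    """Split an image (list of rows) into non-overlapping m x m tiles.

    Assumption: height and width are exact multiples of m. Padding of
    partial tiles is outside the scope of this sketch.
    """
    height, width = len(image), len(image[0])
    if height % m or width % m:
        raise ValueError("image dimensions must be multiples of the tile size")
    tiles = []
    for top in range(0, height, m):          # walk tile rows
        for left in range(0, width, m):      # walk tile columns
            tiles.append([row[left:left + m] for row in image[top:top + m]])
    return tiles


# Hypothetical 16 x 24 image whose pixel value encodes its position.
image = [[r * 24 + c for c in range(24)] for r in range(16)]
tiles = tile_image(image)
print(len(tiles))  # 2 tile rows x 3 tile columns
```

Tiles are returned in raster order, so each tile can be transformed and quantized independently of its neighbors.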
Compression schemes commonly use a set of basis functions to utilize intra-block correlations. Basis functions define image data as a projection onto a set of orthogonal functions on an interval. ADCT uses cosine functions as the basis functions and DCT as the projection step. In the first step of the ADCT standard, the image is tiled into 8×8 blocks. Within each block, a set of 64 DCT coefficients is determined for the pixels in the block. The DCT coefficients represent the coefficients of each cosine term of the discrete cosine transform of the 8×8 block.
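For illustration, the projection of one 8×8 block onto the cosine basis can be sketched directly from the definition of the two-dimensional DCT. The sketch below assumes the orthonormal DCT-II scaling and omits any level shift of the pixel values; the function name dct_2d is hypothetical, and a practical implementation would use a fast algorithm rather than this direct summation.

```python
import math

M = 8  # block size used in the description


def dct_2d(block):
    """Direct (slow) orthonormal 2-D DCT-II of an M x M block."""
    def scale(k):
        # Normalization factor for the k-th cosine term.
        return math.sqrt(1.0 / M) if k == 0 else math.sqrt(2.0 / M)

    coeffs = [[0.0] * M for _ in range(M)]
    for u in range(M):              # vertical frequency index
        for v in range(M):          # horizontal frequency index
            total = 0.0
            for x in range(M):
                for y in range(M):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * M))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * M)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A perfectly flat block has energy only in the top-left (DC) coefficient.
flat = [[100] * M for _ in range(M)]
coeffs = dct_2d(flat)
```

With this scaling, a flat block of value 100 yields a DC coefficient of 800 and all other coefficients equal to zero up to floating-point error, which illustrates the clustering of energy at low spatial frequencies.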
Referring now to FIG. 2A, an array of 64 gray level values representing 64 pixels in an 8×8 block of the image is shown. This 8×8 block is transformed according to the JPEG ADCT specifications resulting in the DCT coefficients shown in FIG. 2B. These coefficients still completely describe the image data of FIG. 2A, but, in general, larger values cluster at the top left corner in the low spatial frequency region. Simultaneously, the coefficient values in the lower right hand portion of the grid tend towards zero. This clustering occurs in the vast majority of images as the frequency of the image increases.
Generally, the human eye sees low frequencies in an image best. At higher frequencies, changes from amplitude to amplitude are unnoticeable, unless such changes occur at extremely high contrast. This is a well known effect of the human visual system and extensively documented, see e.g. "Visual Performance and Image Coding" by P. Roetling, Proceedings of the S.I.D. 17/2 pp. 111-114 (1976). The ADCT method uses the fact that small amplitude changes at high frequencies are unnoticeable and therefore can be generally ignored.
The next step in the ADCT method uses a quantization or Q-matrix. The Q-matrix shown in FIG. 2C is a standard JPEG-suggested matrix for compression, but ADCT, as well as the method claimed herein, can also operate using other Q-matrices (or Q-Tables). The matrix reflects the fact that low frequencies are generally more important than high frequencies by introducing larger quantization steps, i.e., larger entries, for higher frequencies. However, the table also attempts to internally construct some desirable variations from the general assumption. Accordingly, the values in the table vary with frequency. The exact perceived variation would be a function of the human visual system corresponding to the document type expected, i.e., photo, text or graphic, or of some other application dependent parameter. Each of the DCT values from FIG. 2B is divided by a corresponding Q-matrix value from FIG. 2C resulting in quantized DCT (QDCT) values by using the following relationship:
$$\mathrm{QDCT}[m][n] = \mathrm{INT}\left\{\frac{\mathrm{DCT}[m][n]}{\text{Q-Table}[m][n]} + \frac{1}{2}\right\}$$
where INT{} denotes the integer part of the function.
The term division used herein describes the process detailed in ADCT including the methods for handling truncation and round-off.
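The quantization relationship can be sketched as follows. This is an illustrative sketch only: INT{} is interpreted here as the floor function, which is an assumption, since the exact truncation and round-off handling (in particular for negative coefficients) is governed by ADCT as noted above. The function name quantize and the sample values are hypothetical and are not taken from FIGS. 2B and 2C.

```python
import math


def quantize(dct, q_table):
    """QDCT[m][n] = INT{DCT[m][n] / Q-Table[m][n] + 1/2}.

    Assumption: INT{} is taken as floor(); the ADCT specification
    governs the actual truncation and round-off rules.
    """
    return [[math.floor(d / q + 0.5) for d, q in zip(d_row, q_row)]
            for d_row, q_row in zip(dct, q_table)]


# Hypothetical 2 x 2 excerpt of DCT values and Q-Table entries.
dct_values = [[37, 41], [-5, 800]]
q_values = [[16, 16], [16, 16]]
qdct = quantize(dct_values, q_values)
print(qdct)  # [[2, 3], [0, 50]]
```

The values 37 and 41 fall on either side of the midpoint between two quantization levels and are therefore mapped to different integers, while the small coefficient -5 is mapped to zero.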
The quantized DCT values are shown in FIG. 2D. The remainder from the division process is discarded, resulting in a loss of data. Furthermore, since the Q values in the lower right hand portion of the table in FIG. 2C tend to be high, most of the values in that area go to zero as shown in FIG. 2D, unless there were extremely high amplitudes of the image at the higher frequencies.
After deriving the quantized set of DCT values shown in FIG. 2D, the quantized values are arranged in the order of a space-filling zigzag curve. A statistical encoding method, such as the Huffman process, is used to generate the signal to be transmitted. This statistical coding is performed in a lossless way, and the only loss introduced in the compression is the one generated by the quantization of the DCT coefficients using the Q-Table.
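The zigzag ordering can be sketched as shown below. The sketch assumes the conventional JPEG scan, which starts at the top left corner and traverses successive anti-diagonals in alternating directions; the function names zigzag_order and zigzag_scan and the sample block are hypothetical.

```python
def zigzag_order(m=8):
    """Return (row, col) positions of an m x m block in zigzag order.

    Positions are grouped by anti-diagonal (row + col). Odd diagonals
    are walked downward, even diagonals upward.
    """
    def key(pos):
        row, col = pos
        diagonal = row + col
        return (diagonal, row if diagonal % 2 else -row)

    return sorted(((r, c) for r in range(m) for c in range(m)), key=key)


def zigzag_scan(block):
    """Flatten a square block into a list following the zigzag curve."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


# Hypothetical quantized block: non-zero values only near the top left.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 50, 3, -2
scanned = zigzag_scan(block)
print(scanned[:4])  # [50, 3, -2, 0]
```

Because the non-zero quantized values cluster at low frequencies, the scan places them first and leaves a long run of trailing zeros, which the statistical encoder can represent compactly.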
ADCT transforms are well known, and existing hardware that performs the transform on image data is shown for example in U.S. Pat. No. 5,049,991 to Nihara, U.S. Pat. No. 5,001,559 to Gonzales et al., and U.S. Pat. No. 4,999,705 to Puri. The primary thrust of these particular patents, however, is moving picture images, not document images.
To decompress the now-compressed image, a series of steps is followed in the reverse order of the compression, as shown in FIG. 1. The Huffman encoding is removed at decoder 50. The image signal then represents the quantized DCT coefficients, which are multiplied at signal multiplier 52 by the Q-Table values in memory 54 in a process inverse to the compression process. At inverse transformer 56, the inverse transform of the discrete cosine transform is derived, and the output image in the spatial domain is stored at image buffer 58.
In the described decompression method, Huffman encoding is removed to obtain the quantized DCT coefficient set. From the quantized DCT coefficients and the Q-Table, the quantization interval, i.e., the DCT constraints, are determined. Specifically, the upper and lower boundaries are determined by the relationships:
$$\text{lower bound}\ \mathrm{DCT}[m][n] = (\mathrm{QDCT}[m][n] - 0.5) \times \text{Q-Table}[m][n]$$
and
$$\text{upper bound}\ \mathrm{DCT}[m][n] = (\mathrm{QDCT}[m][n] + 0.5) \times \text{Q-Table}[m][n].$$
The center value of each range of upper and lower boundaries is used to calculate the DCT coefficient. In other words, each member of the set is multiplied by a Q-Table value resulting in the DCT coefficients shown in FIG. 3A by using the data of FIGS. 2C and 2D in the following relationship:
$$\mathrm{DCT}[m][n] = \mathrm{QDCT}[m][n] \times \text{Q-Table}[m][n].$$
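The boundary and center-value relationships can be illustrated for a single coefficient as follows. The function names dct_bounds and dequantize and the numeric values are hypothetical and serve only to show the arithmetic.

```python
def dct_bounds(qdct, q):
    """Lower and upper bounds of the interval that quantized to qdct."""
    return (qdct - 0.5) * q, (qdct + 0.5) * q


def dequantize(qdct, q):
    """Center of the quantization interval, used as the DCT estimate."""
    return qdct * q


# Hypothetical example: quantized value 3 with a Q-Table entry of 16.
lower, upper = dct_bounds(3, 16)
center = dequantize(3, 16)
print(lower, center, upper)  # 40.0 48 56.0
```

In this example an original coefficient of 41 and an original coefficient of 55 both quantize to 3 and are both restored as 48, so the restored value may differ from the original by up to half of the Q-Table entry.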
However, the result shown in FIG. 3A is not the original set of DCT coefficients shown in FIG. 2B because the remainders calculated for the original quantization of the DCT coefficients with the Q-Table in the compression process have been lost. In a standard ADCT decompression process, the inverse discrete cosine transform of the set of DCT coefficients is derived to obtain image values shown in FIG. 3B.
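The inverse transform step can be sketched directly from the definition of the two-dimensional inverse DCT. The sketch assumes the orthonormal scaling and no level shift; the function name idct_2d is hypothetical, and the coefficient values are illustrative rather than those of FIG. 3A.

```python
import math

M = 8  # block size used in the description


def idct_2d(coeffs):
    """Direct (slow) orthonormal 2-D inverse DCT of an M x M block."""
    def scale(k):
        # Normalization factor for the k-th cosine term.
        return math.sqrt(1.0 / M) if k == 0 else math.sqrt(2.0 / M)

    block = [[0.0] * M for _ in range(M)]
    for x in range(M):              # pixel row
        for y in range(M):          # pixel column
            total = 0.0
            for u in range(M):
                for v in range(M):
                    total += (scale(u) * scale(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * M))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * M)))
            block[x][y] = total
    return block


# Hypothetical dequantized coefficients: only the DC term is non-zero.
dc_only = [[0.0] * M for _ in range(M)]
dc_only[0][0] = 800.0
pixels = idct_2d(dc_only)
```

A block whose only non-zero coefficient is a DC value of 800 is restored as a flat block of value 100. Any coefficient that was altered by quantization changes the restored pixel values across the whole block, which is why the result differs from the original image data.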
The above-described process is unable to reproduce an extremely accurate image. The original image cannot be reproduced since data within the image was discarded in the compression-quantization step. Poor reproduction is noted wherever strong edges appear, which are commonly present in text. In particular, "ringing artifacts," also called "mosquito noise," are apparent at the strong edges. These problems occur in text, graphics and halftones, components very common in document images. In addition to mosquito noise or ringing artifacts, a blocking artifact often appears in image areas having slowly varying grays, where each M×M block that formed the basis of the compression calculation is visible. Both artifacts detract from the accuracy and quality of the reproduction.
To remove such artifacts, two methods have been proposed. In a first method, the decompressed image is post-processed, i.e., after the image has been fully decompressed, an attempt is made to improve the image. Of course, such post-processing can never retrieve the original image, because that image has been lost. Moreover, post-processing of the image leads to a reconstructed image, which differs from the real source image, and subsequent compression/decompression steps possible in electronic imaging applications will lead to potentially larger and larger deviations between the reconstructed and original image. Such a process is demonstrated in the article "Reduction of Blocking Effects in Image Coding" by Reeve, III et al., Optical Engineering, January/February, 1984, Vol. 23, No. 1, p. 34, and "Linear Filtering for Reducing Blocking Effects in Orthogonal Transform Image Coding", by C. Avril et al., Journal of Electronic Imaging, April 1992, Vol. 1(2), pp. 183-191.
The second method employs an iterative decoding process using the known bandlimit of the data. In this method using the compressed form of the image again, different blocks, perhaps 32×32, are used to decode the image. In one example, a method of blurring the overall image was considered with the goal that such blurring would tend to smooth out the blocking artifacts. See "Iterative Procedures for Reduction of Blocking Effects in Transform Image Coding", by Rozenholtz et al., SPIE, Vol. 1452, Image Processing Algorithms and Techniques II, (1991), pp. 116-126.
All of the references cited above are incorporated herein by reference for their teachings.