The disclosure in the appendix of this patent document contains material to which a claim of copyright protection is made. The copyright owner has no objection to the facsimile reproduction of any one of the patent documents or the patent disclosure, as it appears in the U.S. Patent and Trademark Office patent file or records, but reserves all other rights whatsoever.
The invention relates generally to the field of digital image compression and, more particularly, to such compression that recognizes text and uses a predetermined strategy for compressing and decompressing the text.
It is well known to employ transform coding of digital images for bandwidth compression prior to storage or transmission over a limited bandwidth communication channel. In a typical prior art digital image compression/decompression system employing transform coding, such as the one used in the JPEG international standard (Digital compression and coding of continuous-tone still images - Part I: Requirements and Guidelines (JPEG), ISO/IEC International Standard 10918-1, ITU-T Rec. T.81, 1993, or W. B. Pennebaker and J. L. Mitchell, JPEG Still Image Data Compression Standard, Van Nostrand Reinhold, New York, 1993), the digital image is formatted into blocks (e.g. 8×8 pixels) and a linear transformation such as a discrete cosine transform (DCT) is applied to each block to generate 8×8 blocks of transform coefficients. Theoretically, for images modeled as a first-order Markov source with a correlation coefficient close to unity, the DCT is very close to the Karhunen-Loeve (KL) transform, which is optimal in reducing correlation. For photographic type images, the DCT still provides very good decorrelation properties and, unlike the KL transform, which is image-dependent and does not lend itself to straightforward computation, it can be efficiently implemented in software or hardware. The DCT coefficients are then normalized and quantized using a uniform quantizer.
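As an illustration of the block transform described above, the following is a minimal sketch of a direct 2-D DCT of one 8×8 block. It is not taken from the disclosure: the function name dct_2d is hypothetical, the orthonormal scaling is one common convention, the level shift applied by JPEG before the transform is omitted, and the quadruple loop is written for clarity rather than speed.

```python
import math

N = 8  # block size used in the 8x8 example above


def dct_2d(block):
    """Return the 2-D DCT-II (orthonormal scaling) of an N x N block.

    Direct O(N^4) evaluation, for illustration only; practical codecs
    use fast separable algorithms.
    """
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# A perfectly flat block has all of its energy in the DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

For the flat block, coeffs[0][0] equals 800 up to floating-point error and every other coefficient is approximately zero, which illustrates why smooth photographic content is compacted well by the DCT.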
In the JPEG standard, the user can specify a different quantizer step size for each coefficient. This allows the user to control the resulting distortion due to quantization in each coefficient. The quantizer step sizes may be designed based on the relative perceptual importance of the various DCT coefficients or according to other criteria depending on the application. The 64 quantizer step sizes corresponding to the 64 DCT coefficients in each 8×8 block are specified by the 1-byte elements of an 8×8 user-defined array, called the normalization matrix, or the quantization matrix, or the “Q-table”. Each block of quantized transform coefficients is ordered into a one-dimensional vector using a zig-zag scan that rearranges the quantized coefficients in the order of decreasing energy. This usually results in long runs of zero quantized values that can be efficiently encoded by runlength coding. Each nonzero quantized value and the number of zero values preceding it are encoded as a {runlength/amplitude} pair using a minimum redundancy coding scheme such as Huffman coding. The binary coded transform coefficients along with an image header containing information such as the Q-table specification, the Huffman table specification, and other image related data are either stored or transmitted over a limited bandwidth channel.
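The quantization, zig-zag scan, and runlength steps described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the disclosed implementation: the function names are hypothetical, the flat Q-table of 16 is an arbitrary example, the DC coefficient is returned separately (JPEG codes it apart from the AC coefficients), and the Huffman coding of the resulting pairs is not shown.

```python
import math

N = 8


def zigzag_order(n=N):
    """Return (row, col) index pairs in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even anti-diagonals are traversed bottom-left to top-right,
        # odd ones top-right to bottom-left.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def quantize(coeffs, qtable):
    """Uniform quantization: divide by the step size, round half away from zero."""
    def rnd(x):
        return int(math.copysign(math.floor(abs(x) + 0.5), x))
    return [[rnd(coeffs[r][c] / qtable[r][c]) for c in range(N)] for r in range(N)]


def run_amplitude_pairs(quantized):
    """Return the DC value and (zero run, nonzero amplitude) pairs for the AC values."""
    scan = [quantized[r][c] for r, c in zigzag_order()]
    pairs, run = [], 0
    for value in scan[1:]:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    # Trailing zeros produce no pair; a real codec signals them with an end-of-block code.
    return scan[0], pairs


# Hypothetical example: three nonzero coefficients and a flat Q-table of 16.
coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0], coeffs[0][1], coeffs[2][0] = 800.0, -35.0, 17.0
qtable = [[16] * N for _ in range(N)]
dc, pairs = run_amplitude_pairs(quantize(coeffs, qtable))
```

Here dc is 50 and pairs is [(0, -2), (1, 1)]: the coefficient at row 1, column 0 quantizes to zero and is absorbed into the run preceding the next nonzero value.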
At the receiver, the image signal is decoded from the binary bitstream using operations that are the inverse of those employed at the encoder. This technique is capable of producing advantageously high image compression ratios, thereby saving significant storage space or enabling low bit rate transmission of digital images over limited bandwidth communication channels.
Although the presently known and utilized method and system for compressing images are satisfactory, improvements are always desirable. In particular, when the system described above is used to compress an image that contains both photographic data and text, undesirable artifacts such as ringing and blocking can occur. This occurs because text contains large intensity variations from one pixel to the next, which do not compress well with DCT-based schemes.
The present invention is directed to overcoming one or more of the problems set forth above. Briefly summarized, according to one aspect of the present invention, the invention resides in a method for compressing a digital image having both saturated text and/or line art and pictorial imagery, the method comprising the steps of: receiving the digital image as blocks of pixels; analyzing the block content to determine if all or any combination of the saturated text and/or line art, pictorial imagery or other types of image data are present; assigning the saturated text and/or line art, the pictorial imagery, and the other types of image data to one of a plurality of categories; compressing the block of saturated text and/or line art, if any, according to a first predetermined compression method; compressing the pictorial imagery, if any, according to a second predetermined compression method; compressing the other types of imagery, if any, according to other predetermined compression methods; and providing means of conveying the image categories to a decoder.
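The overall flow of the summarized method (classify each block, compress it with the method assigned to its category, and convey the category to the decoder) might be sketched as below. Everything specific in this sketch is an assumption made for illustration: the classification rule (a block is treated as saturated text when every pixel sits at one of two extreme levels), the Category names, and the placeholder compressors are hypothetical and are not the analysis or compression methods of the invention.

```python
from enum import Enum


class Category(Enum):
    TEXT = "saturated text / line art"
    PICTORIAL = "pictorial"


def classify_block(block, low=0, high=255):
    """Hypothetical rule: TEXT if every pixel is at one of two saturated levels."""
    if all(p in (low, high) for row in block for p in row):
        return Category.TEXT
    return Category.PICTORIAL


def compress_image(blocks, compressors):
    """Classify each block and compress it with the method registered for its category.

    Returns (category, payload) pairs so that a decoder can tell which
    method produced each payload.
    """
    out = []
    for block in blocks:
        category = classify_block(block)
        out.append((category, compressors[category](block)))
    return out


# Placeholder compressors standing in for the first and second predetermined methods.
compressors = {
    Category.TEXT: lambda b: [[1 if p == 255 else 0 for p in row] for row in b],
    Category.PICTORIAL: lambda b: b,  # a DCT-based coder would go here
}

text_block = [[0 if (r + c) % 2 else 255 for c in range(8)] for r in range(8)]
photo_block = [[16 * r + c for c in range(8)] for r in range(8)]
result = compress_image([text_block, photo_block], compressors)
```

Keeping the category alongside each payload is one simple way to satisfy the final step of the method; a practical bitstream would need a more compact signalling scheme.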
It is an object of the present invention that, knowing a priori that a given block contains saturated text only, the compression scheme is modified in such a way as to take advantage of this information without a significant modification in the software or hardware implementation. The present invention utilizes knowledge of the type of image to be compressed in the compression process for producing higher quality images.
These and other aspects, objects, features and advantages of the present invention will be more clearly understood and appreciated from a review of the following detailed description of the preferred embodiments and appended claims, and by reference to the accompanying drawings.
The present invention has the advantage of compressing text and/or line art so that the sharp boundary information is preserved while simultaneously achieving a higher compression of the image data than is otherwise possible, using a standard JPEG compression technique.