This invention relates generally to image compression, and more particularly the invention relates to discrete cosine transform (DCT) based compression and coding of images.
Image compression is used in reducing large volumes of data in digitized images for convenient and economical storage and for transmission across communication networks having limited bandwidth. Image compression technology is important in digital still video cameras, color scanners, color printers, color fax machines, computers, and multimedia.
The Joint Photographic Experts Group (JPEG) has established a color image data compression standard for use in a variety of still image applications. Compression employs DCT-based processes operating on discrete blocks of the image. The DCT coefficients are then quantized based on measurements of the threshold for visibility. For coding, an 8×8 array of DCT coefficients is reorganized into a one-dimensional list using a zigzag sequence which tends to concentrate the coefficients expressing the lowest spatial frequencies at the lower indices, with the DC component at index 0 of the zigzag. The quantized coefficients are then encoded using a Huffman coder. Finally, headers and markers are inserted in the codes for individual blocks, along with bit and byte stuffing for JPEG data compatibility. FIG. 1 illustrates the JPEG compression algorithm.
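The zigzag reordering described above can be sketched as follows. This is an illustrative sketch only, not the coder's implementation; the function name `zigzag_order` is ours. Coefficients on each anti-diagonal of the block share the sum of their row and column indices, and alternate diagonals are traversed in opposite directions, which yields the standard JPEG scan with the DC term first.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in JPEG zigzag scan order for an n x n block.

    Cells are grouped by anti-diagonal d = row + col; odd diagonals are
    walked with the row increasing, even diagonals with the column
    increasing, so the scan snakes from the DC term at (0, 0) to (n-1, n-1).
    """
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_scan(block):
    """Flatten an 8x8 coefficient block (list of lists) into a 1-D list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

Applied to an 8×8 block, `zigzag_scan` places the DC coefficient at index 0 and the low-spatial-frequency AC coefficients at the lower indices, as the text describes.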
The compressed data can then be stored (as in an electronic still camera or computer memory) or transmitted efficiently over a limited bandwidth communication network. Reconstruction of the image requires a reverse process in which the headers and markers are extracted, the Huffman code is decoded, coefficients are dequantized, and an inverse DCT (IDCT) operation is performed on the coefficients.
Zoran Corporation (assignee herein) has developed a chip for image compression. The chip employs an algorithm for high quality compression of continuous tone color or monochrome images based on the JPEG standard. The chip reduces the large data size required to store and transmit digital images by removing redundancies in the image data while maintaining the ability to reconstruct a high quality image. For example, in digital still video cameras, the chip enables the use of a 1-Mbyte solid state memory card instead of a 24-Mbyte hard disk to store twenty-four 768×480-pixel images. The chip also reduces the time required to transmit a 768×480-pixel image over a standard 9600 bits per second telephone line from 15 minutes to 40 seconds. The chip has been optimized for use in digital still video cameras, color video printers, fixed bit rate image transmission devices, security systems, and cost sensitive image compression systems.
Bit rate control (BRC) is utilized in compressing the image into a predetermined file size. To execute the bit rate control algorithm, the coder performs two-pass compression, including a statistical pass (Pass 1) through the image prior to the actual compression pass (Pass 2). The activity of the image and of every block is computed during the statistical first pass through the image. A scale factor for the quantization mechanism is computed according to the image activity, and the code volume of each image block is limited based on the activity of that block. Remainder bits are transferred to the allocation for the next block to improve utilization of the target compressed image size.
The quantization of the coefficients is done using quantization tables. The compression ratio is controlled by uniformly scaling the quantization tables with a scale factor: a large scale factor results in a high compression ratio, and vice versa. The scale factor is determined using the two passes. The first pass through the image is done with an initial scale factor (ISF), the quantization tables being scaled with the initial scale factor. The code volume needed for encoding the quantized DCT coefficients using the Huffman tables is accumulated during this pass (the ACV data). This code volume is then used as an activity measure of the image. A new scale factor (NSF) is calculated from the target code volume (TCV) data, the ACV data, and the initial scale factor by:

NSF = ISF × (ACV/TCV)^(3/2)  (1)
The new scale factor (NSF) is then used for the second pass in which the actual encoding is done.
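The scale-factor update of equation (1) can be sketched as follows. This is an illustrative sketch, not the chip's implementation; the function name is ours, and the exponent 3/2 follows from the assumed relation ACV ∝ SF^(−2/3) discussed below.

```python
def new_scale_factor(isf, acv, tcv):
    """New scale factor (NSF) per equation (1): NSF = ISF * (ACV/TCV)**(3/2).

    isf: initial scale factor used in the statistical pass
    acv: code volume accumulated in the statistical pass
    tcv: target code volume for the compressed image
    """
    return isf * (acv / tcv) ** 1.5
```

For example, if the statistical pass produces twice the target code volume (ACV = 2·TCV), the scale factor is increased by a factor of 2^(3/2) ≈ 2.83 for the second pass, coarsening quantization to shrink the output.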
The total code volume is controlled by limiting the code volume of each block. Each block is allocated a code volume (ABCV) with the block allocation depending on an allocation factor (AF) and on an activity measure of that block.
For example, the allocation factor can be computed as the ratio between the target code volume (TCV) and the image activity (ACT), which was accumulated in the statistical pass using the DCT coefficients. The activity of each block is measured by its block activity (BACT). ACT and AF are calculated as follows:

ACT = Σ BACT (summed over all blocks of the image), AF = TCV/ACT  (2)
The allocated code volume for each block is the product of the block activity and the allocation factor (AF.times.BACT).
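The per-block allocation of equation (2) can be sketched as follows. This is an illustrative sketch under the definitions given in the text (the function name is ours); the carry-over of remainder bits between blocks mentioned earlier is omitted for clarity.

```python
def allocate_block_volumes(bact, tcv):
    """Allocate a code volume to each block per equation (2).

    bact: list of block activities (BACT), one per block, from the
          statistical pass
    tcv:  target code volume for the whole image
    Returns the allocated block code volumes ABCV = AF * BACT.
    """
    act = sum(bact)                  # image activity ACT = sum of BACT
    af = tcv / act                   # allocation factor AF = TCV / ACT
    return [af * b for b in bact]    # ABCV for each block
```

Note that the allocations sum exactly to TCV, so limiting each block to its ABCV (with remainder bits carried forward) bounds the total code volume at the target.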
The above algorithm (1) assumes a linear relation between log(ACV) and log(SF), where the slope of the line is assumed to be -2/3. Thus, in the first pass, the second parameter of the line (its intercept) is computed, from which the NSF is calculated by equation (1). However, the assumed line slope rarely matches the real situation. Although the straight-line assumption holds for a wide range of SF values, the slope of the line depends on the image, and it ranges between -0.5 and -1.0. This mismatch may cause large deviations in the resultant code volume (deviations of more than 60% have been measured), severely reducing the coder performance.
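The effect of this slope mismatch can be checked with a small worked example, modeling (our assumption, following the text's straight-line relation) ACV(SF) = K·SF^s with the image's true slope s, while the update of equation (1) assumes s = -2/3. The function name and model are ours.

```python
def achieved_volume(tcv, acv, true_slope):
    """Code volume produced in pass 2 under the model ACV(SF) = K * SF**s.

    tcv: target code volume; acv: pass-1 code volume;
    true_slope: the image's actual slope s of log(ACV) vs log(SF).
    Equation (1) sets NSF/ISF = (ACV/TCV)**(3/2); pass-2 volume then
    scales by that ratio raised to the true slope.
    """
    nsf_ratio = (acv / tcv) ** 1.5
    return acv * nsf_ratio ** true_slope
```

If the true slope is -2/3, the target is met exactly; but with ACV = 2·TCV and a true slope of -1.0, pass 2 produces only about 0.71·TCV, an undershoot of roughly 29%, consistent with the large deviations reported above.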
The present invention is directed to an improved method and apparatus for image compression by providing a more accurate value for the new scale factor, based on a more accurate characterization of the relationship between accumulated code volume (ACV) and scale factor (SF).