1. Technical Field
This invention generally relates to compressing digital images. More particularly, this invention relates to enhancing dictionary-based image compression techniques, such as the well-known Lempel-Ziv algorithms (including LZ77, LZ78, and LZW), in such a way as to increase the compressibility of images while introducing minimal visual distortion.
2. Related Art
Digital image data can be large and expensive to transport and/or store. In order to transmit fewer bytes when transporting a digital image (e.g., over the Web), compression techniques may be used. Compression may be either lossy or lossless. Lossy compression results in an image that is not identical to the original, but resembles the original closely. Lossless compression takes advantage of the statistical redundancy in images to create an image that exactly represents the original, but uses less data.
A methodical understanding of compression (also known as source coding) began with the seminal work of Claude Shannon (1948), in which he laid out the foundations of most of Information Theory. A powerful class of compression techniques, known as dictionary-based techniques, was first described in the work of Ziv and Lempel (1977 and 1978) and later extended by Welch in 1984. These techniques, known commonly as LZ77, LZ78 and LZW, are simple to implement and relatively fast, and they achieve fairly high compression ratios. Because of these properties, these techniques have been used in many computer applications.
LZ77 forms the underlying compression technology for the computer programs gzip, zip, PKZip, deflate, and zlib. In addition, LZ77 forms the underlying compression layer used in the PNG graphics format. LZW forms the underlying compression technology for the computer program compress. In addition, LZW forms the underlying compression layer used in the GIF graphics format.
Dictionary-based techniques are based on the assumption that, within a particular data set, groups of values will tend to be repeated. One of the groundbreaking facts proven by Ziv and Lempel (1977) is that for a stationary distribution (data that is generated by the same unchanging process), dictionary-based techniques approach the entropy of the system, and thus achieve the maximum possible compression ratio. This theoretical result guarantees maximum compression only in the limit, as the amount of data to be compressed grows without bound. In practice, all data is finite, limited by many constraints, including storage, memory, and bandwidth. Therefore, the rate at which the technique approaches the entropy of the data is of critical importance.
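The repeated-groups assumption can be illustrated with a minimal LZW-style encoder. This is a sketch only: the function name `lzw_encode` is hypothetical, the dictionary is allowed to grow without limit, and the output codes are left as integers rather than packed into bits as a real GIF or compress implementation would do.

```python
def lzw_encode(data: bytes) -> list[int]:
    """Encode bytes as a list of LZW dictionary codes (no bit packing)."""
    # Start with one dictionary entry for every possible single byte value.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the match while the group is already known.
            current = candidate
        else:
            # Emit the longest known group, then learn the new, longer one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(dictionary[current])
    return codes


repetitive = b"ABABABABABABABAB"
print(len(repetitive), len(lzw_encode(repetitive)))  # 16 input values, 7 codes
```

Data with no repeated groups gains nothing from the dictionary: in this sketch, 16 distinct byte values produce 16 codes.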
In practice, the entropy of a dataset will be lower, and dictionary-based compression will usually approach that entropy faster, if the data values are chosen from a smaller alphabet (range of values). Because of this, GIF and palette-based (indexed-color) PNG images limit the number of colors that can be represented within the image, thereby reducing the alphabet of the data to be compressed; both have an upper bound of 256 colors. With more colors, the LZ77 and LZW techniques require more data than is present in a typical image to achieve reasonable compression ratios.
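The effect of alphabet size can be observed with a small experiment. The sketch below is an illustration under stated assumptions, not a measurement from the source: it draws independent, uniformly distributed symbols (a deliberately simple stationary source), computes their empirical zeroth-order entropy, and compares that with the bits per symbol actually produced by Python's `zlib` module, which implements the LZ77-based DEFLATE method. The helper names are hypothetical.

```python
import math
import random
import zlib
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Empirical (zeroth-order) Shannon entropy of the byte values in data."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def random_symbols(alphabet_size: int, length: int, seed: int = 0) -> bytes:
    """Independent, uniformly distributed symbols from the given alphabet size."""
    rng = random.Random(seed)
    return bytes(rng.randrange(alphabet_size) for _ in range(length))


# Maps alphabet size -> (entropy in bits/symbol, achieved bits/symbol).
results = {}
for alphabet_size in (4, 16, 256):
    data = random_symbols(alphabet_size, 20_000)
    packed = zlib.compress(data, 9)
    results[alphabet_size] = (
        entropy_bits_per_symbol(data),
        8 * len(packed) / len(data),
    )
    print(alphabet_size, results[alphabet_size])
```

The experiment shows only that smaller alphabets yield fewer bits per symbol on a finite sample; it does not measure the rate of convergence to the entropy.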
Encoding an image as a GIF or PNG can be described at a high level by the following steps: reduce the image to 256 or fewer colors; represent the image as a look up table (LUT) of colors and a two-dimensional array of color look up values (color values); store the size of the image; store the look up table; and compress and store the color values using LZW (GIF) or LZ77 (PNG).
With these steps, GIF and PNG encoding can achieve compression ratios between roughly a factor of 2 and a factor of 50.
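The encoding steps described above can be sketched as follows. This is a toy container for illustration only, not the actual GIF or PNG file format: the byte layout, the function names `encode_indexed` and `decode_indexed`, and the use of `zlib` (LZ77-based DEFLATE) for the final step are all assumptions. The color reduction step is assumed to have been done already.

```python
import struct
import zlib


def encode_indexed(pixels, width, height):
    """pixels: row-major list of (r, g, b) tuples with at most 256 distinct colors."""
    # Build the look up table (LUT) in order of first appearance.
    lut = []
    index_of = {}
    for color in pixels:
        if color not in index_of:
            index_of[color] = len(lut)
            lut.append(color)
    if len(lut) > 256:
        raise ValueError("reduce the image to 256 or fewer colors first")

    # One byte per pixel: the color look up value.
    indices = bytes(index_of[color] for color in pixels)

    # Store the image size and LUT length, then the LUT, then the compressed values.
    header = struct.pack(">HHH", width, height, len(lut))
    table = bytes(channel for color in lut for channel in color)
    return header + table + zlib.compress(indices, 9)


def decode_indexed(blob):
    """Inverse of encode_indexed: returns (pixels, width, height)."""
    width, height, n_colors = struct.unpack(">HHH", blob[:6])
    table = blob[6:6 + 3 * n_colors]
    lut = [tuple(table[i:i + 3]) for i in range(0, len(table), 3)]
    indices = zlib.decompress(blob[6 + 3 * n_colors:])
    return [lut[i] for i in indices], width, height


# A 4x4 image with two colors in alternating pixels.
pixels = [(255, 0, 0), (0, 0, 255)] * 8
blob = encode_indexed(pixels, 4, 4)
print(len(blob), decode_indexed(blob) == (pixels, 4, 4))
```

Because the pixel array stores one small index per pixel instead of three color channels, the data passed to the dictionary coder has a much smaller alphabet.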
To further increase the compressibility of images, two techniques are known in the prior art: image resizing and color reduction.
Image resizing is the simple operation of reducing the size of the image and thereby reducing the number of pixels in the image. Image resizing can be done in a number of ways: subsampling, which simply keeps some pixels and throws away others; bilinear interpolation, which replaces a group of pixels with the weighted average value of the colors in that group; spline interpolation, which replaces a group of pixels with the weighted average value of the colors in that group and in the surrounding region, taking into account the smoothness of the color variation in the original image; and filtering, which replaces a group of pixels with a weighted sum of the colors in that group and in the surrounding region, with the weights given by the particular filter used (often a Gaussian filter).
By reducing the size of the image, fewer pixel values need to be encoded, resulting in a smaller compressed image. However, by reducing the size of the image, image detail is lost, potentially including critical image characteristics.
Because images that are best stored as GIFs or PNGs are often detail-oriented (e.g., icons, diagrams, or drawings), the detail lost due to image resizing is often unacceptable, even for moderate size reductions (e.g., reductions of 30% or less).
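The loss of detail can be seen even in a tiny example. The sketch below operates on a grayscale image stored as a list of rows and implements two of the approaches mentioned above in simplified form: subsampling, and an equal-weight block average, which stands in here for the weighted-average methods. The function names are hypothetical.

```python
def subsample(image, factor):
    """Keep every factor-th pixel in each direction and discard the rest."""
    return [row[::factor] for row in image[::factor]]


def box_average(image, factor):
    """Replace each factor x factor block with the mean of its pixels."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height - height % factor, factor):
        out_row = []
        for x in range(0, width - width % factor, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            out_row.append(sum(block) // len(block))
        out.append(out_row)
    return out


# A 4x4 image containing a one-pixel-wide vertical line (value 255) at column 1.
image = [[0, 255, 0, 0] for _ in range(4)]
print(subsample(image, 2))    # [[0, 0], [0, 0]]: the line disappears entirely
print(box_average(image, 2))  # [[127, 0], [127, 0]]: the line is blurred
```

In both cases a one-pixel feature, typical of icons and diagrams, is either dropped or smeared into a neighboring value.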
Color reduction is the operation of representing the image using fewer colors. Color reduction is already performed in GIF and PNG compression when the image is reduced to 256 colors, typically from a potential set of 16 million colors.
Color reduction improves the compressibility of images by reducing the size of the alphabet—the number of allowable values—that each pixel can take on. Smaller alphabets typically result in longer sets of repeated values in the data, and thus higher compressibility.
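The link between alphabet size and compressibility can be checked on a synthetic image. The sketch below is an assumption-laden illustration: the test image (a gradient with mild noise), the uniform quantizer, and the use of `zlib` as the dictionary coder are choices made for this example and are not taken from the text.

```python
import random
import zlib


def quantize(values: bytes, levels: int) -> bytes:
    """Map 8-bit values onto `levels` equal-width bins (levels must divide 256)."""
    step = 256 // levels
    return bytes(v // step for v in values)


# Synthetic 128x128 grayscale image: a horizontal gradient plus mild noise.
rng = random.Random(0)
width = height = 128
original = bytes(
    min(255, max(0, 2 * x + rng.randint(-3, 3)))
    for _ in range(height) for x in range(width)
)

# Compressed size in bytes for each number of gray levels.
sizes = {
    levels: len(zlib.compress(quantize(original, levels), 9))
    for levels in (256, 16, 4)
}
print(sizes)
```

With fewer levels, neighboring pixels fall into the same bin more often, so the coder sees longer repeated groups and the compressed size drops.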
Up to a point, color reduction techniques can be quite effective at reducing the size of a compressed image while creating very little visible distortion of the image. However, as the number of colors is decreased, visible distortion increases, regardless of the reduction method used.
To compensate for a reduced number of colors, techniques such as dithering and error diffusion can be used. By filling image regions with patterns of pixels of differing color, the appearance of a larger color set can be approximated. However, the applicability of dithering or error diffusion in a dictionary-based compression scheme is limited, as the patterns of mixed colors tend not to repeat, yielding short dictionary entries and limited compressibility.
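The cost of dithering in a dictionary-based scheme can be illustrated by compressing the same image twice, once with plain nearest-level quantization and once with error diffusion. The sketch below uses the standard Floyd-Steinberg weights (7, 3, 5, 1 over 16) as one example of error diffusion; the diagonal gradient test image, the choice of 4 gray levels, the helper names, and the use of `zlib` as the dictionary coder are assumptions made for this example.

```python
import zlib


def gradient(width, height):
    """Diagonal gradient from 0 to 255, stored as rows of floats."""
    scale = 255 / (width + height - 2)
    return [[(x + y) * scale for x in range(width)] for y in range(height)]


def nearest_level(value, levels):
    """Index of the nearest of `levels` evenly spaced gray levels in 0..255."""
    step = 255 / (levels - 1)
    return min(levels - 1, max(0, round(value / step)))


def flat_quantize(image, levels):
    """Nearest-level quantization with no dithering."""
    return bytes(nearest_level(v, levels) for row in image for v in row)


def floyd_steinberg(image, levels):
    """Error diffusion: push each pixel's quantization error onto its neighbors."""
    height, width = len(image), len(image[0])
    work = [row[:] for row in image]  # working copy that accumulates error
    step = 255 / (levels - 1)
    out = bytearray()
    for y in range(height):
        for x in range(width):
            index = nearest_level(work[y][x], levels)
            out.append(index)
            error = work[y][x] - index * step
            if x + 1 < width:
                work[y][x + 1] += error * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += error * 3 / 16
                work[y + 1][x] += error * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += error * 1 / 16
    return bytes(out)


image = gradient(64, 64)
flat_size = len(zlib.compress(flat_quantize(image, 4), 9))
dithered_size = len(zlib.compress(floyd_steinberg(image, 4), 9))
print(flat_size, dithered_size)
```

Both outputs use the same four-value alphabet, so any difference in compressed size comes from how well the pixel patterns repeat, not from the number of colors.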