This invention relates to creating a compressed electronic image using a dictionary- or table-based variable lossy compression algorithm.
Electronic images are often transmitted from one user to another over a data network. The time needed to transmit an image depends in part on the image's size, and can be reduced by reducing that size. One way to reduce an image's size is to compress it. A common class of image compression algorithm, referred to as dictionary- or table-based compression, works by storing a dictionary or table of the unique strings that occur in an image together with references to where those strings occur in the image.
A common dictionary-based compression algorithm is the Lempel-Ziv-Welch (LZW) algorithm. The LZW algorithm is used to compress and store images in the Graphics Interchange Format (GIF), and works by comparing strings of bytes in a file undergoing compression to strings of bytes in the file that have already been processed. When a unique string of bytes is found, that string is added to a compression table in an associated compressed file together with a much smaller string identifier that uniquely identifies it. When a non-unique string of bytes is found, i.e., a string of bytes that already exists in the compressed file's compression table, a new byte from the file undergoing compression is appended to it. The elongated string is then compared to the existing byte strings in the compression table. If it does not already exist in the compression table, it is added to the table together with a unique string identifier; if it does already exist, it is further elongated with another byte from the file undergoing compression. The LZW compression algorithm is a lossless compression algorithm, since no data is lost when a file is compressed.
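The string-matching loop described above can be illustrated with a minimal sketch. It follows the textbook form of LZW, and rests on several assumptions that are not taken from the description above: the table is pre-loaded with all 256 single-byte strings, string identifiers are plain integers of unbounded width, and the table is kept in memory by the encoder rather than written to the compressed file. The function name `lzw_compress` is hypothetical, and GIF's variable-width code packing and table resets are omitted.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW encoder (illustrative sketch, not GIF's exact bitstream)."""
    # Assumption: identifiers 0..255 stand for the single-byte strings.
    table: dict[bytes, int] = {bytes([i]): i for i in range(256)}
    next_id = 256
    current = b""          # longest string matched so far
    codes: list[int] = []  # emitted string identifiers

    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            # Non-unique string: keep elongating it with the next byte.
            current = candidate
        else:
            # Unique string: emit the identifier of the matched prefix,
            # then add the new string to the table with a fresh identifier.
            codes.append(table[current])
            table[candidate] = next_id
            next_id += 1
            current = bytes([value])

    if current:
        codes.append(table[current])
    return codes


# Seven input bytes are represented by four identifiers.
print(lzw_compress(b"ABABABA"))  # [65, 66, 256, 258]
```

In this example, identifier 256 stands for the string `AB` and identifier 258 for `ABA`, so repeated strings cost one identifier each once they are in the table.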
The LZW compression algorithm can be modified to provide a lossy mode of operation in which two or more different byte strings in a file undergoing compression are represented by the same string identifier in the compression table of a corresponding compressed file. The modified, or lossy-LZW, algorithm operates by comparing a byte string in a file undergoing compression to the byte strings stored in the file's compression table, and determining whether it differs sufficiently from them to warrant storing it together with a new string identifier in the compression table. If the byte string is not sufficiently different from a byte string already stored in the compression table, it is represented by the string identifier of that existing byte string. As a result, the information content of byte strings that differ from previously stored compression table byte strings, but not sufficiently, is lost in the compressed file.
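One way such a lossy mode might look is sketched below. This is an illustration under stated assumptions, not the method of any particular encoder: each byte is treated as a sample value (for example a grayscale intensity), two strings are deemed "not sufficiently different" when they share a prefix and their final bytes differ by at most a hypothetical `tolerance` parameter, and the nearest such match is preferred. In a real GIF the bytes are palette indices, so the difference would more plausibly be measured between palette colors. The names `find_near_match`, `lossy_lzw_compress`, and `lzw_decompress` are likewise hypothetical.

```python
from typing import Optional


def find_near_match(table: dict, prefix: bytes, value: int,
                    tolerance: int) -> Optional[bytes]:
    """Return a table string equal to prefix plus a byte within tolerance."""
    for delta in range(1, tolerance + 1):      # nearest values first
        for near in (value - delta, value + delta):
            if 0 <= near <= 255 and prefix + bytes([near]) in table:
                return prefix + bytes([near])
    return None


def lossy_lzw_compress(data: bytes, tolerance: int) -> list[int]:
    """LZW encoder that reuses near-matching table strings (sketch)."""
    table = {bytes([i]): i for i in range(256)}
    next_id = 256
    current = b""
    codes: list[int] = []
    for value in data:
        candidate = current + bytes([value])
        if candidate not in table:
            # Lossy step: accept an existing, similar string if there is one.
            near = find_near_match(table, current, value, tolerance)
            if near is not None:
                candidate = near
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_id
            next_id += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Standard LZW decoder; it needs no knowledge of the lossy step."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_id = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        # A code equal to next_id refers to the string being defined now.
        entry = table[code] if code in table else previous + previous[:1]
        table[next_id] = previous + entry[:1]
        next_id += 1
        out.append(entry)
        previous = entry
    return b"".join(out)


data = bytes([10, 20, 10, 21, 10, 20])
print(lossy_lzw_compress(data, 0))  # [10, 20, 10, 21, 256] (lossless)
print(lossy_lzw_compress(data, 1))  # [10, 20, 256, 256]
print(list(lzw_decompress(lossy_lzw_compress(data, 1))))
# [10, 20, 10, 20, 10, 20]
```

With a tolerance of 1, the string `10, 21` is represented by the identifier already assigned to `10, 20`, so the output is shorter and the value 21 is decoded as 20. A tolerance of 0 reduces the sketch to ordinary lossless LZW.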
Users can currently control, on a per-file basis, the amount of loss that can occur in an image undergoing lossy compression or conversion to, e.g., the GIF format. They do so by specifying the extent to which a byte string in a source image must differ from a byte string stored in the image's compression table to warrant adding the byte string and a string identifier to the compression table. The more the byte string is allowed to differ from byte strings in the compression table, the greater the allowed information loss in the compressed image. While users can therefore currently control the amount of information loss to an image undergoing, e.g., GIF compression on a per-file basis, they cannot currently control the amount of information loss on a regional basis within a file undergoing compression or conversion.