The goal of data compression is to reduce the number of bits required to represent data or an image. The prior art is replete with methods for data compression. Those methods which provide the highest levels of data compression generally require the most complex data processing equipment and are often slow in execution. Those methods which offer lower levels of data compression often operate faster and employ less complex hardware. In general, the choice of a data compression method is a compromise between system complexity and time of execution on the one hand and the desired level of data compression on the other.
The published prior art describes a number of data compression techniques. "Coding of Two-Tone Images", Hwang, IEEE Transactions on Communications, Vol. COM-25, No. 11, November, 1977, pp. 1406-1424, is a review paper that describes a number of techniques for efficient coding of both alphanumeric data and image data. Both single dimension (run length) and two-dimension coding (e.g. per block of pixel data) are considered. Hunter et al. in "International Digital Facsimile Coding Standards", Proceedings of the IEEE, Vol. 68, No. 7, July, 1980, pp. 854-867 describe various algorithms used in facsimile transmission (generally one-dimension coding techniques). They also describe two-dimension coding schemes wherein conditions of a subsequent coding line are encoded in dependence upon conditions in a previous reference line.
In a paper entitled "An Extremely Fast Ziv-Lempel Data Compression Algorithm" by Williams, Proceedings of the IEEE Data Compression Conference, April, 1991, pp. 362-371, a fast implementation of the well-known Lempel-Ziv (LZ) compression algorithm is described. That method constructs a dictionary of data strings at both the receiving and transmitting nodes and transmits codes in dependence upon matches found between an input data string and a data string found in the dictionary.
Usubuchi et al. in "Adaptive Predictive Coding For Newspaper Facsimile", Proceedings of the IEEE, Vol. 68, No. 1980, pp. 807-813, describe an adaptive, predictive algorithm which is applied to compression of half-tone image data. A further predictive method of encoding of half-tone picture images is described by Stoffel in "Half-tone Pictorial Encoding", SPIE Applications of Digital Image Processing, Vol. 19, 1977, pp. 56-63. Stoffel's algorithm divides an image into blocks and tries to predict the current block from the previous block. The final coded image consists of prediction errors and block values.
In "Compression of Black-White Images with Arithmetic Coding" by Langdon, Jr. et al., IEEE Transactions on Communications, Vol. COM-29, No. 6, June, 1981, pp. 858-867, there is described an arithmetic coding method wherein a pixel by pixel probability is estimated based upon the pixel's context (i.e., surrounding pixels). The arithmetic code of Langdon, Jr. et al. avoids multiplication operations inherent in some earlier arithmetic codes. The Langdon, Jr. et al. compression technique is soon to be an international standard for coding of bi-level image data as indicated by Hampel et al., "Technical Features of the JBIG Standard for Progressive Bi-Level Image Compression", Signal Processing: Image Communication Journal, Vol. 4, No. 2 (1992), pp. 103-111.
Bentley et al. in "A Locally Adaptive Data Compression Scheme", Communications of the ACM, Apr. 8, 1986, Vol. 29, No. 4, pp. 320-330 and "Technical Correspondence", Communications of the ACM, September 1987, Vol. 30, No. 9, pp. 792, 793 describe methods for compression of textual data using a self-organizing sequential search technique. Specifically, frequently accessed words are kept near the top of a search sequence so that they are encountered early in the compression action.
The system described by Bentley et al. is similar to a cache memory in a central processing unit memory system. Specifically, a cache memory with least recently used (LRU) management is employed (the cache taking the form of a list of numbers). The list is ordered in a most recently to least recently used fashion. Every time a value or word is encountered, it is removed from the cache list and placed at the top of the cache list, with the rest of the values being "moved down". If a word is encountered which is not in the cache list, then the least recently used value is removed from the cache to make room for the new value (at the top).
The tables shown in FIG. 1 illustrate a sequence of actions in such a cache. Table 20 illustrates a four-position cache wherein a most recently used value resides at the "top" of the cache and a least recently used value resides at the "bottom" of the cache, with other values residing in intermediate positions. Cache representations 22, 24 and 26 illustrate successive states of cache 20. Cache state 22 is the initial state of cache 20 and shows it storing four decimal values, with decimal value 10 being the most recently used value. Cache state 24 is the state of cache 20 after a value of 6 is encountered. Since the value 6 was already in cache 20, it is moved to the top of the cache and the values 10 and 5 are pushed down. Cache state 26 occurs after a value of 25 is encountered. Since that value is not in cache 20, the value is inserted at the top of cache 20, the least recently used value (8) is removed, and the other values are pushed down.
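The cache states described for FIG. 1 can be reproduced with a short sketch. The following Python fragment is illustrative only and is not taken from the cited references; the function name `lru_touch` and the use of a Python list are assumptions, and the initial ordering (10, 5, 6, 8) is inferred from the description of which values are pushed down and which value is removed.

```python
def lru_touch(cache, value, capacity=4):
    """Return the cache (most recently used first) after `value` is encountered.

    Hypothetical helper: on a hit the old copy is dropped and the value is
    re-inserted at the top; on a miss the value is inserted at the top and
    the least recently used entry falls off the bottom.
    """
    updated = [v for v in cache if v != value]  # remove the old copy, if any
    updated.insert(0, value)                    # most recently used goes on top
    return updated[:capacity]                   # discard the bottom entry on overflow


state_22 = [10, 5, 6, 8]             # initial state, 10 most recently used
state_24 = lru_touch(state_22, 6)    # hit: 6 moves to the top
state_26 = lru_touch(state_24, 25)   # miss: 25 enters, 8 is removed

print(state_24)
print(state_26)
```

In this sketch every update rebuilds the whole list, which mirrors the "push down" of the remaining values described above.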
Encoding and decoding processes employed for data compression and decompression manipulate compressed code words and cache adaptation so as to assure cache state synchronization in both encoder and decoder mechanisms. Cache synchronization assures lossless data handling. As shown in FIG. 1, the state of cache 20 is "adapted" as each new value is encountered. If a value is already in the cache of an encoder, then its cache position is transmitted in the form of a code word that indicates the position of the value. If the value is not in the cache, then a special code word is output by the encoder along with the value, per se. Compression is achieved because, in general, the cache position can be transmitted with fewer bits than the value itself. The decoder interprets received code words and reconstructs the original data stream. The decoder further "adapts" its cache in the same way as the encoder, so as to remain synchronous therewith.
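The claim that a cache position costs fewer bits than the value itself can be made concrete with a small calculation. The sketch below assumes a code layout that is not specified in the text: a one-bit flag distinguishes a position code word from a "not found" code word, positions are sent as fixed-width indices, and values are 8 bits wide. The function name `expected_bits` and all parameters are hypothetical.

```python
def expected_bits(hit_rate, value_bits=8, cache_size=4):
    """Average bits per value under an assumed flag-plus-payload code layout."""
    position_bits = (cache_size - 1).bit_length()  # 2 bits for a four-position cache
    hit_cost = 1 + position_bits                   # flag + cache position
    miss_cost = 1 + value_bits                     # flag + the value itself
    return hit_rate * hit_cost + (1 - hit_rate) * miss_cost


for rate in (0.0, 0.5, 1.0):
    print(rate, expected_bits(rate))
```

Under these assumptions a hit costs 3 bits and a miss costs 9 bits, so the scheme beats the raw 8 bits per value only when more than one value in six is found in the cache.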
The procedure employed to assure synchronism of caches in both encoder and decoder mechanisms is illustrated in FIGS. 2 and 3. FIG. 2 illustrates the encoding procedure and FIG. 3 illustrates the decoding procedure. Referring to FIG. 2, an uncompressed data value (e.g. a byte) is accessed (box 30), and it is then determined whether that uncompressed data value matches a value in the compression cache (decision box 32). If yes, the position code indicating the position of the matching value in the cache is outputted (box 34), and the cache is adapted (box 36) by moving, if necessary, the matching cache value up to the top of the cache and accordingly rearranging remaining values. The procedure then recycles to the next uncompressed data value.
If no cache match is found for a received data value, a "not found" code word is issued (box 38), and the actual data value is also outputted (box 40). The received data value is then inserted at the top of the cache and the remainder of the cache is "adapted" by pushing down the remaining values and eliminating the least recently used value.
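The encoding procedure of FIG. 2 can be sketched as follows. This is a minimal illustration rather than the procedure of any cited reference: the token names "POS" and "NOT_FOUND", the function name `encode`, and the choice to start from an empty cache are all assumptions, and code words are modelled as Python tuples rather than packed bits.

```python
def encode(values, capacity=4):
    """Encode `values` against an LRU cache kept most recently used first."""
    cache = []    # assumed to start empty
    tokens = []
    for value in values:
        if value in cache:
            position = cache.index(value)
            tokens.append(("POS", position))     # emit position code (box 34)
            cache.pop(position)                  # value will be moved to the top (box 36)
        else:
            tokens.append(("NOT_FOUND", value))  # emit marker plus raw value (boxes 38, 40)
            if len(cache) == capacity:
                cache.pop()                      # eliminate the least recently used value
        cache.insert(0, value)                   # the value is now the most recently used
    return tokens


print(encode([10, 5, 10, 7]))
```

Each hit replaces a full value with a small position index, which is where the compression comes from; each miss costs the marker in addition to the value.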
Upon decompression (FIG. 3), a compressed code word is accessed (box 44) and it is determined whether it contains a position code word (decision box 46). If yes, the decoder cache value at that position is outputted (box 48) and the cache is adapted by moving the outputted cache value up to the top of the cache (box 50).
If, by contrast, the received data is not a position code word, the data value received is accessed and outputted (box 52) and it is also inserted into the cache, at the top, and the remainder of the cache is adapted.
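The decoding procedure of FIG. 3 can be sketched in the same spirit. The fragment below is illustrative only: it assumes a hypothetical token format in which each code word is a tuple, either ("POS", position) or ("NOT_FOUND", value), and it assumes the decoder cache starts empty, as the encoder cache would. The function name `decode` is likewise an assumption.

```python
def decode(tokens, capacity=4):
    """Rebuild the original values while adapting the cache exactly as an encoder would."""
    cache = []    # assumed to start empty, matching the encoder
    values = []
    for kind, payload in tokens:
        if kind == "POS":
            value = cache.pop(payload)   # look up the value at that position (box 48)
        else:
            value = payload              # raw value follows the marker (box 52)
            if len(cache) == capacity:
                cache.pop()              # eliminate the least recently used value
        values.append(value)
        cache.insert(0, value)           # move or insert the value at the top (box 50)
    return values


tokens = [("NOT_FOUND", 10), ("NOT_FOUND", 5), ("POS", 1), ("NOT_FOUND", 7)]
print(decode(tokens))
```

Because the decoder applies the same cache update after every code word, a position code always refers to the same value on both sides, which is what keeps the scheme lossless.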
While a cache-based compression procedure such as that described above is efficient, the management of a least recently used cache is often computationally complex. Each time the cache is "adapted", a large number of values need to be rearranged. Such rearrangement, as it occurs many times during a compression sequence, can occupy substantial processing time and renders the compression procedure considerably less efficient.
It is therefore an object of this invention to render a cache-based compression procedure more efficient through the use of improved cache management techniques.
It is another object of this invention to provide an improved compression/decompression procedure that employs cache-based prediction techniques.
It is yet another object of this invention to provide a cache-based compression/decompression procedure that is particularly adapted to image processing and makes use of data content in the vicinity of a value to be compressed.