1. Field of the Invention
The invention relates to LZ data compression and decompression systems, particularly with respect to the LZW compression and decompression methodologies. More particularly, the invention relates to a decompressor utilizing an architecture similar to the character table architecture of the data compressor disclosed in said U.S. Pat. No. 6,426,711.
2. Description of the Prior Art
Professors Abraham Lempel and Jacob Ziv provided the theoretical basis for LZ data compression and decompression systems that are in present day widespread usage. Two of their seminal papers appear in the IEEE Transactions on Information Theory, IT-23-3, May 1977, pp. 337-343 and in the IEEE Transactions on Information Theory, IT-24-5, September 1978, pp. 530-536. A ubiquitously used data compression and decompression system known as LZW, adopted as the standard for V.42 bis modem compression and decompression, is described in U.S. Pat. No. 4,558,302 by Welch, issued Dec. 10, 1985. LZW has been adopted as the compression and decompression standard used in the GIF image communication protocol and is utilized in the TIFF image communication protocol. GIF is a development of CompuServe Incorporated and the name GIF is a Service Mark thereof. A reference to the GIF specification is found in GRAPHICS INTERCHANGE FORMAT, Version 89a, Jul. 31, 1990. TIFF is a development of Aldus Corporation and the name TIFF is a Trademark thereof. Reference to the TIFF specification is found in TIFF, Revision 6.0, Final, Jun. 3, 1992.
Further examples of LZ dictionary based compression and decompression systems are described in the following U.S. patents: U.S. Pat. No. 4,464,650 by Eastman et al., issued Aug. 7, 1984; U.S. Pat. No. 4,814,746 by Miller et al., issued Mar. 21, 1989; U.S. Pat. No. 4,876,541 by Storer, issued Oct. 24, 1989; U.S. Pat. No. 5,153,591 by Clark, issued Oct. 6, 1992; U.S. Pat. No. 5,373,290 by Lempel et al., issued Dec. 13, 1994; U.S. Pat. No. 5,838,264 by Cooper, issued Nov. 17, 1998; and U.S. Pat. No. 5,861,827 by Welch et al., issued Jan. 19, 1999.
In the above dictionary based LZ compression and decompression systems, the compressor and decompressor dictionaries may be initialized with all of the single character strings of the character alphabet. In some implementations, the single character strings are considered as recognized although not explicitly stored. In such systems the value of the single character may be utilized as its code and the first available code utilized for multiple character strings would have a value greater than the single character values. In this way the decompressor can distinguish between a single character string and a multiple character string and recover the characters thereof. For example, in the ASCII environment, the alphabet has an 8 bit character size supporting an alphabet of 256 characters. Thus, the characters have values of 0-255. The first available multiple character string code can, for example, be 258 where the codes 256 and 257 are utilized as control codes as is well known.
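The code assignment described above may be illustrated by the following minimal Python sketch. It assumes the example values given in the text (an 8 bit alphabet, control codes 256 and 257, and 258 as the first multiple character string code). The names ALPHABET_SIZE, CONTROL_CODES, FIRST_STRING_CODE and classify are hypothetical and are introduced only for illustration.

```python
ALPHABET_SIZE = 256          # 8 bit characters have values 0-255
CONTROL_CODES = (256, 257)   # example control codes from the text
FIRST_STRING_CODE = 258      # first code available for multiple character strings


def classify(code):
    """Tell a decompressor what kind of code it has received."""
    if 0 <= code < ALPHABET_SIZE:
        return "single"      # the code is the value of the character itself
    if code in CONTROL_CODES:
        return "control"
    if code >= FIRST_STRING_CODE:
        return "multiple"    # the code refers to a stored multiple character string
    raise ValueError(f"invalid code: {code}")
```

Because every single character code is less than every multiple character string code, a simple comparison suffices to distinguish the two cases.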
In the prior art dictionary based LZ compression and decompression systems, data character strings are stored and accessed in the compressor and decompressor dictionaries utilizing well known searchtree architectures and protocols. The compressor of said U.S. Pat. No. 6,426,711 utilizes a new string storage and access architecture involving character tables which, it is believed, improves the performance of LZ type data compression algorithms.
In numerous environments, a data compressor and decompressor are utilized at the same location. For example, in a disk or tape storage system, the data entering the system for storage may be compressed at the input and the compressed stored data may be decompressed at the output to recover the original information. In such systems, it may be desirable to utilize the same architecture for the compressor and decompressor so that the same compressor/decompressor resources may be shared thereby.
It is believed that a decompressor implemented with the character table architecture of the compressor of said U.S. Pat. No. 6,426,711 is not available in the prior art. It is, therefore, an objective of the present invention to provide such a decompressor.
The objective of the present invention is achieved by a decompressor that recovers and outputs a stream of data characters corresponding to an input stream of compressed codes, the data characters being from an alphabet of data characters. The decompressor includes a plurality of character tables, corresponding to respective characters of the alphabet, storing strings of data characters, the stored strings having respective string codes associated therewith. A string comprises a prefix string of at least one of the characters followed by an extension character. A particular string is stored in the character tables by storing the code associated with the prefix string of the particular string in the character table corresponding to the extension character of the particular string at a character table location corresponding to the string code of the particular string. The character tables are accessed with a currently received compressed code so as to recover a string corresponding thereto from the character tables, thereby providing a recovered string. The characters of the recovered string are output, thereby providing the output stream of data characters. An extended string, comprising the string corresponding to the compressed code received previously to the currently received compressed code extended by the first character of the recovered string, is inserted into the character tables. The extended string is stored in the character tables by storing the previously received compressed code in the character table corresponding to the first character of the recovered string at the character table location corresponding to the string code assigned to the extended string.
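The storage and recovery steps summarized above may be illustrated by the following minimal Python sketch, which is offered under stated assumptions and is not the disclosed implementation. It assumes an 8 bit alphabet and 258 as the first string code, per the example given in the background. The names recover and decompress are hypothetical. Two details are assumptions of the sketch and are not stated in this summary: the table holding a given code is located by a simple scan over all of the character tables, and a received code equal to the next code to be assigned is handled in the manner conventional for LZW, by extending the previous string with its own first character.

```python
ALPHABET_SIZE = 256   # assumed 8 bit alphabet
FIRST_CODE = 258      # assumed first string code (256 and 257 are control codes)


def recover(tables, code):
    """Follow prefix codes back to a single character and return the string."""
    out = bytearray()
    while code >= ALPHABET_SIZE:
        # Assumption of this sketch: scan the tables to find the one holding the code.
        for ch, table in enumerate(tables):
            if code in table:
                out.append(ch)       # the table's own character is the extension character
                code = table[code]   # the stored value is the code of the prefix string
                break
        else:
            raise ValueError(f"code {code} is not in the character tables")
    out.append(code)                 # a code below 256 is the character itself
    out.reverse()                    # characters were collected last to first
    return bytes(out)


def decompress(codes):
    """Recover the data characters for a sequence of compressed codes."""
    tables = [dict() for _ in range(ALPHABET_SIZE)]   # one table per character
    next_code = FIRST_CODE
    prev = None
    output = bytearray()
    for code in codes:
        if prev is not None and code == next_code:
            # Conventional LZW special case (not stated in the summary above):
            # the code is not stored yet, so build it from the previous string.
            prev_string = recover(tables, prev)
            string = prev_string + prev_string[:1]
        else:
            string = recover(tables, code)
        output += string
        if prev is not None:
            # Store the previous code in the table of the recovered string's
            # first character, at the location given by the new string code.
            tables[string[0]][next_code] = prev
            next_code += 1
        prev = code
    return bytes(output)
```

For example, under these assumptions the code sequence 65, 66, 258, 260 recovers the characters ABABABA: the string AB is stored as the entry 65 at location 258 of the table for B, and the code 260 is received before its string has been stored, which exercises the special case.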