1. Field of the Invention
The invention relates to LZ data compression and decompression systems, particularly with respect to the LZW compression and decompression methodologies. More particularly, the invention relates to a decompressor suitable for recovering the data character stream corresponding to the compressed code output stream of a data compressor of the type disclosed in said Ser. No. 09/977,635.
2. Description of the Prior Art
Professors Abraham Lempel and Jacob Ziv provided the theoretical basis for LZ data compression and decompression systems that are in present day widespread usage. Two of their seminal papers appear in the IEEE Transactions on Information Theory, IT-23-3, May 1977, pp. 337-343 and in the IEEE Transactions on Information Theory, IT-24-5, September 1978, pp. 530-536. A ubiquitously used data compression and decompression system known as LZW is described in U.S. Pat. No. 4,558,302 by Welch, issued Dec. 10, 1985. LZW has been adopted as the compression and decompression standard used in the GIF image communication protocol and is utilized in the TIFF image communication protocol. GIF is a development of CompuServe Incorporated and the name GIF is a Service Mark thereof. A reference to the GIF specification is found in GRAPHICS INTERCHANGE FORMAT, Version 89a, Jul. 31, 1990. TIFF is a development of Aldus Corporation and the name TIFF is a Trademark thereof. Reference to the TIFF specification is found in TIFF, Revision 6.0, Final - Jun. 3, 1992.
LZW has also been adopted as the standard for V.42 bis modem compression and decompression. A reference to the V.42 bis standard is found in CCITT Recommendation V.42 bis, Data Compression Procedures For Data Circuit Terminating Equipment (DCE) Using Error Correction Procedures, Geneva 1990.
Further examples of LZ dictionary based compression and decompression systems are described in the following U.S. patents: U.S. Pat. No. 4,464,650 by Eastman et al., issued Aug. 7, 1984; U.S. Pat. No. 4,814,746 by Miller et al., issued Mar. 21, 1989; U.S. Pat. No. 4,876,541 by Storer, issued Oct. 24, 1989; U.S. Pat. No. 5,153,591 by Clark, issued Oct. 6, 1992; U.S. Pat. No. 5,373,290 by Lempel et al., issued Dec. 13, 1994; U.S. Pat. No. 5,838,264 by Cooper, issued Nov. 17, 1998; and U.S. Pat. No. 5,861,827 by Welch et al., issued Jan. 19, 1999.
In the above dictionary based LZ compression and decompression systems, the compressor and decompressor dictionaries may be initialized with all of the single character strings of the character alphabet. In some implementations, the single character strings are considered as recognized although not explicitly stored. In such systems the value of the single character may be utilized as its code and the first available code utilized for multiple character strings would have a value greater than the single character values. In this way the decompressor can distinguish between a single character string and a multiple character string and recover the characters thereof. For example, in the ASCII environment, the alphabet has an 8 bit character size supporting an alphabet of 256 characters. Thus, the characters have values of 0-255. The first available multiple character string code can, for example, be 258 where the codes 256 and 257 are utilized as control codes as is well known.
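The code space of the ASCII example above can be illustrated with a brief sketch. It assumes the 8 bit alphabet and the two control codes given in the example; the names ALPHABET_SIZE, CONTROL_CODES, FIRST_STRING_CODE and classify_code are hypothetical and are introduced only for illustration.

```python
ALPHABET_SIZE = 256          # 8 bit characters with values 0-255
CONTROL_CODES = (256, 257)   # reserved control codes, as in the example above
FIRST_STRING_CODE = 258      # first code available for multiple character strings


def classify_code(code: int) -> str:
    """Classify a compressed code by value alone (illustrative helper)."""
    if code < 0:
        raise ValueError("codes are non-negative")
    if code < ALPHABET_SIZE:
        return "single"      # the code is the character value itself
    if code in CONTROL_CODES:
        return "control"
    return "multiple"        # refers to a stored multiple character string
```

Because a single character string uses its own character value as its code, a decompressor built along these lines needs no stored entry for codes below 256.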
In the prior art dictionary based LZ compression and decompression systems, data character strings are stored and accessed in the compressor and decompressor dictionaries utilizing well known searchtree architectures and protocols. The compressor of said Ser. No. 09/977,635 utilizes a new string storage and access architecture and protocols involving limited length character tables which, it is believed, improve the performance of LZ type data compression algorithms. In the compressor of said Ser. No. 09/977,635, extended strings are excluded from storage when a character table location is unavailable in which to store the string because of character table exclusion or character table length limitation. When the extended string is excluded from storage, the string code that otherwise would have been assigned thereto is instead assigned to a subsequently stored string thereby utilizing a compact assignment of string codes.
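The compact assignment of string codes can be pictured with the following minimal sketch. It is not the storage architecture of said Ser. No. 09/977,635; it only assumes that each candidate extended string arrives with a flag indicating whether a character table location is available for it. The function name assign_codes and the starting code of 258 are illustrative assumptions.

```python
def assign_codes(candidates, first_code=258):
    """Assign consecutive codes to stored strings only.

    candidates is an iterable of (string, storable) pairs, where storable
    is False when the string is excluded from storage (hypothetical input
    format). An excluded string does not consume a code.
    """
    table = {}
    next_code = first_code
    for string, storable in candidates:
        if not storable:
            continue              # excluded: the code stays available
        table[string] = next_code
        next_code += 1            # advance only when a string is stored
    return table
```

With the candidates ("ab", stored), ("bc", excluded), ("cd", stored), the string "cd" receives code 259, the code that "bc" would otherwise have received.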
It is an objective of the present invention to provide an efficient decompressor suitable to recover the data character stream corresponding to the compressed code output from a compressor of the type described in said Ser. No. 09/977,635.
It is a further objective of the present invention to provide a decompressor that selectively decompresses the compressed code stream from a compressor of the type described in said Ser. No. 09/977,635 or a standard LZW compressed code stream.
The objectives of the present invention are achieved by a decompressor that recovers and outputs a stream of data characters corresponding to an input stream of compressed codes. The decompressor includes storage means that stores strings of data characters, the stored strings having respective codes associated therewith. A currently received compressed code accesses the storage means to recover a string corresponding to the currently received compressed code. The decompressor outputs the characters of the recovered string so as to provide the output stream of data characters. An extended string is inserted into the storage means that comprises the string corresponding to the compressed code received previously to the currently received compressed code extended by the first character of the recovered string. The decompressor assigns a code to the stored extended string and maintains a count of extended strings which have a predetermined characteristic that are inserted into the storage means. The inserting of an extended string into the storage means and the assigning of a code to the extended string are controllably bypassed for extended strings having the predetermined characteristic when the count of such strings attains a predetermined limit.
In particular, the inserting of extended strings into the storage means and assigning of codes thereto are controllably bypassed when the count of inserted extended strings having the same extension character attains the predetermined limit. Such counts are maintained for the various characters of the alphabet and respective limits are predetermined for performing the controllable bypassing function.
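The per-character counting described above may be sketched as a small gate that is consulted before each insertion. The class name ExtensionCharacterGate and its method try_insert are hypothetical, and treating a character with no configured limit as unlimited is an assumption of this sketch rather than something stated in the text.

```python
from collections import Counter


class ExtensionCharacterGate:
    """Counts inserted extended strings per extension character."""

    def __init__(self, limits):
        self.limits = dict(limits)   # extension character value -> limit
        self.counts = Counter()      # extension character value -> inserted count

    def try_insert(self, extension_char: int) -> bool:
        """Return True and count the insertion, or False to bypass it."""
        limit = self.limits.get(extension_char)
        if limit is not None and self.counts[extension_char] >= limit:
            return False             # limit attained: bypass insert and code assignment
        self.counts[extension_char] += 1
        return True
```

A caller would insert the extended string and assign the next code only when try_insert returns True, so a bypassed string leaves both the count and the next available code unchanged.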
An optional feature of the below described decompressor involves defaulting to a standard LZW data decompression configuration when the input stream of compressed codes is not received from a data compressor of the type described in said Ser. No. 09/977,635. The default configuration bypasses maintaining the counts of inserted extended strings that have the predetermined characteristics and bypasses the controllable bypassing of the extended string inserting and code assigning functions.
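The decompressor summarized above, including the optional default configuration, may be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: codes are taken as already unpacked integers, control codes 256 and 257 are not handled, the first multiple character string code is 258, a character absent from limits is treated as unlimited, and the names decompress, limits and FIRST_STRING_CODE are hypothetical. Passing limits=None selects the default configuration, in which no counts are maintained and no insertion is bypassed.

```python
from typing import Dict, Iterable, Optional

FIRST_STRING_CODE = 258  # assumption: 256 and 257 are reserved control codes


def decompress(codes: Iterable[int],
               limits: Optional[Dict[int, int]] = None) -> bytes:
    """Recover the character stream from a sequence of compressed codes."""
    table: Dict[int, bytes] = {}   # stored multiple character strings
    counts: Dict[int, int] = {}    # inserted strings per extension character
    next_code = FIRST_STRING_CODE
    prev: Optional[bytes] = None   # string of the previously received code
    out = bytearray()

    def at_limit(ext: int) -> bool:
        # Always False in the default configuration (limits is None).
        return (limits is not None and ext in limits
                and counts.get(ext, 0) >= limits[ext])

    for code in codes:
        if 0 <= code < 256:
            entry = bytes([code])          # single character: code is its value
        elif code in table:
            entry = table[code]
        elif code == next_code and prev is not None and not at_limit(prev[0]):
            # Well known LZW special case: the code refers to the string
            # that is about to be inserted in this very step.
            entry = prev + prev[:1]
        else:
            raise ValueError(f"unexpected code {code}")
        out += entry

        if prev is not None:
            ext = entry[0]                 # extension character
            if not at_limit(ext):
                table[next_code] = prev + entry[:1]
                next_code += 1
                if limits is not None:     # counts kept only when limiting
                    counts[ext] = counts.get(ext, 0) + 1
            # Otherwise the insertion and code assignment are bypassed,
            # so next_code remains available for a later string.
        prev = entry
    return bytes(out)
```

As a usage example, the code sequence 65, 66, 67, 259, 65, 88, 260 decodes to ABCBCAXBCA when limits is {ord("B"): 1}, because the bypassed string CB leaves code 260 available for BCA. The same sequence decodes to ABCBCAXCB in the default configuration, where code 260 is assigned to CB. This shows why the decompressor configuration must match the compressor that produced the code stream.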