1. Field of the Invention
The present invention relates to data compression systems. More particularly, the present invention relates to a loss-less data decompression system and a method and means for increasing the speed of decompressing a stream of coded data in systems that employ a self-building dictionary to store string codes and character codes.
2. Description of the Prior Art
Heretofore, loss-less data compression algorithms were known, as were the algorithms for decoding the compression codes generated at the data compressor. The best known loss-less data compression algorithms are adaptive and employ a string dictionary at the compressor and at the decompressor.
The compression system generates strings of characters and searches the dictionary for the longest string match that can be found in the dictionary, then outputs a string code for the longest string found. The longest string match is stored in the compression system dictionary with the extension character which produced the mismatch. The string stored in the dictionary is assigned the next highest string code by a code counter. The compression system also outputs single character codes when they appear as the longest match string.
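The longest-match procedure described above can be sketched as follows. This is a minimal illustration, not the compressor of any particular system: the function name `lzw_compress_sketch`, the use of a Python dict as the dictionary, and the pre-loading of codes 0 to 255 with the single byte values are all assumptions made for the example.

```python
def lzw_compress_sketch(data):
    """Return a list of integer codes for the bytes in `data`."""
    # Assumption: codes 0-255 are reserved for the single characters.
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256          # the "code counter" for new string codes
    codes = []
    current = b""            # longest match found so far

    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            # Still matching: keep extending the string.
            current = candidate
        else:
            # Mismatch: output the code of the longest match, then store
            # the match plus its extension character under the next code.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([byte])

    if current:
        codes.append(dictionary[current])
    return codes


print(lzw_compress_sketch(b"ABABABA"))  # [65, 66, 256, 258]
```

In the example run, codes 65 and 66 are single character codes, while 256 ("AB") and 258 ("ABA") are string codes assigned by the code counter.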
The decompression system receives only codes for strings and/or single character codes. Lempel-Ziv-Welch (LZW) data compression systems output only character codes or longest matched string codes to a decompression system. The decompression system has a dictionary that is preferably initialized with all single character codes, so that only plural character string codes are initially searched and character codes are sent to the decompression system from the compressor system. For a discussion of LZW see "A Technique for High-Performance Data Compression" by Terry A. Welch, IEEE Computer, Volume 17, Number 6, June 1984.
The LZW compressor stores each new entry in its dictionary as a last match string code plus an extension character code. However, the compressor sends to the decompressor the last match string code but not the extension character. The decompressor must be arranged one step behind the compressor and must buffer two sequential input codes. The previous string code received is paired with the first character of the next or new code to form an entry in the decompressor string dictionary.
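A conventional decoder of the kind just described might be sketched as below. The names `lzw_decompress_conventional`, `entries` and `expand` are hypothetical, and codes 0 to 255 are assumed to stand for the single characters. The branch for a code that is not yet in the dictionary is the well-known special case from Welch's paper; it is not discussed in the passage above but is needed for a correct decoder.

```python
def lzw_decompress_conventional(codes):
    """Decode a list of LZW codes, storing entries as (prefix, extension)."""
    entries = {}       # string code -> (prefix code, extension byte)
    next_code = 256    # assumption: 0-255 are single character codes

    def expand(code):
        # Walk the prefix chain, collecting one extension byte per step.
        out = bytearray()
        while code >= 256:
            code, ext = entries[code]
            out.append(ext)
        out.append(code)   # the chain ends at a single character code
        out.reverse()
        return bytes(out)

    output = bytearray()
    previous = None        # the decoder runs one code behind the compressor
    for code in codes:
        if previous is None:
            text = expand(code)
        elif code < next_code:
            # Known code: pair the previous code with the first character
            # of the new string to form the new dictionary entry.
            text = expand(code)
            entries[next_code] = (previous, text[0])
            next_code += 1
        else:
            # Special case: the code refers to the entry being built now,
            # so its first character equals that of the previous string.
            entries[next_code] = (previous, expand(previous)[0])
            next_code += 1
            text = expand(code)
        output += text
        previous = code
    return bytes(output)


print(lzw_decompress_conventional([65, 66, 256, 258]))  # b'ABABABA'
```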
The problem with this sequence of operations is that the string codes being received from the compression system become longer and comprise numerous smaller substring codes which must be decoded. It is not unusual for a long string code to represent over fifty characters (bytes) which comprise almost as many substring codes. To decode the fifty characters represented by such a long string code it is necessary to look in the dictionary and retrieve each subcode and its extension character until all substring codes have been decoded and exhausted. Only one extension character is output to the output data stream each time a substring code is expanded into a new substring and its extension character. Thus, the decompression system may from time to time cycle numerous times decoding a long string code that has already been seen. It would be desirable to eliminate the time wasted in a decompression system in expanding any plural character string more than once. Stated differently, it would be desirable to retrieve the set of individual characters represented by any long string code that has already been seen without repetitiously expanding substring codes.
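The cost of repeated expansion can be made concrete with a small sketch. Everything here is illustrative: `build_chain` is a hypothetical helper that hand-builds the prefix/extension entries for one fifty-character string in the simplest possible way (each entry extends the previous one by a single character), and `expand_with_count` counts dictionary lookups while walking the chain.

```python
def build_chain(text):
    """Hypothetical helper: build (prefix, extension) entries for `text`."""
    entries = {}
    prefix = text[0]       # the chain starts at a single character code
    code = 255
    for ext in text[1:]:
        code += 1          # string codes start at 256
        entries[code] = (prefix, ext)
        prefix = code
    return entries, code   # `code` now stands for the whole string


def expand_with_count(code, entries):
    """Expand `code` and report how many dictionary lookups were needed."""
    out = bytearray()
    lookups = 0
    while code >= 256:
        code, ext = entries[code]   # one lookup yields one character
        lookups += 1
        out.append(ext)
    out.append(code)
    out.reverse()
    return bytes(out), lookups


text = b"0123456789" * 5                    # a fifty-character string
entries, long_code = build_chain(text)
decoded, lookups = expand_with_count(long_code, entries)
print(len(decoded), lookups)                # 50 49

# Seeing the same long code three times repeats the full walk each time.
total = sum(expand_with_count(long_code, entries)[1] for _ in range(3))
print(total)                                # 147
```

Under these assumptions a fifty-character string costs 49 lookups on every occurrence, which is the repeated work the passage proposes to eliminate.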
It is a principal object of the present invention to provide a method and means for decoding compressed data faster than was heretofore possible.
It is a principal object of the present invention to eliminate repetitious decoding of long string codes by expansion of substring codes in a decompression system.
It is an object of the present invention to eliminate decoding of most substrings in a long string code.
It is an object of the present invention to eliminate redundant decoding/expansion operations in data decompressors.
It is an object of the present invention to provide a fast access memory in which are stored all characters representative of a known or previously seen long string code.
It is another object of the present invention to provide a decoding system capable of decoding complete pages of compressed data stored on web sites as fast as the data can be downloaded to the decompression system.
It is another object of the present invention to provide a method and means for decoding pages of a book or catalogs as blocks of compressed data codes.
It is another object of the present invention to provide a dictionary-type decompressor capable of real-time video image speeds of decompression without extensive buffer memories.
According to these and other objects of the present invention a novel decompressor is provided with a dictionary comprising two parts. A string dictionary is employed to build or replicate compressed data in the form of string codes and extension characters. Also, a decoded string dictionary or memory is provided to store at the addresses represented by the string codes all of the characters representative of or contained in a received string code which may be accessed as a block of characters in a single cycle.
Each compressed input code in an input data string is converted to a pointer or an address used to access data in a decoded string dictionary and to transfer blocks of characters to a utilization device. If the block of information is not in the decoded string dictionary, logic means enable the decoder to decode the string for a first time and to store different forms of the decoded string in the string dictionary and/or in the decoded string dictionary at the same code pointer or address.
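One possible reading of the two-part dictionary is sketched below. It is an interpretation for illustration only, not the claimed apparatus: the names `decompress_two_part`, `string_dict` and `decoded_dict` are hypothetical, Python dicts stand in for the memories, and codes 0 to 255 are assumed to be pre-loaded single characters. A code already present in `decoded_dict` is returned as one block; a code seen for the first time is expanded only as far as the nearest already-decoded prefix and then stored under the same code.

```python
def decompress_two_part(codes):
    """Decode LZW codes using a string dictionary plus a decoded-string store."""
    string_dict = {}                                    # code -> (prefix, ext)
    decoded_dict = {i: bytes([i]) for i in range(256)}  # code -> full string
    next_code = 256

    def decode(code):
        block = decoded_dict.get(code)
        if block is not None:
            return block            # already seen: one access, whole block
        # First time: expand only until an already-decoded prefix is reached.
        suffix = bytearray()
        cur = code
        while cur not in decoded_dict:
            cur, ext = string_dict[cur]
            suffix.append(ext)
        suffix.reverse()
        block = decoded_dict[cur] + bytes(suffix)
        decoded_dict[code] = block  # store at the same code address
        return block

    output = bytearray()
    previous = None
    for code in codes:
        if previous is None:
            block = decode(code)
        elif code < next_code:
            block = decode(code)
            string_dict[next_code] = (previous, block[0])
            next_code += 1
        else:
            # Code not yet in the dictionary (standard LZW special case):
            # its first character is that of the previous decoded string.
            string_dict[next_code] = (previous, decoded_dict[previous][0])
            next_code += 1
            block = decode(code)
        output += block
        previous = code
    return bytes(output), decoded_dict


data, decoded = decompress_two_part([65, 66, 256, 258])
print(data)          # b'ABABABA'
print(decoded[258])  # b'ABA'
```

The trade-off in this sketch is memory for speed: `decoded_dict` holds every expanded string in full, so a repeated long code costs a single access instead of one lookup per character.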