The Ziv-Lempel (ZL) data compression algorithm is organized around a translation table, referred to here as a dictionary, which is a set of fields stored in a memory of a data processing system. The dictionary maps a variable-length string of input characters (referred to as a "phrase") into a fixed-length code (referred to as an index). The compression/decompression method may operate in either of two well-known ways: using an Adaptive Ziv-Lempel (AZL) algorithm or using a Static Ziv-Lempel (SZL) algorithm. In AZL compression, at any instant the dictionary contains phrases that have been encountered previously in the message being compressed; it thus holds a running sample of the phrases in the message, so the available phrases reflect the statistics of the message. SZL compression operates in the same way, except that the dictionary is never updated.
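The adaptive variant can be illustrated with a minimal LZW-style sketch in Python (LZW being a well-known member of the Ziv-Lempel family; this is an illustration, not the specific process described here). The dictionary starts with all single-byte phrases and grows as longer phrases are encountered, so the phrase-to-index mapping adapts to the message:

```python
def lzw_compress(data: bytes) -> list[int]:
    """Adaptive (AZL-style) compression: the dictionary grows as the
    message is scanned, so it reflects the statistics of the message."""
    # Seed the dictionary with every single-byte phrase (indices 0..255).
    dictionary = {bytes([i]): i for i in range(256)}
    next_index = 256
    phrase = b""
    indices = []
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            # Keep extending the current phrase while it is in the dictionary.
            phrase = candidate
        else:
            # Emit the index of the longest known phrase, then add the
            # extended phrase to the dictionary (the adaptive update).
            indices.append(dictionary[phrase])
            dictionary[candidate] = next_index
            next_index += 1
            phrase = bytes([byte])
    if phrase:
        indices.append(dictionary[phrase])
    return indices
```

On repetitive input the dictionary quickly accumulates multi-byte phrases, so the output index sequence is shorter than the input; on a very short input, as the passage below notes, almost nothing is learned and little compression results.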
AZL dictionary structures are well known in the prior art, in which the adaptive dictionary is uniquely generated for each input data file to enable compression and expansion of that file. These adaptive dictionaries are built while the data is compressed or expanded, and are tailored to each respective data file. When the data to be compressed consists of only a small number of atomic symbols (bytes), an AZL method obtains little compression, because very little about the data can be learned from so small a sample. The only recourse is to use knowledge about the data acquired before the compression process starts.
Processor performance may be significantly increased by using a static dictionary and the SZL process of application Ser. No. 07/968,631, which teaches a novel performance-improving dictionary structure generated from the records in a large database. This SZL dictionary does not adapt to the record currently being compressed for transmission or storage. SZL dictionaries are located at both the transmitting and receiving locations, so they need not be transmitted with the data. At the source of the transmission, the dictionary is used for compression; the compressed data is then transmitted to the destination, where a copy of the same dictionary is used to expand it. The speed of transmission is thus significantly improved, since only the compressed data needs to be transmitted.
Each entry in a dictionary represents a character string (phrase). If a phrase in the message to be compressed matches the phrase represented by a dictionary entry, the compressed form of that phrase is the dictionary entry number, referred to as the index. The compressed data therefore consists of a sequence of indices, which are transmitted. At the other end of the transmission, each received index is expanded into the phrase it represents.
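The static round trip can be sketched as follows. The dictionary contents and the greedy longest-match strategy shown here are illustrative assumptions, not the structure taught in application Ser. No. 07/968,631; the point is only that both ends hold the same phrase-to-index table, so only the indices need travel between them:

```python
# A hypothetical static (SZL-style) dictionary shared by sender and receiver.
STATIC_DICT = {
    "t": 0, "h": 1, "e": 2, " ": 3, "c": 4, "a": 5,
    "the": 6, "the ": 7, "cat": 8,
}
# The receiver uses the inverse mapping: index -> phrase.
INVERSE = {index: phrase for phrase, index in STATIC_DICT.items()}

def compress(message: str) -> list[int]:
    """At each position, greedily emit the index of the longest
    dictionary phrase that matches the message."""
    indices = []
    pos = 0
    max_len = max(len(p) for p in STATIC_DICT)
    while pos < len(message):
        for length in range(min(max_len, len(message) - pos), 0, -1):
            phrase = message[pos:pos + length]
            if phrase in STATIC_DICT:
                indices.append(STATIC_DICT[phrase])
                pos += length
                break
        else:
            raise ValueError(f"no dictionary phrase matches at position {pos}")
    return indices

def expand(indices: list[int]) -> str:
    """Each received index expands directly to the phrase it represents."""
    return "".join(INVERSE[i] for i in indices)
```

For example, `compress("the cat")` emits `[7, 8]` (the indices of `"the "` and `"cat"`), and `expand` reconstructs the original message at the destination without any dictionary having been transmitted.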
The AZL process, and the SZL process in which the dictionary is not updated, are prior methods. The compressed codes generated by these processes are recognizable and decodable by vendors and their software packages that conform to these methods. These prior compressed codes consist of a concatenation of AZL indices, referred to as an Evolution-Based-Index (EBI). Indices generated using the optimized dictionary structure of application Ser. No. 07/968,631 are referred to as a Storage-Optimized-Index (SOI). The IBM document "ESA/390 Data Compression" (form number SA22-7208-00) describes programs for converting SOI compressed indices (compressed data) to EBI form. Compressed data consisting of SOIs is not recognizable over a network by available software packages that conform to commercial adaptive AZL processes, because compressed data using SOIs differs from compressed data using EBIs.