The present invention relates to a method of and an apparatus for compressing and decompressing data, and to a data processing apparatus and a network system using the same, in which data including a character string or the like is converted into a bit string having fewer data bits than the original data.
Data compression technology is used, for example, to reduce the proportion of a storage device such as a hard disk that is occupied by data and the amount of data to be transferred in data communication, thereby improving the utilization efficiency of the storage device and the communication path. Representative data compression methods of the prior art include LZ78 and an improved variation thereof described on pages 221 to 247 of the "Data Compression Handbook" published by Toppan in 1994.
LZ78 and its variation include the following basic steps to compress data according to a predetermined rule.
(1) Character strings appearing in the input data are stored in a storage in the form of a set of character strings. The set is called a dynamic dictionary.
(2) When a character string already stored in the dynamic dictionary appears again in the input data, the index (in general a positive integer) of that character string in the dynamic dictionary is generated as output data in place of the character string.
(3) When the dynamic dictionary becomes full of the character strings thus accumulated, the registration of character strings is stopped or registered character strings are deleted. In the deleting operation, either all character strings are deleted or character strings are deleted beginning at the oldest one.
According to the method of the prior art, since a character string including a plurality of characters and/or letters can be replaced with a single index, the data volume is reduced through the data compression. Additionally, the compressed data can be easily decompressed by performing the above processing steps in the reverse direction according to the same rule.
However, the conventional data compression method using the dynamic dictionary has the following problems.
(1) A character string appearing for the first time in the input data has not yet been registered in the dynamic dictionary. Therefore, the character string cannot be replaced with an index and hence is output as it is. Namely, the compression ratio is conspicuously lowered in the leading portion of the input data.
(2) Since character strings of the input data are sequentially registered in the dynamic dictionary, a dictionary overflow may take place. To cope with this difficulty, registration is restricted or stored character strings are deleted, for example, as follows.
(a) The registration of character strings in the dictionary is interrupted.
(b) All character strings stored in the dictionary are deleted and the dictionary is initialized.
(c) Older character strings are deleted from the dictionary so that newer character strings are preserved therein.
With any of these procedures, the chance that one of the character strings in the dynamic dictionary matches a character string in the input data is not necessarily increased. In general, the probability that a character string in the input data matches any one of the character strings in the dynamic dictionary is small, and hence the compression ratio is lowered.
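The dynamic-dictionary scheme and the overflow procedures discussed above can be illustrated with a minimal Python sketch. It is not taken from the cited handbook: it assumes the textbook LZ78 output format of (index, next character) pairs, with index 0 standing for "no dictionary entry", and the names `PhraseTable`, `compress`, `decompress`, `max_entries` and `policy` are hypothetical. The three policy values correspond to stopping registration, deleting all entries, and deleting the oldest entry.

```python
class PhraseTable:
    """Dynamic dictionary shared by the compressor and the decompressor (illustrative)."""

    def __init__(self, max_entries=None, policy="freeze"):
        # Assumption: max_entries is None (unbounded) or an integer >= 1.
        self.by_phrase = {}   # phrase -> index; dict order is registration order
        self.by_index = {}    # index -> phrase
        self.next_index = 1   # index 0 is reserved for "no entry"
        self.max_entries = max_entries
        self.policy = policy

    def register(self, phrase):
        full = self.max_entries is not None and len(self.by_phrase) >= self.max_entries
        if full:
            if self.policy == "freeze":          # stop registering new strings
                return
            if self.policy == "reset":           # delete everything and start over
                self.by_phrase.clear()
                self.by_index.clear()
                self.next_index = 1
            elif self.policy == "evict_oldest":  # delete the oldest registered string
                oldest = next(iter(self.by_phrase))
                del self.by_index[self.by_phrase.pop(oldest)]
        self.by_phrase[phrase] = self.next_index
        self.by_index[self.next_index] = phrase
        self.next_index += 1


def compress(text, max_entries=None, policy="freeze"):
    table = PhraseTable(max_entries, policy)
    tokens, phrase = [], ""
    for ch in text:
        if phrase + ch in table.by_phrase:
            phrase += ch                         # keep extending the current match
        else:
            tokens.append((table.by_phrase.get(phrase, 0), ch))
            table.register(phrase + ch)
            phrase = ""
    if phrase:                                   # flush a trailing match
        tokens.append((table.by_phrase[phrase], ""))
    return tokens


def decompress(tokens, max_entries=None, policy="freeze"):
    # Replays the same registrations so that indices resolve identically.
    table = PhraseTable(max_entries, policy)
    pieces = []
    for index, ch in tokens:
        phrase = (table.by_index[index] if index else "") + ch
        pieces.append(phrase)
        if ch:
            table.register(phrase)
    return "".join(pieces)


tokens = compress("abababab")
print(tokens)
print(decompress(tokens))
```

In this sketch the first tokens for "abababab" carry index 0, that is, they are literal characters, which mirrors problem (1): nothing can be replaced with an index until the dictionary has been populated. The decompressor must be given the same `max_entries` and `policy` as the compressor, because it rebuilds the dictionary by replaying the same registrations. For simplicity, indices under `evict_oldest` keep increasing rather than being reused.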