1. Field of the Invention
The present invention relates generally to lossless data compression. More particularly, the present invention relates to repeated compression tasks on data generated by similar sources, and to possible implementations of universal data compression that utilize the attributes of such sources.
2. Discussion of the Related Art
The performance of data compression depends on what can be determined about the characteristics of the source. The characteristics of an incoming data stream can be used to devise a model that better predicts forthcoming strings. If these characteristics are determined prior to compression, this a priori knowledge provides a significant advantage and allows for more efficient compression. In most cases, however, a priori knowledge of the source characteristics cannot be obtained. This often occurs in real-world applications where the properties of a source are dynamic. In particular, the symbol probability distribution of a source usually changes over time.
Some substitutional compression processes can be used to compress such data, since they do not require a priori knowledge of the source properties. Such processes adaptively learn the source characteristics on the fly during the coding phase. Moreover, the decoder can regenerate the source characteristics during decoding, so the characteristics need not be transmitted from the encoder to the decoder.
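The adaptive behavior described above can be illustrated with a minimal sketch. It assumes an LZ78-style substitutional scheme, chosen here only for brevity; the text does not prescribe a particular variant, and the function names `lz78_encode` and `lz78_decode` are hypothetical. The encoder learns phrases as it reads the input, and the decoder rebuilds the same phrase table from the tokens alone, so no dictionary is ever transmitted.

```python
def lz78_encode(data: str) -> list[tuple[int, str]]:
    """Encode text as (phrase_index, next_char) tokens, learning phrases on the fly."""
    dictionary = {"": 0}          # index 0 is the empty phrase
    tokens = []
    phrase = ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch          # keep extending the current match
        else:
            tokens.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)   # learn the new phrase
            phrase = ""
    if phrase:
        # Flush a trailing match: its prefix is always a known phrase.
        tokens.append((dictionary[phrase[:-1]], phrase[-1]))
    return tokens


def lz78_decode(tokens: list[tuple[int, str]]) -> str:
    """Rebuild the text, regenerating the encoder's phrase table from the tokens."""
    phrases = [""]                # mirrors the encoder's dictionary by index
    out = []
    for index, ch in tokens:
        phrase = phrases[index] + ch
        out.append(phrase)
        phrases.append(phrase)
    return "".join(out)


sample = "abababababab"
encoded = lz78_encode(sample)
restored = lz78_decode(encoded)
```

On repetitive input such as `sample`, the token list is shorter than the input because later tokens refer to progressively longer learned phrases. A practical coder would also pack the tokens into bits, which this sketch omits.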
These compression processes can be applied to universal data content and are sometimes called universal data compression algorithms. The LZ compression algorithm is a universal compression algorithm that is based on substitutional compression. The main reason the LZ compression algorithm works universally is the adaptability of its dictionary to the incoming stream. In general, the LZ compression algorithm processes the input data stream and adaptively constructs identical dictionary buffers at the encoder and the decoder. The dictionary is not transmitted explicitly; instead, it is built during the coding and decoding of the stream and is updated to adapt to the input stream. Matching procedures that use this adapted dictionary are expected to give the desired compression result, since the dictionary reflects the statistics of the incoming stream quite accurately.
Many applications that may benefit from data compression have repeating usage patterns. Examples of such applications include a client/server application working session that repeats frequently and a periodic remote backup process. There is therefore a need for a priori knowledge about the source data in such applications.
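The value of a priori knowledge for a repeating task can be illustrated with a small sketch. It is not the method of the present invention; it only uses the preset dictionary (`zdict`) feature of Python's standard `zlib` module as a stand-in for knowledge carried over from an earlier session. The helper names and the sample request strings are hypothetical.

```python
import zlib


def compress_with_history(message: bytes, history: bytes) -> bytes:
    """Compress a message with the coder primed by data from an earlier session."""
    comp = zlib.compressobj(level=9, zdict=history)
    return comp.compress(message) + comp.flush()


def decompress_with_history(blob: bytes, history: bytes) -> bytes:
    """Decompress using the same earlier-session data, held locally by the decoder."""
    decomp = zlib.decompressobj(zdict=history)
    return decomp.decompress(blob) + decomp.flush()


# Hypothetical traffic from two sessions of the same client/server application.
previous_session = (
    b"GET /api/v1/orders?customer=1001 HTTP/1.1\r\n"
    b"Host: example.internal\r\nAccept: application/json\r\n\r\n"
)
current_session = (
    b"GET /api/v1/orders?customer=1002 HTTP/1.1\r\n"
    b"Host: example.internal\r\nAccept: application/json\r\n\r\n"
)

unprimed = zlib.compress(current_session, 9)
primed = compress_with_history(current_session, previous_session)
```

Because the primed coder starts with a dictionary that already resembles the new data, `primed` is smaller than `unprimed` for inputs like these. Both sides must hold identical copies of the earlier-session data, which mirrors the requirement that the encoder and decoder dictionaries stay identical.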