1. Field of the Invention
The present invention generally relates to the compression of digital data for efficient transmission or storage, and more particularly to the compression of data whose format, such as block size, is not known beforehand.
2. Description of the Prior Art
A message may be represented in digital form using any of a number of well-known coding methods. Typically, the objective is to choose the coding technique which expresses the digital data using the fewest bits possible. Representing the data in its most concise form enables a system to process and transmit the data more quickly, and to store the data using the least amount of storage space.
Often a message is input to a digital system in a form which is less than optimal. The system must then convert the input data stream to a more compact form. This processing is known as data compression (sometimes referred to simply as source encoding).
Several different compression techniques are known in the art. In one common technique, a digital message is broken into a series of blocks, and each block is separately encoded by reference to a previously encoded block. For instance, in the well-known Lempel-Ziv coding technique, a table is employed which stores a list of previously encountered data blocks. A block currently being processed is compared with entries in the table. If the current block matches an entry in the table, the encoding module encodes the current block by making reference to the matching entry in the table, using a pointer, for example. The pointer itself may be transmitted instead of the entire data block. Preferably, the pointer is shorter than the block itself, thus resulting in compression of the data. When the pointer is received at a receiver site, it may be used to reconstruct the data block, such as by making reference to a similarly constituted table at the receiver site.
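By way of illustration only, the table-and-pointer scheme described above may be sketched as follows. This is a deliberately simplified fixed-block model and not the Lempel-Ziv algorithm itself; the function names `encode_blocks` and `decode_blocks`, the token tags `"lit"` and `"ptr"`, and the fixed `block_size` parameter are assumptions introduced for this example.

```python
from typing import List, Tuple, Union

Token = Tuple[str, Union[bytes, int]]


def encode_blocks(data: bytes, block_size: int) -> List[Token]:
    """Encode data block by block against a table of earlier blocks.

    A block seen before is emitted as ("ptr", table_index); a new block
    is emitted as ("lit", block) and added to the table.
    """
    table = {}  # maps block contents -> index of first occurrence
    tokens: List[Token] = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        if block in table:
            tokens.append(("ptr", table[block]))
        else:
            table[block] = len(table)
            tokens.append(("lit", block))
    return tokens


def decode_blocks(tokens: List[Token]) -> bytes:
    """Rebuild the data, growing a similarly constituted table at the receiver."""
    table: List[bytes] = []
    parts: List[bytes] = []
    for kind, value in tokens:
        if kind == "lit":
            table.append(value)      # new block: store it for later pointers
            parts.append(value)
        else:
            parts.append(table[value])  # pointer: look up the earlier block
    return b"".join(parts)
```

In this sketch the receiver never needs the table to be transmitted: it reconstructs the same table from the literal blocks in the order they arrive, so each pointer resolves to the same entry the encoder referenced.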
The above discussed technique is best suited for digital information which exhibits a large amount of repetition. In such a case, component blocks comprising the data message often repeat, which increases the probability that a block currently being processed will match a previously encountered block.
However, prior art techniques such as that described previously fail to live up to their full potential in compressing digital messages. A data message may be segmented by an encoding system into component blocks which are out of step with the inherent repetition interval in the original data message. In this case, a search of the table of previously encountered blocks is less likely to reveal a match, and the data will not be optimally compressed. In such a case, the system has not fully exploited the inherent redundancy in the original data message.
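The effect of segmenting out of step with the repetition interval can be seen in a small hypothetical experiment. The sketch below assumes a simple fixed-size segmentation and counts how many blocks match a previously encountered block; the function name `count_matches` and the sample message are illustrative assumptions only.

```python
def count_matches(data: bytes, block_size: int) -> int:
    """Count blocks that exactly repeat an earlier block of the same message."""
    seen = set()
    matches = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        if block in seen:
            matches += 1
        else:
            seen.add(block)
    return matches


# Hypothetical message whose content repeats every 5 bytes.
message = b"ABCDE" * 6

in_step = count_matches(message, 5)      # block size equals the repeat interval
out_of_step = count_matches(message, 4)  # block size out of step with it
print(in_step, out_of_step)
```

With the block size equal to the 5-byte repetition interval, 5 of the 6 blocks match an earlier block. With a block size of 4, the block boundaries drift relative to the pattern, and only 2 of the 8 blocks find a match, even though the message is exactly as redundant as before.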
This problem could be readily addressed if the interval at which data repeats in the original data message were known prior to encoding the data. This cycle interval (or step interval) could then be manually programmed into the segmenting algorithm. However, this approach is practical only if the size of the encoded blocks remains fixed. It is burdensome and impractical to reconfigure the algorithm for a new block size every time a new message is processed. Moreover, a single data message may often contain distinct portions, each of which might exhibit a different repeating interval. The above technique is unsuitable for such a mixed data message, as it would require the user to reconfigure the algorithm midway through the processing of the data message.
For instance, a digital image may contain several different halftone portions. As understood in the art, halftone images are created by using a repeating finite two-dimensional halftone pattern. Each halftone image includes a plurality of halftone cells of fixed dimensions. A single page of digital image information might include a first halftone portion using a first cell dimension, and a second halftone portion using a second cell dimension. It would not be feasible for the user to manually determine the cell dimension employed in each distinct portion, and to configure the segmenting algorithm to deal appropriately with the cell size of each distinct portion.