Numerous techniques and systems have been developed for improving the efficiency of data storage or of data transmission between remote points. Data often comprises a string of coded symbols from a predetermined symbol set, referred to by the term "alphabet". A well-known example is the American Standard Code for Information Interchange (ASCII), which codes alphanumerics, punctuation marks, and various special command characters in terms of binary values. Of course, any other symbol set, which could include numerals, punctuation or diacritical marks, or binary or other representations thereof, falls within the scope of the present subject matter. The term "alphabet" will be used in its broadest sense, to cover any such symbol set.
Some types of symbols, such as ASCII symbols, are all of equal length. In a character-based data communications environment, however, not all possible characters appear with equal frequency. It is known to be inefficient to assign codes of equal length (like ASCII codes) to all characters, regardless of their frequency of occurrence. To improve efficiency, various data compression schemes have been used. These schemes generally encode symbols with codes whose bit lengths increase, in a general sense, as the probability of occurrence of the symbol decreases.
Such a technique of data compression or encoding is referred to as "entropy encoding". In entropy encoding, more probable events are represented by codewords characterized by a relatively small number of bits, whereas less probable events are represented by a relatively large number of bits. The correct assignment of codeword lengths is dictated by information theory concepts, and is based on the estimated probability of occurrence of the events. The better the probability estimate, the more efficient the codeword length assignment, and the better the compression.
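By way of illustration only, the relationship between probability and codeword length can be sketched as follows. Under the standard information-theoretic result, the ideal codeword length for a symbol of probability p is -log2(p) bits. The four-symbol alphabet, its probabilities, and the function names below are assumptions chosen for the example and do not come from any of the systems discussed here.

```python
import math


def ideal_code_lengths(probabilities):
    """Return the ideal codeword length, in bits, for each symbol.

    The ideal length for a symbol of probability p is -log2(p).
    """
    return {sym: -math.log2(p) for sym, p in probabilities.items() if p > 0}


def expected_length(probabilities, lengths):
    """Average number of bits per symbol under the given length assignment."""
    return sum(p * lengths[sym] for sym, p in probabilities.items() if p > 0)


# Hypothetical four-symbol alphabet with a skewed distribution.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = ideal_code_lengths(probs)
average_bits = expected_length(probs, lengths)
```

With these assumed probabilities, the most probable symbol receives a 1-bit codeword and the least probable symbols receive 3-bit codewords, for an average of 1.75 bits per symbol, compared with the 2 bits per symbol that a fixed-length code for four symbols would require.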
It would be possible to calculate a frequency-of-occurrence distribution by accumulating a quantity of data, calculating overall probabilities for each symbol in the data, and encoding the data in bulk before transmission. However, it is preferable that the probability of occurrence of a symbol of an alphabet in a symbol stream be predicted causally, i.e., based on occurrences of symbols prior to the symbol to be encoded, but not on occurrences of symbols subsequent to the symbol to be encoded. This is because coded symbols are decoded as they are received. At the time of arrival of a given coded symbol, only the previously received coded symbols are available to the receiver.
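A minimal sketch of causal prediction, under stated assumptions, is given below. Each symbol's probability is estimated from the counts of the symbols that preceded it, and the model is updated only afterward, so a receiver that has decoded the same preceding symbols can form the same estimate. The class name, the function name, and the choice to start every count at one are illustrative assumptions, not features of any particular prior system.

```python
from collections import Counter


class CausalFrequencyModel:
    """Running symbol counts, updated only after a symbol has been coded."""

    def __init__(self, alphabet):
        # Assumption: every count starts at 1 so that no symbol ever has
        # zero probability before it has been seen.
        self.counts = Counter({s: 1 for s in alphabet})
        self.total = len(self.counts)

    def probability(self, symbol):
        return self.counts[symbol] / self.total

    def update(self, symbol):
        self.counts[symbol] += 1
        self.total += 1


def causal_probabilities(stream, alphabet):
    """Estimate each symbol's probability from the symbols before it."""
    model = CausalFrequencyModel(alphabet)
    estimates = []
    for symbol in stream:
        estimates.append(model.probability(symbol))  # uses only the past
        model.update(symbol)  # then account for the symbol just coded
    return estimates


estimates = causal_probabilities("aaab", "ab")
```

Because no estimate depends on later symbols, the estimates for the first three symbols of "aaab" are identical to those obtained for the stream "aaa" alone.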
Generally, the probability of an event or occurrence is initially determined as an estimate from previous data or an initial estimate based on intuition, mathematics, assumptions, statistics collections, or the like. The predictive value of the initial estimate is then measured by subsequently occurring events.
In some prior technology, probabilities of occurrence for the symbols are determined before data processing, and then remain fixed. Data are encoded in accordance with these fixed probabilities. These systems have the drawback that results obtained are based on probability values which generally do not reflect the actual rates of occurrence of the characters, since these rates vary with the positions in the data stream of the characters. As a consequence, the data stream is not coded with optimal efficiency.
One prior art data compression system relies on the frequency of symbol occurrence in both an encoding device and a decoding device. In Kenemuth, U.S. Pat. No. 4,516,246, "Data Compression System", a series of characters in a data stream is encoded in compressed form by using a histogram of a sample of a symbol stream to determine the frequency of symbols in the stream. The method is adaptive, inasmuch as the frequency, or histogram, is reevaluated upon the arrival of a new symbol. However, this method has the shortcoming that encoding is based on a frequency distribution within a past interval of fixed size. This limits the sensitivity to symbol trends or tendencies.
Other prior art adaptive data compression systems rely on an adaptation of the rate of probability estimation. In Duttweiler, U.S. Pat. No. 5,028,258, "Adaptive Probability Estimator for Entropy Encoding/Decoding", more accurate probability estimates are obtained by controllably adjusting the adaptation rate of the probability estimator. That is, an adaptation rate is optimized by matching it to the actual probability values being estimated. In particular, the adaptation rate is optimized to be proportional to the inverse of the smallest probability value being estimated. This method is also adaptive to the extent that the probability estimation of the lag symbol occurrences is variable. However, since the probability estimations are based on the relative frequencies of past symbol occurrences, but not on their distribution over time, actual trends in symbol occurrence frequency may still not be anticipated.
In the related Chamzas et al., U.S. Pat. No. 5,023,611, "Entropy Encoder/Decoder Including a Context Extractor", more accurate probability estimates are obtained by adjusting the configuration of the interval of a context extractor used to define the context of symbols. While this method improves on the fixed-interval method, it must necessarily ignore the effects of the non-chosen intervals at each point in time.
Therefore, to overcome the drawbacks of the conventional techniques described above, a data compression technique for use in a data communications environment should preferably weight the occurrences of the symbols so that trends in symbol occurrences are taken into account in calculating probabilities, preferably giving heavier weight to occurrences in the more recent past. Such a technique should be able to efficiently compute the best code assignments for all possible symbols based upon the distribution of symbols which have already been transmitted. It should be able to detect changes in the frequency distribution of symbols in a symbol stream and adapt its encoding scheme accordingly.
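One way such recency weighting might be realized, offered purely as an illustrative sketch and not as the technique of any particular system, is to decay every symbol's accumulated weight by a constant factor each time a new symbol arrives, so that older occurrences contribute progressively less. The class name, the exponential form of the decay, and the decay factor of 0.5 are all assumptions made for the example.

```python
class RecencyWeightedModel:
    """Symbol weights in which older occurrences count progressively less."""

    def __init__(self, alphabet, decay=0.9):
        # Assumption: exponential decay with a constant factor; other
        # weighting functions are equally possible.
        self.decay = decay
        self.weights = {s: 1.0 for s in alphabet}

    def probability(self, symbol):
        return self.weights[symbol] / sum(self.weights.values())

    def update(self, symbol):
        # Age every accumulated weight, then credit the newest occurrence.
        for s in self.weights:
            self.weights[s] *= self.decay
        self.weights[symbol] += 1.0


model = RecencyWeightedModel("ab", decay=0.5)
for symbol in "aaaabbbb":
    model.update(symbol)
p_b_recent = model.probability("b")
```

In this example the stream shifts from "a" to "b" halfway through. An unweighted count over the whole stream would rate the two symbols as equally likely, whereas the recency-weighted estimate for "b" exceeds 0.9, reflecting the change in the frequency distribution.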