Data compression can reduce the cost of storing large data files on computers, as well as the time for transmitting large data files between computers. The compression may be lossless, lossy, or a combination of the two. Lossy compression may be performed without significantly affecting the quality of the data that will eventually be reconstructed. Lossless compression may be performed without affecting the quality of the reconstructed data at all.
A process called “entropy coding” is fundamental to data compression. Generally, this process is modeled by first defining a data source that provides data symbols Si belonging to the set {0, 1, 2, . . . , Mi−1} (the source alphabet) for integer indexes i=0, 1, 2, . . . ; and then converting the data symbols to a set of bits (e.g., a binary alphabet). An objective of the entropy coding is to minimize the number of bits required to represent the data symbols uniquely, without any loss of information.
One type of entropy coding is “prefix” coding. Prefix coding involves assigning an integer number of bits to each coded data symbol. Prefix codes have the property that no ambiguity about their code words is created by their concatenation. As bits are read sequentially by a decoder, the decoder always knows when it reaches the end of a code word and can output a decoded data symbol. These codes are best represented by a tree structure, which guarantees the prefix property and also permits visual interpretation of the code's properties. Every uniquely-decodable code can be translated into a prefix code with the same compression properties. The tree coding can be computationally efficient with table lookup (“TLU”) for both encoding and decoding.
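The sequential decoding property described above can be illustrated with a minimal sketch. The four-symbol code table `CODE` and the function name `decode_prefix` are assumptions chosen for illustration; they are not taken from the source. The decoder accumulates bits until they match a code word, which is unambiguous only because no code word is a prefix of another.

```python
def decode_prefix(bits, code_table):
    """Decode a string of '0'/'1' characters using a prefix code.

    code_table maps each symbol to its code word (hypothetical example table).
    """
    inverse = {word: sym for sym, word in code_table.items()}
    symbols, current = [], ""
    for b in bits:
        current += b
        # The prefix property guarantees the first match is the only match.
        if current in inverse:
            symbols.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return symbols


# Illustrative table: no code word is a prefix of any other.
CODE = {0: "0", 1: "10", 2: "110", 3: "111"}
encoded = "".join(CODE[s] for s in [2, 0, 1, 3, 0])
decoded = decode_prefix(encoded, CODE)
```

A production decoder would typically index a lookup table with several bits at once rather than test one bit at a time; the dictionary here only keeps the sketch short.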
In practical applications, the conditional probabilities of the M data symbols are not known a priori. The computational complexity to determine an optimal tree code is proportional to M log(M) in the worst case, and proportional to M in the best case. Thus, when M is large, it is not practical to compute new optimal codes frequently, which leads to a loss in compression.
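To make the M log(M) cost concrete, the sketch below computes optimal code word lengths with Huffman's algorithm, one well-known way to build an optimal tree code. The source does not name a specific construction, so the choice of Huffman's algorithm, the function name `huffman_code_lengths`, and the example weights are all assumptions. Each of the M−1 merges performs a constant number of heap operations costing O(log M).

```python
import heapq


def huffman_code_lengths(weights):
    """Return optimal prefix-code lengths for symbols with the given weights."""
    m = len(weights)
    if m == 1:
        return [1]
    # Leaves get node ids 0..m-1; internal nodes get ids m, m+1, ...
    heap = [(w, i) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    parent = {}
    next_id = m
    while len(heap) > 1:
        w1, a = heapq.heappop(heap)  # two least probable nodes
        w2, b = heapq.heappop(heap)
        parent[a] = parent[b] = next_id
        heapq.heappush(heap, (w1 + w2, next_id))
        next_id += 1
    root = next_id - 1
    depth = {root: 0}
    # A parent always has a larger id than its children, so walking ids
    # downward visits every parent before its children.
    for node in range(root - 1, -1, -1):
        depth[node] = depth[parent[node]] + 1
    return [depth[i] for i in range(m)]


lengths = huffman_code_lengths([0.5, 0.25, 0.125, 0.125])
```

Rebuilding such a code every time the probability estimates change repeats this whole procedure, which is the cost the paragraph above refers to.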
Moreover, table look-up is very fast only when the tables are not too large (preferably, when they can fit in CPU fast cache, or are accessed sequentially). If M is large, then the amount of memory to store the estimates, codes, and tables, also becomes prohibitively large.
A type of prefix code called “Golomb codes” is optimal for certain common data symbol distributions. Each Golomb code is defined uniquely by a positive integer number m.
Golomb codes are defined for an infinite number of symbols. This is an advantage when working with large alphabets, the exact size of which is unknown (which is typically the case with one-pass coding). The code words for the most frequent symbols can be stored in tables, while the codes for the improbable symbols can be generated automatically.
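As a minimal sketch of how Golomb code words can be generated on demand for any non-negative symbol, the function below uses the common construction of a unary quotient followed by a truncated-binary remainder. The function name `golomb_encode` and the bit convention (ones terminated by a zero for the unary part) are assumptions; the source does not spell out the bit layout.

```python
def golomb_encode(n, m):
    """Return the Golomb code word for symbol n >= 0 with parameter m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("require n >= 0 and m >= 1")
    q, r = divmod(n, m)
    prefix = "1" * q + "0"          # quotient in unary
    b = (m - 1).bit_length()        # ceil(log2(m)); 0 when m == 1
    cutoff = (1 << b) - m           # remainders below this use b-1 bits
    if r < cutoff:
        width, value = b - 1, r
    else:
        width, value = b, r + cutoff
    suffix = format(value, "b").zfill(width) if width > 0 else ""
    return prefix + suffix


# Code words for the first few symbols with m = 3.
words = [golomb_encode(n, 3) for n in range(5)]
```

Because the function works for any n, a table would only need to hold the code words for the most frequent symbols, with the rest computed when they occur.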
In the special cases when m=2^k (for k=0, 1, 2, . . . ), code words can be generated for all possible values using exactly k bits, which is quite advantageous in practical applications. These particular codes are called Rice-Golomb codes.
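When m is a power of two, the division and remainder reduce to a shift and a mask, and the remainder always occupies exactly k bits. The sketch below assumes the same unary convention as the example in the text (ones terminated by a zero); the names `rice_encode` and `rice_decode` are hypothetical.

```python
def rice_encode(n, k):
    """Rice-Golomb code word for symbol n >= 0 with m = 2**k."""
    q = n >> k                       # quotient by shift
    prefix = "1" * q + "0"           # unary part
    if k == 0:
        return prefix
    r = n & ((1 << k) - 1)           # remainder by mask
    return prefix + format(r, "0{}b".format(k))  # exactly k bits


def rice_decode(bits, k):
    """Decode a single Rice-Golomb code word back to its symbol."""
    q = bits.index("0")              # count leading ones
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | r


word = rice_encode(9, 2)
symbol = rice_decode(word, 2)
```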
A problem with the use of Golomb and Rice-Golomb codes is that they are not resilient to estimation errors in the code parameter m. The coding inefficiency grows too fast around an optimal point, so there is a large penalty (in bits) whenever the parameter m is not correctly estimated. For example, suppose the classification function estimates that the best Golomb code parameter is m=1. The symbol Si=300 will be coded using 300 bits equal to 1, followed by a bit equal to 0. This is very inefficient.
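The size of this penalty can be checked with a short calculation. The sketch below assumes a Rice-Golomb code with m = 2^k laid out as a unary quotient (ones terminated by a zero) followed by k remainder bits, so m=1 corresponds to k=0; the helper name `rice_code_length` is hypothetical.

```python
def rice_code_length(n, k):
    """Bits used to code symbol n with a Rice-Golomb code where m = 2**k."""
    return (n >> k) + 1 + k  # unary quotient, terminating zero, k remainder bits


symbol = 300
lengths = {k: rice_code_length(symbol, k) for k in range(10)}
best_k = min(lengths, key=lengths.get)  # first k that reaches the minimum
```

Under these assumptions, the symbol 300 costs 301 bits with k=0 but only 10 bits with k=7, which shows how sharply the cost rises when the parameter is underestimated.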
According to one aspect of the present invention, a prefix code set is defined by a set of parameters that define a number of code words in an initial class and the growth of at least one additional class of code words. The code is resilient to initial estimation errors in the code parameters.
Other aspects and advantages of the present invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the present invention.