A sequence of symbols, wherein the symbols are chosen from an alphabet or a symbol set, can be compressed by entropy coding. An entropy coding engine assigns codewords to symbols based on a statistical model, i.e., the probability distribution of the symbols. In general, more frequently occurring symbols are entropy coded with fewer bits and less frequently occurring symbols are entropy coded with more bits.
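By way of a non-limiting illustration, the relationship between symbol probability and code length may be sketched as follows. The sketch assumes the statistical model is simply the empirical frequency of each symbol in the sequence; the function names `entropy_bits_per_symbol` and `ideal_code_lengths` are hypothetical and chosen for this example only.

```python
import math
from collections import Counter


def ideal_code_lengths(sequence):
    """Map each symbol to its ideal code length, -log2(p), in bits.

    Assumption: the probability p of a symbol is its relative
    frequency in `sequence`.
    """
    counts = Counter(sequence)
    total = len(sequence)
    return {s: -math.log2(c / total) for s, c in counts.items()}


def entropy_bits_per_symbol(sequence):
    """Return the empirical entropy of `sequence` in bits per symbol.

    This is the probability-weighted average of the ideal code
    lengths, and is a lower bound on the average codeword length
    of any lossless symbol code under the same model.
    """
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    seq = "aaaabbcd"
    print(ideal_code_lengths(seq))       # frequent symbols get shorter lengths
    print(entropy_bits_per_symbol(seq))  # average bits per symbol
```

For the sequence "aaaabbcd", the symbol "a" occurs half of the time and has an ideal length of 1 bit, while "c" and "d" each occur one eighth of the time and have an ideal length of 3 bits, giving an entropy of 1.75 bits per symbol.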
Entropy coding has been studied for decades. Broadly, there are three types of entropy coding methods: variable length coding (VLC), such as Huffman coding; arithmetic coding; and dictionary-based compression, such as Lempel-Ziv (LZ) compression or Lempel-Ziv-Welch (LZW) compression.
VLC codes use an integer number of bits to represent each symbol. Huffman coding is the most widely used VLC method. It assigns fewer bits to a symbol with a greater probability and more bits to a symbol with a smaller probability. Huffman coding achieves the entropy exactly when the probability of each symbol is an integer power of ½. Arithmetic coding can allocate a fractional number of bits to each symbol, so that it can approach the entropy more closely when the probabilities are not of that form. Huffman coding and arithmetic coding have been widely used in existing image/video compression standards, e.g., JPEG, MPEG-2, and H.264/AVC. LZ and LZW compression utilize a table-based compression model in which table entries are substituted for repeated strings of data. For most LZ methods, the table is generated dynamically from earlier input data. Dictionary-based algorithms have been employed in, for example, the GIF, ZIP, and PNG formats.
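By way of a non-limiting illustration, two of the methods described above may be sketched as follows. The first function builds Huffman code lengths from a given probability table. The second is a simplified LZW encoder that builds its table dynamically from earlier input. Both are textbook-style sketches and are not the implementation of any particular standard. The function names are hypothetical, the LZW table is assumed to be initialized with the 256 single-character entries for code points 0 to 255, and no limit on table size is modeled.

```python
import heapq
import math


def huffman_code_lengths(probs):
    """Return {symbol: code length in bits} for a Huffman code.

    `probs` maps each symbol to its probability (assumed to sum to 1).
    """
    if not probs:
        return {}
    if len(probs) == 1:
        # A single symbol still needs one bit per occurrence.
        return {next(iter(probs)): 1}
    # Heap entries: (probability, unique tie-breaker, {symbol: depth}).
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees; every symbol in
        # them moves one level deeper, i.e. gains one bit.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, tie, merged))
        tie += 1
    return heap[0][2]


def average_length(probs, lengths):
    """Expected number of bits per symbol for the given code lengths."""
    return sum(probs[s] * lengths[s] for s in probs)


def entropy(probs):
    """Entropy of the distribution in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def lzw_compress(data):
    """Encode a string as a list of LZW table indices.

    Assumption: every character of `data` has a code point below 256.
    """
    table = {chr(i): i for i in range(256)}
    current = ""
    out = []
    for ch in data:
        candidate = current + ch
        if candidate in table:
            # Keep extending the match against earlier input.
            current = candidate
        else:
            # Emit the longest known string, then learn a new one.
            out.append(table[current])
            table[candidate] = len(table)
            current = ch
    if current:
        out.append(table[current])
    return out


if __name__ == "__main__":
    dyadic = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    lengths = huffman_code_lengths(dyadic)
    print(lengths, average_length(dyadic, lengths), entropy(dyadic))

    skewed = {"a": 0.9, "b": 0.1}
    lengths = huffman_code_lengths(skewed)
    print(lengths, average_length(skewed, lengths), entropy(skewed))

    print(lzw_compress("ABABABA"))
```

For the first distribution, in which every probability is an integer power of ½, the Huffman code lengths are 1, 2, 3, and 3 bits and the average length equals the entropy of 1.75 bits per symbol. For the second distribution, each symbol still requires a whole bit, so the average length of 1 bit per symbol exceeds the entropy; this is the gap that a fractional-bit method such as arithmetic coding can narrow. In the LZW example, the seven-character input "ABABABA" is encoded as four table indices, since the repeated strings "AB" and "ABA" are replaced by entries learned from earlier input.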