Modern information sources, such as imaging data, non-imaging data, and voice data, are often digitized to a fixed quantization level, for example 8 or 12 bits/sample. The digitized data are then further processed, modulated for transmission, or stored on mass media. The actual information content of such a source can be much lower than its quantization level. Efficient coding of these sources, which converts the source digital representation into another digital representation, can reduce the total data rate tremendously.
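The gap between quantization level and information content can be illustrated with a short calculation. The following is a minimal sketch, not part of the described scheme: the function name `entropy_bits_per_sample` and the sample distribution are hypothetical, chosen only to show a source stored at 8 bits/sample whose empirical (zeroth-order) entropy is far lower.

```python
import math
from collections import Counter


def entropy_bits_per_sample(samples):
    """Empirical zeroth-order entropy of a sequence, in bits per sample."""
    counts = Counter(samples)
    total = len(samples)
    # H = -sum(p * log2(p)) over the symbols that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical low-activity source: mostly zeros, stored at 8 bits/sample.
samples = [0] * 90 + [1] * 8 + [2] * 2
quantization_bits = 8

h = entropy_bits_per_sample(samples)
print(f"quantization: {quantization_bits} bits/sample")
print(f"entropy:      {h:.2f} bits/sample")
```

For this made-up distribution the entropy is roughly half a bit per sample, which is the kind of low-entropy regime the remainder of this section is concerned with.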
In a distortionless source coding scheme, also known as lossless data compression, data reduction is achieved by removing the redundancy in the data and efficiently representing the information with codewords. An optimal coding scheme will produce an expected codeword length close to the information measure, that is, the entropy of the data source. The Huffman code is known to be an optimal block code for a source with known statistics. For actual data sources, whose statistics vary, several Huffman codebooks must be used to adapt optimally to the source variation. In 1979, R. F. Rice proposed a scheme that adapts effectively to the source without the need to store codebooks. This scheme includes an option, ψ₀, for operation at low source information content. This option first generates the comma code of a block of samples, then groups the resulting bits three at a time and codes each group with a Huffman code. A run-length code followed by a Huffman code has been used in products such as fax machines at low source entropy rates. Another scheme with potential for coding at low entropy rates is arithmetic coding.
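The first two steps of the low-entropy option described above (comma coding a block, then grouping the bits in threes) can be sketched as follows. This is an illustration under stated assumptions, not Rice's implementation: the comma code is assumed to represent a non-negative sample value m as m zeros followed by a terminating one, the final group is assumed to be zero-padded, and the function names are hypothetical. The Huffman table applied to each three-bit group is not given in the text and is therefore not reproduced.

```python
def comma_code(value):
    """Assumed convention: `value` zeros followed by a terminating one."""
    return "0" * value + "1"


def comma_code_block(block):
    """Concatenate the comma codes of every sample in a block."""
    return "".join(comma_code(v) for v in block)


def group_bits(bits, size=3, pad="0"):
    """Split a bit string into fixed-size groups.

    Padding the last group with zeros is an assumption of this sketch.
    """
    remainder = len(bits) % size
    if remainder:
        bits += pad * (size - remainder)
    return [bits[i:i + size] for i in range(0, len(bits), size)]


# Hypothetical low-entropy block of eight samples.
block = [0, 0, 1, 0, 2, 0, 0, 0]
bits = comma_code_block(block)
groups = group_bits(bits)

print(bits)    # 11011001111
print(groups)  # ['110', '110', '011', '110']
```

Note that the whole block must be comma coded before grouping can begin, which is why an intermediate buffer is needed in this approach.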
The ψ₀ option in the Rice algorithm requires an intermediate buffer to hold the comma codes of a block of samples, and the subsequent coding requires a Huffman table to generate the codewords. Decoding is more complicated because the Huffman table lacks structure. A run-length code is effective when the input symbols have only two levels. For symbols with multiple levels, as is typical of imaging data, the additional complexity of coding the levels decreases the efficiency of the code. Arithmetic coding, which promises codeword lengths closer to the entropy than the Huffman code, either requires tables of the probability distribution of the data or adapts slowly to variations in the data statistics.
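The difference between two-level and multi-level input for run-length coding can be seen in a small example. The sketch below is illustrative only; the function name `run_lengths` and the sample data are hypothetical, and no particular product's run-length format is implied.

```python
from itertools import groupby


def run_lengths(symbols):
    """Return (symbol, run length) pairs for consecutive equal symbols."""
    return [(sym, sum(1 for _ in run)) for sym, run in groupby(symbols)]


# Two-level input: runs alternate between 0 and 1, so after the first
# symbol is known only the run lengths need to be coded.
binary = [0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0]
binary_runs = run_lengths(binary)
binary_lengths_only = [length for _, length in binary_runs]

# Multi-level input: every run must also carry its level, which is the
# additional coding cost referred to above.
multi = [3, 3, 4, 7, 7, 7, 2]
multi_runs = run_lengths(multi)

print(binary_lengths_only)  # [5, 1, 3, 2, 1]
print(multi_runs)           # [(3, 2), (4, 1), (7, 3), (2, 1)]
```

In the two-level case the list of lengths alone suffices to reconstruct the input, whereas in the multi-level case each length is paired with a level value that must be coded as well.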