Arithmetic coding is an entropy coding scheme that addresses certain shortcomings of other current encoding methods, such as Huffman coding. For example, such methods spend an integral number of bits on each element of data to be encoded. However, elements with nonintegral entropy require a nonintegral number of bits in the code stream to achieve optimal compression. In addition, the probabilities for each element to be encoded can vary based on a coding context (e.g., the contents of neighboring elements or recently processed elements). One method of addressing the varying probabilities employs a separate coding table for each context to model the conditional probability properly. However, as the number of contexts rises, the inefficiencies of this approach also increase.
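The point about nonintegral entropy can be checked numerically. The following Python sketch is illustrative only: the two-symbol source and its probabilities (0.9 and 0.1) are assumed values chosen for the example, not figures from this description. It compares the entropy of that source with the one-bit-per-symbol floor of any code that assigns a whole number of bits to each element.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits per symbol for a list of symbol probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example source: a heavily skewed binary element.
probs = [0.9, 0.1]

h = entropy_bits(probs)

# A code that spends a whole number of bits per element cannot use fewer
# than 1 bit per element, however skewed the probabilities are.
integral_code_bits = 1.0

print(f"entropy: {h:.3f} bits/element")
print(f"whole-bit code: {integral_code_bits:.3f} bits/element")
print(f"overhead: {integral_code_bits / h:.2f}x")
```

For this assumed source the entropy is roughly 0.47 bits per element, so a whole-bit code spends about twice the bits that an ideal entropy coder would.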
Furthermore, the probabilities for each element may vary significantly over time and thus require adaptive, dynamic modifications, which can be expensive in terms of time and/or hardware resources. Arithmetic coding matches the entropy of the input stream more closely and addresses the issues outlined above, but it introduces implementation difficulties of its own.
Most straightforward implementations of arithmetic coding (particularly those implemented in hardware) require that all of the elements to be coded be binary elements. This generally requires that each potentially multi-bit symbol be ‘binarized’ into a stream of binary digits (bits, or ‘bins’ in the parlance of the H.264 standard). Furthermore, most hardware implementations code only one bit per clock cycle, and in some cases fewer when multi-bit re-normalization is required.
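As a rough illustration of binarization, the Python sketch below maps non-negative integer symbols to bins using a unary scheme (a run of ones terminated by a zero). Unary binarization is only one of several possible schemes and is used here as an assumption for the example; the function names are hypothetical. The sketch also counts how many clock cycles a coder limited to one bin per cycle would need, ignoring any extra cycles for re-normalization.

```python
def unary_binarize(value):
    """Return the unary bin string for a non-negative integer:
    `value` ones followed by a single terminating zero."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return [1] * value + [0]

def cycles_at_one_bin_per_clock(symbols):
    """Clock cycles needed when exactly one bin is coded per cycle
    (re-normalization stalls are not modeled)."""
    return sum(len(unary_binarize(s)) for s in symbols)

# Assumed example symbols, not taken from any real stream.
symbols = [0, 3, 1]

for s in symbols:
    print(s, unary_binarize(s))

print("cycles:", cycles_at_one_bin_per_clock(symbols))
```

Even this small example shows why multi-bit symbols are costly for a one-bin-per-cycle coder: three symbols expand to seven bins, and therefore to at least seven cycles.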
For some coding standards, the worst-case (highest) number of bits supplied to an arithmetic encoder, or produced by a corresponding arithmetic decoder, can be quite large. For example, an apparatus that processes video data according to the H.264 standard and runs at a clock rate of 200 MHz may be required to process 10-20 bits per clock cycle in the worst case to keep up with real-time requirements. However, typical implementations handle, at best, one bit per clock cycle.
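The size of this gap follows directly from the figures above. The Python sketch below simply multiplies them out, using the 200 MHz clock and the 10-20 bits per cycle stated in the text; the function names are hypothetical, and the one-bit-per-cycle coder is the typical implementation described above.

```python
CLOCK_HZ = 200_000_000  # 200 MHz clock rate from the example above

def required_bit_rate(clock_hz, bits_per_cycle):
    """Worst-case bits per second that must pass through the coder."""
    return clock_hz * bits_per_cycle

def shortfall_factor(required_bits_per_cycle, coder_bits_per_cycle=1):
    """How many times faster than the coder the worst case demands."""
    return required_bits_per_cycle / coder_bits_per_cycle

for bits_per_cycle in (10, 20):
    rate = required_bit_rate(CLOCK_HZ, bits_per_cycle)
    factor = shortfall_factor(bits_per_cycle)
    print(f"{bits_per_cycle} bits/cycle: {rate / 1e9:.1f} Gbit/s required, "
          f"{factor:.0f}x a one-bit-per-cycle coder")
```

Under these numbers, the worst case calls for 2 to 4 Gbit/s, while a coder limited to one bit per cycle at the same clock delivers at most 0.2 Gbit/s.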