Data compression is an extremely useful tool for storing and transmitting large amounts of data. Huffman coding is a well-known technique for encoding information with as few bits as possible. Reducing the number of bits is a critical part of any system that requires compression of the information in order to conserve resources (memory, bandwidth, etc.), particularly with systems that process image data such as computer graphics and digital video. For example, the bandwidth required to transmit an image is reduced drastically when compression is used to decrease the number of bits required to recreate the image.
In Huffman coding, each possible input symbol is mapped to a variable length codeword. The length of each codeword is inversely related to the probability of occurrence of the associated symbol it represents. Thus, a frequently occurring input symbol is represented by a codeword with only a few bits while an input symbol that occurs infrequently is represented by a codeword consisting of many more bits.
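As an illustration of the relationship between symbol probability and codeword length, the following minimal sketch builds Huffman codeword lengths by repeatedly merging the two least frequent subtrees. The function name `huffman_code_lengths` and the symbol frequencies are hypothetical and chosen only for this example; the sketch is not taken from any particular standard.

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Return {symbol: codeword length in bits} for a dict of symbol frequencies."""
    tiebreak = count()  # unique counter so the heap never has to compare dicts
    heap = [(f, next(tiebreak), {s: 0}) for s, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one bit to be represented.
        return {s: 1 for s in heap[0][2]}
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)  # least frequent subtree
        f2, _, b = heapq.heappop(heap)  # second least frequent subtree
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Hypothetical symbol frequencies, for illustration only.
lengths = huffman_code_lengths({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print(lengths)
```

With these assumed frequencies, the most frequent symbol "a" receives a 1-bit codeword while the least frequent symbols "e" and "f" receive 4-bit codewords.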
Huffman coding is often used in conjunction with other algorithms such as the Discrete Cosine Transform (DCT) or wavelet transforms, and it is very effective for compressing image information. In fact, Huffman coding is part of the Joint Photographic Experts Group (JPEG) standard and the Moving Picture Experts Group (MPEG) standard.
The variable length codewords in the Huffman code limit decoding throughput. Because the codewords are of variable length, the boundaries between individual codewords cannot be identified prior to decoding. Thus, codewords cannot be decoded independently of one another, since the position of each codeword depends on the lengths of all of the preceding codewords. To decode codeword N, it is necessary to decode the preceding codewords 1 through N-1 before codeword N can be located and subsequently decoded.
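The serial dependency described above can be sketched as follows. The prefix code in `CODE` and the function name `decode_sequential` are assumptions made for illustration. The sketch records the bit position at which each codeword ends, showing that the start of a codeword is known only after every earlier codeword has been decoded.

```python
# Hypothetical prefix code (codeword -> symbol), for illustration only.
CODE = {"0": "a", "10": "b", "110": "c", "111": "d"}
MAX_LEN = max(len(word) for word in CODE)

def decode_sequential(bits):
    """Decode a bit string one codeword at a time.

    Returns the decoded symbols and the bit offset at which each codeword ends.
    """
    symbols, boundaries = [], []
    pos = 0
    while pos < len(bits):
        # Try successively longer prefixes starting at the current boundary.
        for length in range(1, MAX_LEN + 1):
            word = bits[pos:pos + length]
            if word in CODE:
                symbols.append(CODE[word])
                pos += length  # the next boundary is known only now
                boundaries.append(pos)
                break
        else:
            raise ValueError(f"no codeword matches at bit offset {pos}")
    return symbols, boundaries

symbols, boundaries = decode_sequential("0101100111")
print(symbols, boundaries)
```

In this example the fifth codeword begins at bit offset 7, a position that cannot be determined without first decoding the four codewords before it.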
Designing a very large scale integrated circuit (VLSI) based Huffman decoder requires that circuit size and speed also be considered. A straightforward solution that can process one codeword per cycle consists primarily of two components: a decoder capable of handling all possible codewords and a shifter that can shift as many bits as the longest codeword. Input data is presented to this circuit in parallel and is at least as wide as the longest codeword. In every cycle, the decoder determines the current codeword and the shifter shifts that codeword out. This solution is guaranteed to process one codeword per cycle. The problem is that, given the size of the decoder and the shifter, performing the decode and shift in a single cycle becomes the critical timing path and limits clock speed. Another drawback is that a significant portion of the hardware in this solution is used very infrequently, since most of the codewords occur very infrequently.
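A software model of this decode-and-shift arrangement, offered only as a rough sketch, is given below. The prefix code, the table layout, and the names `build_decoder_table` and `decode_stream` are assumptions; the number of symbols to decode is assumed to be known in advance. The table plays the role of the decoder that handles all possible codewords, and advancing `pos` plays the role of the shifter.

```python
# Hypothetical prefix code (symbol -> codeword), for illustration only.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
MAX_LEN = max(len(word) for word in CODE.values())

def build_decoder_table(code, max_len):
    """Map every max_len-bit window to (symbol, codeword length)."""
    table = [None] * (1 << max_len)
    for symbol, word in code.items():
        pad = max_len - len(word)
        base = int(word, 2) << pad
        # Every window that starts with this codeword decodes to its symbol.
        for low in range(1 << pad):
            table[base | low] = (symbol, len(word))
    return table

def decode_stream(bits, n_symbols):
    """Decode n_symbols codewords, one per simulated cycle."""
    table = build_decoder_table(CODE, MAX_LEN)
    stream = bits + "0" * MAX_LEN  # padding keeps the window full at the end
    pos, out = 0, []
    for _cycle in range(n_symbols):
        window = int(stream[pos:pos + MAX_LEN], 2)  # input as wide as the longest codeword
        symbol, length = table[window]              # decode the current codeword
        out.append(symbol)
        pos += length                               # shift the decoded codeword out
    return out

decoded = decode_stream("0101100111", 5)
print(decoded)
```

The table has one entry per possible window, so its size doubles with each additional bit in the longest codeword, and each cycle's shift amount depends on that cycle's lookup result.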
One prior art software decoder includes a small lookup table (LUT) followed by a larger LUT. As codewords are received, the small LUT is used to decode them where possible; codewords that cannot be decoded using the small LUT are decoded by the larger LUT.
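One possible form of such a two-table decoder is sketched below. The source does not describe the prior art tables in detail, so the table widths (`SMALL_BITS`, `MAX_LEN`), the prefix code, and the function names are all assumptions made for illustration. Short codewords are resolved by the small table, and the larger table is consulted only when the small table has no entry.

```python
# Hypothetical prefix code (symbol -> codeword), for illustration only.
CODE = {"a": "0", "b": "10", "c": "110", "d": "1110", "e": "1111"}
SMALL_BITS = 2                                  # assumed width of the small LUT index
MAX_LEN = max(len(w) for w in CODE.values())    # width of the larger LUT index

def build_table(code, width):
    """Map each width-bit window to (symbol, length); None if the codeword is longer."""
    table = [None] * (1 << width)
    for symbol, word in code.items():
        if len(word) > width:
            continue  # codeword does not fit in this table
        pad = width - len(word)
        base = int(word, 2) << pad
        for low in range(1 << pad):
            table[base | low] = (symbol, len(word))
    return table

SMALL_LUT = build_table(CODE, SMALL_BITS)
LARGE_LUT = build_table(CODE, MAX_LEN)

def decode_two_level(bits, n_symbols):
    """Decode n_symbols codewords; also count how often the larger LUT was needed."""
    stream = bits + "0" * MAX_LEN  # padding keeps lookups in range at the end
    pos, out, large_hits = 0, [], 0
    for _ in range(n_symbols):
        entry = SMALL_LUT[int(stream[pos:pos + SMALL_BITS], 2)]
        if entry is None:
            # Fall back to the larger table for longer codewords.
            entry = LARGE_LUT[int(stream[pos:pos + MAX_LEN], 2)]
            large_hits += 1
        symbol, length = entry
        out.append(symbol)
        pos += length
    return out, large_hits

decoded, large_hits = decode_two_level("010111001101111", 6)
print(decoded, large_hits)
```

In this sketch the frequent symbols "a" and "b" never touch the larger table, while "c", "d", and "e" each require a second lookup.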
Thus, what is needed is a way to decode more than one codeword per cycle while removing the decode from the critical path so as to not impede clock speed.