The present invention relates variable length codes. More specifically, the present invention relates to encoding variable length codes so that the variable length codes can be decoded without using conditional branching.
Variable length codes are most often used in various compression techniques to reduce the number of bits needed to store or transmit information formed from a set of symbols. For example, MPEG encoding uses variable length codes to reduce the number of bits required in a video stream. Variable length codes reduce the number of bits required for information by replacing frequently used symbols with short binary codes and only using long binary codes for infrequently used symbols. In general, most useful variable length codes are also prefix codes, i.e. no subset of any leading bits of a codeword is equivalent to another codeword.
FIGS. 1(a)-1(b) illustrates some of the benefits and detriments of using a fixed length code as compared to a variable length code for information having a set of 4 symbols A, B, C, and D. In FIG. 1(a), table 100 provides a fixed length code in column 114 and a variable length code in column 116 for the symbols A, B, C, and D in column 112. Specifically, symbols A, B, C, and D are represented as xe2x80x9c00xe2x80x9d, xe2x80x9c01xe2x80x9d, xe2x80x9c10xe2x80x9d, and xe2x80x9c11xe2x80x9d, respectively, in the fixed length code. To avoid confusion, binary numbers representing codewords are written within quotation marks and a binary number not representing a codeword includes a xe2x80x9cbxe2x80x9d at the end. In the variable length code, symbols A, B, C, and D are represented as xe2x80x9c1xe2x80x9d, xe2x80x9c01xe2x80x9d, xe2x80x9c001xe2x80x9d, and xe2x80x9c000xe2x80x9d, respectively. As illustrated in table 150 of FIG. 1(b), information represented by the symbols ACAAABABDAA is encoded as the 22 bit binary string 0010000000010001110000b using the fixed length code of FIG. 1(a). However, symbols ACAAABABDAA ca be encoded as a 17 bit binary string 10011110110100011b using the variable length code of FIG. 1(a). Thus, variable length encoding can be used to reduce the number of bits needed to represent information. For information using larger sets of symbols greater size reduction can be achieved using variable length encoding as long as some symbols are used more often than others.
While variable length codes reduce the number of bits required to store or transmit information, decoding of variable length codes is more complicated than decoding of fixed length codes. As illustrated in FIG. 2(a) decoding a fixed length code can be easily accomplished because each codeword has a fixed length so that a binary input string can be easily divided into individual codewords. Specifically, in FIG. 2(a) binary input string 0010000000010001110000b is separated into 11 2 bit codewords 210-220. Codewords 210-220 are xe2x80x9c00xe2x80x9d, xe2x80x9c10xe2x80x9d, xe2x80x9c00xe2x80x9d, xe2x80x9c00xe2x80x9d, xe2x80x9c00xe2x80x9d, xe2x80x9c01xe2x80x9d, xe2x80x9c00xe2x80x9d, xe2x80x9c01xe2x80x9d, xe2x80x9c11xe2x80x9d, xe2x80x9c00xe2x80x9d, and xe2x80x9c00xe2x80x9d, respectively. Each Codewords is then translated using a simple lookup table such as table 250 FIG. 2(b). In table 250, the codewords are the index values used to retrieve the symbols. Specifically, in table 250, codewords xe2x80x9c00xe2x80x9d, xe2x80x9c01xe2x80x9d, xe2x80x9c10xe2x80x9d, and xe2x80x9c11xe2x80x9d are translated into symbols A, B, C, and D, respectively. Thus, codewords 210-220 of FIG. 2(a) are translated into symbols 230-240, respectively. Specifically, symbols 230-240 are A, C, A, A, A, B, A, B, D, A, and A, respectively.
A common way to decode variable length codes is to create a lookup table that is indexed using a subset of the leading bits of the binary input string. Typically, the size of this subset is equal to the size of the longest codeword (i.e., number of bits in the longest codeword). For example, FIG. 3(a) illustrates a lookup table 310 that can be used to decode the variable length code of FIG. 1(a). Lookup table 310 is indexed by a 3 bit binary number, which is formed by the a subset of the three leading bits from the binary input string. Lookup table 310 provides the symbol and the size of the codeword that corresponds to each of the 3 bit binary numbers. Specifically, binary numbers 000b, 001b, 010b, 011b, 100b, 101b, 110b, and 111b correspond to symbols D, C, B, B, A, A, A, and A, respectively. Similarly, 3 bit binary numbers 000b, 001b, 010b, 011b, 100b, 101b, 110b, and 111b correspond to codewords of sizes 3, 3, 2, 2, 1, 1, 1, and 1, respectively.
FIG. 3(b) illustrates the use of table 310 to decode the 17 bit binary input string 10011110110100011b (321 in FIG. 3(b)). First the subset of the 3 leading bits (100b) of binary input string 321 is used as an index value in lookup table 310 (FIG. 3(a)). With index value 100b, lookup table 310 provides symbol A for decoded word 331. Lookup table 310 also provides that the size of the codeword corresponding to 100b is only 1 bit. Thus, the first bit of binary input string 321 is xe2x80x9cconsumedxe2x80x9d in the decoding resulting in 16 bit binary input string 322 (0011110110100011b). The subset of the three leading bits of binary input string 322 (001b) is used as an index value in lookup table 310. With index value 0001b, lookup table 310 provides symbol C and a codeword size of 3 bits. Symbol C is added to decoded word 331 to form decoded word 332 (AC). Because the codeword size corresponding to index value 001b is 3, the first three bits of binary input string 322 are consumed and a 13 binary input string 323 (1110110100011b) remains to be decoded. The subset of the three leading bits of binary input string 323 (111b) is used as an index value to lookup table 310. With an index value of 111b, lookup table 310 provides symbol A and a codeword size of 1 bit. Symbol A is added to decoded word 332 to form decoded word 333 (ACA). Because the codeword size corresponding to 111b is 1, the first bit of binary input string 323 is consumed and a 12 bit binary input string 324 (1110110100011b) remains to be decoded. This process continues until all bits are consumed and the full decoded word (ACAAABABDAA) is obtained. As illustrated by lookup table 310, which includes eight entries, the size of the decoding lookup table is much larger than the number of symbols. Large lookup tables can reduce decoding performance in a variety of ways. For example, in a software implementation the decoding lookup table may not fit in the cache of a general purpose computer.
Another conventional method of decoding information encoded with variable length codes uses a primary lookup table with one or more secondary lookup tables. The primary lookup table is indexed using a first subset of leading bits of the binary input string that is less than the size of the largest codeword. Each set of bits is compared to one or more reserved values, which indicate that another secondary table should be used. Secondary tables can also include reserved values to indicate a tertiary table should be used. The secondary lookup tables are typically indexed using a second subset of bits following the first subset of leading bits from the binary input string. FIGS. 4(a)-4(c) illustrates this method. FIG. 4(a) includes primary lookup table 410 and FIG. 4(b) includes a secondary lookup table 420. Primary lookup table 410 is indexed using 2 bits of data. The index value 00b is a reserved value that indicates secondary lookup table 420 should be used. Index values 01b, 10b, and 11b correspond to symbols B, A, and A, respectively. Similarly, index values 01b, 10b, and 11b correspond to codewords of size 2, 1, and 1, respectively. Secondary lookup table 420 is only used if a subset of two leading bits of the binary input string is 00b. Secondary lookup table includes only two entries for index values of 0b, which corresponds to symbol D and codeword size of 3, and index value 1b, which corresponds to symbol C and codeword size of 3.
FIG. 4(c) illustrates the use of primary lookup table 410 and secondary lookup table 420 to decode the 17 bit binary input string 10011110110100011b (421 in FIG. 4(c)). First the subset of two leading bits (10b) of binary input string 421 are compared with 00b. Because 10b is not equal to 00b, 10b is used as the index value with primary lookup table 410, which provides Symbol A for decoded word 431. Primary lookup table 410 also provides that the size of the codeword corresponding to index value 10b is only 1 bit. Thus, the first bit of binary input string 421 is xe2x80x9cconsumedxe2x80x9d and a 16 bit binary input string 422 (0011110110100011b) remains. The subset of two leading bits of binary input string 422 (00b) is compared with the reserved value 00b. Because 00b is a reserved value, a subset of 1 bit (1b) following the subset of two leading bits of the binary input string 422 (shown as binary input string 422a in FIG. 4(c)) is used as an index value to index secondary lookup table 420. For an index value of 1b, Secondary lookup table 420 provides symbol C and a codeword size of 3 bits. Symbol C is added to decoded word 431 to form decoded word 432 (AC). Because the codeword size corresponding to 1b is 3, the first three bits of binary input string 422a are consumed and a 13 bit binary input string 423 (1110110100011xe2x80x3)remains to be decoded. The subset of two leading bits of binary input string 423 (11b) is compared to the reserved value 00b. Because 11b is not a reserved value, 11b is used to index primary lookup table 410, which provides symbol A and a size of 1 bit. Symbol A is added to decoded word 432 to form decoded word 433 (ACA). Because the codeword size corresponding to index value 111b is 1, the first bit of binary input string 323 is consumed and a 12 bit binary input string 424 (1110110100011b)remains to be decoded. This process continues until all bits are consumed and the full decoded word (ACAAABABDAA) is obtained. The combined size of primary lookup table 410 and secondary lookup table 420 is less than the size of lookup table 310. Thus, the memory requirement used in the method using multiple lookup tables is generally less than the method using a single lookup table. However, the method using multiple lookup tables require the use of conditionals (e.g. xe2x80x9cifxe2x80x9d statements) to determine whether the initial index value corresponds to a reserved value. Conditionals can reduce decoding performance in modern processors in a variety of ways. For example, pipelined processors attempt to predict the results of a conditional statement. If the prediction is incorrect, the processor is stalled and the processor""s pipeline must be cleared, which greatly reduces the performance of the processor. Hence, there is a need for a method to minimize the size of lookup tables without using conditional statements that may reduce performance.
Accordingly, decoding an binary input string encoded using variable length codes in accordance with the present invention is accomplished using an offset lookup table and a symbol lookup table, which are generally smaller than a single lookup table, without the use of conditionals that may reduce the performance of a decoder. Specifically, in accordance with one embodiment of the present invention, decoding of a binary input string can be accomplished by reading a first subset of leading bits from the binary input string. Then, an offset index value for the offset lookup table is calculated based on the first subset of leading bits. The offset index value is used to index the offset lookup table so that an offset value is retrieved from the offset lookup table. Then, a symbol index value for the symbol lookup table is calculated from the offset value and a second subset of leading bits from the binary input string. The symbol corresponding to a third subset of leading bits form the binary input string is retrieved from the symbol lookup table.
Some embodiments of the present invention calculate the offset index value by calculating the most significant bit position of a binary state (i.e. 1b or 0b) of the first subset of leading bits. Other embodiments of the present invention calculate the offset index value by calculating the number of leading zeros or leading ones in the first subset of leading bits.
In some embodiments of the present invention, the offset lookup table also provides the size of the second subset of leading bits. For these embodiments the symbol index value is calculated by first converting the second subset of leading bits into an integer value. Then, the integer value of the second subset is added to the offset value to produce the symbol index value.
The present invention will be more fully understood in view of the following description and drawings.