Modern computing applications frequently benefit from the use of lossless data compression, which is a class of data compression wherein the exact original data may be restored from the compressed form of the data without any loss of information. Lossless data compression may be used, for example, to compress databases files, documents, executable files, or other types of files where even minor differences between the original data and the decompressed data may not be tolerated.
One technique for performing lossless data compression is known as Huffman encoding, a process wherein symbols may be encoded into variable length bit strings based on the actual or estimated frequency of occurrence of those symbols in the original data. Each symbol in a Huffman encoding scheme may represent, for example, a single character. The more frequently occurring symbols are assigned to shorter bit strings, while less frequently occurring symbols are assigned to longer bit strings. Huffman encoding uses “prefix-free codes”, where the bit string for any given symbol is never a prefix for the bit string of any other symbol.
In order for a computer system to decode a set of Huffman encoded data, the computer system may construct a tree structure, then use the sequence of bits of the encoded data to traverse the tree structure in order to correlate each of the prefix-free codes in the encoded data with the appropriate symbol.