When source symbols are coded using code words that may have different lengths, each source symbol is translated to a unique code word. This kind of coding can be called variable length coding (VLC). The coding may be designed so that more probable symbols are represented with shorter code words and less probable symbols with longer code words. Shorter code words require fewer bits than longer code words when the code words are transmitted. One aim of variable length coding is to reduce the amount of information needed to represent the symbols compared to the situation in which the symbols are encoded as such. In other words, when a set of symbols is translated to code words, the resulting coded representation should contain fewer bits than the source. The set of symbols may include many kinds of information. For example, a set of symbols can be a file consisting of bytes, an information stream such as a video stream or an audio stream, an image, etc.
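As a minimal sketch of the idea, the following Python fragment codes a short message with a small prefix code in which the most frequent symbol has the shortest code word. The four-symbol alphabet, the table `CODE` and the sample message are assumptions made only for illustration and are not taken from any particular codec.

```python
# Hypothetical four-symbol alphabet with a hand-picked prefix code:
# no code word is a prefix of another, so decoding is unambiguous.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def encode(symbols, code=CODE):
    """Concatenate the code words of the given symbols into a bit string."""
    return "".join(code[s] for s in symbols)


def decode(bits, code=CODE):
    """Read the bit string left to right and emit a symbol per code word."""
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:
            decoded.append(inverse[word])
            word = ""
    if word:
        raise ValueError("trailing bits do not form a code word")
    return decoded


message = list("aaaabbcd")        # "a" is the most probable symbol here
bits = encode(message)
fixed_bits = 2 * len(message)     # 2 bits per symbol with a fixed-length code

print(len(bits), "bits with the variable length code")  # 14
print(fixed_bits, "bits with a fixed-length code")      # 16
```

The saving comes entirely from the skew of the symbol frequencies. With a uniform distribution over the four symbols, the same table would on average need more than two bits per symbol.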
The design of variable length code words can depend on the probability statistics of the source that the source symbols represent. To obtain a set of code words for variable length coding, probability statistics can be gathered from representative source material and the code words designed around those statistics. This may work quite well, but in many cases the statistics are not stationary and may vary in time, so a fixed set of code words may not produce good compression. To achieve better compression, the set of variable length code words could be continuously adapted to the locally observed statistics of the source.
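The effect of non-stationary statistics can be illustrated with a small calculation of the average code word length, that is, the sum over all symbols of probability times code word length. The code word lengths and the two probability tables below are hypothetical values chosen for the example: the lengths suit the `training` distribution but not the `drifted` one.

```python
# Hypothetical code word lengths (in bits), designed for the training statistics.
LENGTHS = {"a": 1, "b": 2, "c": 3, "d": 3}


def average_length(probabilities, lengths=LENGTHS):
    """Expected number of bits per symbol for the given symbol probabilities."""
    return sum(p * lengths[symbol] for symbol, p in probabilities.items())


# Statistics the code was designed for, and statistics observed later.
training = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
drifted = {"a": 0.125, "b": 0.125, "c": 0.25, "d": 0.5}

matched = average_length(training)     # 1.75 bits per symbol
mismatched = average_length(drifted)   # 2.625 bits per symbol

print(matched, mismatched)
```

Under these assumed numbers the fixed table beats a 2-bit fixed-length code while the statistics match, but becomes worse than the fixed-length code once the statistics drift.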
One way of performing adaptation is to keep track of symbol frequencies and use them to define the set of variable length code words on the fly as the symbols are coded. This kind of full adaptation is quite a complex operation, especially if the range of source symbols is large. In practical implementations, some form of suboptimal adaptation is performed. For example, the encoder could use a number of predefined sets of variable length code words and select one of them based on an estimate of the local statistics. In another implementation, the coder could gradually adapt the code words of the set so that only a few individual code words are changed at a time, which keeps the complexity per coded code word low.
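One possible form of gradual adaptation, sketched below under stated assumptions, keeps the symbols in a ranking table and, after each coded symbol, swaps that symbol with its neighbour one position closer to the front. Only two table entries change per coded symbol. The class `SwapAdaptiveRanker` and its methods are hypothetical names, and the swap rule is just one simple choice, not necessarily the scheme a given coder uses. The sketch outputs ranks. A real coder would map rank 0 to the shortest code word, rank 1 to the next shortest, and so on.

```python
class SwapAdaptiveRanker:
    """Maps symbols to ranks and promotes each coded symbol by one position."""

    def __init__(self, symbols):
        # order[rank] is the symbol currently assigned to that rank.
        self.order = list(symbols)

    def _promote(self, rank):
        # Swap with the neighbour in front; only two entries change.
        if rank > 0:
            self.order[rank - 1], self.order[rank] = (
                self.order[rank],
                self.order[rank - 1],
            )

    def encode(self, symbol):
        """Return the current rank of the symbol, then adapt the table."""
        rank = self.order.index(symbol)
        self._promote(rank)
        return rank

    def decode(self, rank):
        """Return the symbol at the given rank, then adapt identically."""
        symbol = self.order[rank]
        self._promote(rank)
        return symbol


# The decoder starts from the same table and applies the same updates,
# so no side information about the adaptation has to be transmitted.
encoder = SwapAdaptiveRanker("abcd")
decoder = SwapAdaptiveRanker("abcd")

source = list("ddddd")
ranks = [encoder.encode(s) for s in source]
restored = [decoder.decode(r) for r in ranks]

print(ranks)      # [3, 2, 1, 0, 0]
print(restored)
```

In this run the initially least favoured symbol "d" moves to rank 0 after three occurrences, so later occurrences would receive the shortest code word.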
There are several ways to generate a set of variable length code words. One example is to use the Huffman method or an adaptive version of it. Another is to use so-called universal codes (Exp-Golomb codes, for example) to form the set of variable length code words. The construction of universal code words is regular, so the codes are rather easy to decode. However, optimal encoding may not be achieved in many cases, and the symbols need to be kept ordered according to symbol frequency.
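Both approaches can be sketched in a few lines of Python. The fragment below builds a Huffman code from symbol counts and, for comparison, assigns order-0 Exp-Golomb code words to the same symbols ranked by frequency. The function names and the sample counts are assumptions for illustration. The Exp-Golomb construction shown is the common order-0 form (the binary representation of n + 1, preceded by as many zeros as it has bits minus one).

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(freqs):
    """Return {symbol: bit string} built with the Huffman method."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their code words.
        f0, _, code0 = heapq.heappop(heap)
        f1, _, code1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in code0.items()}
        merged.update({s: "1" + w for s, w in code1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


def exp_golomb(n):
    """Order-0 Exp-Golomb code word for a non-negative integer n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    binary = bin(n + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def exp_golomb_decode(bits):
    """Decode a concatenation of order-0 Exp-Golomb code words."""
    values, i = [], 0
    while i < len(bits):
        zeros = 0
        while i < len(bits) and bits[i] == "0":
            zeros += 1
            i += 1
        if i + zeros + 1 > len(bits):
            raise ValueError("truncated code word")
        values.append(int(bits[i:i + zeros + 1], 2) - 1)
        i += zeros + 1
    return values


freqs = Counter("aaaabbcd")  # hypothetical sample source
huffman = huffman_code(freqs)

# Universal code: symbols must be kept ordered by frequency (rank 0 first).
ranked = [s for s, _ in sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))]
universal = {s: exp_golomb(rank) for rank, s in enumerate(ranked)}

huffman_bits = sum(freqs[s] * len(huffman[s]) for s in freqs)
universal_bits = sum(freqs[s] * len(universal[s]) for s in freqs)
print(huffman_bits, universal_bits)  # 14 18
```

For these sample counts the Huffman code needs fewer bits in total. The Exp-Golomb code words, however, need no stored table of code words and can be decoded by counting leading zeros, which reflects the trade-off described above.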