An encoder is a process which maps an input sequence of symbols into another, coded, sequence of symbols in such a way that another process, called a decoder, is able to reconstruct the input sequence of symbols from the coded sequence of symbols. The encoder and decoder pair together are referred to as a “codec.”
As shorthand, a finite sequence of symbols is often referred to as a string so one can refer to the input string and the coded string. Each symbol of an input string is drawn from an associated, finite, alphabet of input symbols I. Likewise, each symbol of a coded string is drawn from an associated, finite alphabet of code symbols C.
Each alphabet contains a distinguished symbol, called the <end> symbol. Each and every string terminates in the associated <end> symbol and the <end> symbol may only appear at the terminal end of a string. The purpose of the <end> symbols is to bring the codec processes to an orderly halt.
Any method of determining the end of an input or code string can be used to synthesize the effect of a real or virtual <end> symbol. For example, in many applications the length of the input string and/or the coded string is known, and that information may be used in place of a literal <end> symbol.
The encoder mapping may be denoted by Φ so that if u is an input string and v is the corresponding coded string, one can write: v=Φ(u). Likewise, the decoder mapping will be denoted by Ψ and one can write: u=Ψ(v), with the requirement that: u=Ψ(Φ(u)).
There is no requirement for Φ(Ψ(v)) to reconstruct v. A codec (Φ,Ψ) is called a binary codec if the associated alphabets I and C each contain just two symbols in addition to the <end> symbol. If a, b, and <end> are the three symbols in a binary alphabet, the useful function ˜ is defined to be: ˜a=b, ˜b=a, ˜<end>=<end>.
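The round-trip requirement and the ˜ function can be illustrated with a minimal sketch. The toy codec below (an encoder that writes every symbol twice, and a decoder that keeps the first symbol of each pair) is a hypothetical example chosen for brevity, not a codec taken from this description. The names END, complement, encode, and decode are likewise assumptions, and strings are modeled as Python lists terminated by END.

```python
END = "<end>"  # hypothetical sentinel standing in for the <end> symbol


def complement(sym, alphabet=("a", "b")):
    """The ~ function on a binary alphabet: swap a and b, leave <end> fixed."""
    a, b = alphabet
    if sym == END:
        return END
    if sym == a:
        return b
    if sym == b:
        return a
    raise ValueError(f"symbol {sym!r} is not in the alphabet")


def encode(u):
    """Toy encoder (plays the role of Phi): write every ordinary symbol twice."""
    v = []
    for sym in u:
        if sym == END:
            v.append(END)
            break
        v.extend([sym, sym])
    return v


def decode(v):
    """Toy decoder (plays the role of Psi): keep the first symbol of each pair."""
    body = v[:-1]              # drop the trailing <end>
    return body[0::2] + [END]  # every other symbol, then <end>


u = ["a", "b", "b", END]
round_trip = decode(encode(u))     # equals u, as the codec requirement demands

v = ["a", "b", END]                # a code string the toy encoder never produces
re_encoded = encode(decode(v))     # differs from v, which is permitted
```

In this sketch decode(encode(u)) always returns u, while encode(decode(v)) returns ["a", "a", <end>] for the v shown, which illustrates why no requirement is placed on Φ(Ψ(v)).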
Codecs as described so far cannot be practically implemented, because the number of input strings (and the number of code strings) is infinite. Without more structure and restrictions, a codec cannot be implemented in a finite machine at all, much less practically.
A significant subset of codecs can be practically implemented by the well-known finite state transducer. A finite state transducer (FST) is an automaton that sequentially processes a string from its initial symbol to its terminal symbol <end>, writing the symbols of the code string as it sequences. Information is sequentially obtained from the symbols of the input string and eventually represented in the code string.
To bridge the delay between obtaining the information from the input string and representing it in the code string, the FST maintains and updates a state as it sequences. The state is chosen from a finite set of possible states called a state space. The state space contains two distinguished states called <start> and <finish>. The FST initiates its process in the <start> state and completes its process in the <finish> state. The <finish> state should not be reached until the <end> symbol has been read from the input string and an <end> symbol has been appended to the code string.
Because the state space is finite, it is not possible to represent every encoder as an FST. For reasons of practicality, the present description focuses on codecs where both the encoder and decoder can be described and implemented as FSTs. If the encoder Φ can be implemented as an FST, it can be specified by means of an update function φ. At each step, the symbol a at the head of the input string is combined with the current state s1 to produce the next state s2 and a code symbol b. The symbol a is conditionally removed from the beginning of the input string, and the produced code symbol b is conditionally appended to the code string.
The function φ is undefined when the current state is <finish>; on reaching that state the FST terminates sequencing. To summarize: (s2, b)=(φs(s1, a), φb(s1, a))=φ(s1, a). Here, φs(s1, a) is by definition the first component of φ(s1, a) and φb(s1, a) is by definition the second component of φ(s1, a).
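The sequencing loop driven by an update function can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the text does not say how the conditional removal of the input symbol is signaled, so the sketch adds a hypothetical third Boolean component, consume, to the value returned by phi, and uses None to mean that no code symbol is appended. The example transducer (which writes every input symbol twice) and the names phi, run_fst, and SECOND are also assumptions.

```python
START, SECOND, FINISH = "<start>", "second", "<finish>"
END = "<end>"


def phi(state, sym):
    """Hypothetical update function returning (next_state, code_symbol, consume).

    code_symbol is None when nothing is appended to the code string;
    consume tells the driver whether to remove sym from the input.
    """
    if state == START:
        if sym == END:
            return FINISH, END, True   # copy <end> and stop
        return SECOND, sym, False      # write first copy, keep sym at the head
    if state == SECOND:
        return START, sym, True        # write second copy, then remove sym
    raise ValueError("phi is undefined in the <finish> state")


def run_fst(phi, input_string):
    """Sequence the FST from <start> until it reaches <finish>."""
    state, pos, code = START, 0, []
    while state != FINISH:
        state, b, consume = phi(state, input_string[pos])
        if b is not None:   # conditional append to the code string
            code.append(b)
        if consume:         # conditional removal from the input string
            pos += 1
    return code


coded = run_fst(phi, ["a", "b", END])
```

Here the state bridges the delay between reading a symbol and finishing its representation in the code string: the SECOND state records that one copy has been written and one is still owed.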
For many applications, including entropy coding, it is useful to equip the FST with a Markovian probability structure. Given a state s1 and an input symbol a, Prob(a|s1) denotes the probability that a will be the next symbol read when the FST is in state s1. Depending on the application, this probability may be stipulated, may be statically estimated from historical data, or may be dynamically estimated from the recent operation of the FST. In this last case, the information on which the probability estimate is based may be encoded in the state space.
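From this, one can calculate Prob(s2|s1), the probability that the FST will next be in state s2 given that it is currently in state s1. For a binary FST this is calculated by case analysis over the two input symbols a and ˜a as: Prob(s2|s1)=(φs(s1, a)==s2)Prob(a|s1)+(φs(s1, ˜a)==s2)Prob(˜a|s1). Here each comparison (x==y) evaluates to 1 when it holds and to 0 otherwise.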
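The case analysis can be written out directly. The sketch below assumes a hypothetical two-state FST whose state tracks the parity of the number of b symbols read so far, with stipulated probabilities Prob(a|s)=0.75 and Prob(b|s)=0.25 in both states. The names phi_s, prob, and transition_prob, and the states "even" and "odd", are illustrative assumptions. Only the next-state component of the update function is modeled, and the <end> symbol is ignored, as in the formula.

```python
def complement(sym):
    """The ~ function restricted to the two ordinary symbols a and b."""
    return "b" if sym == "a" else "a"


def phi_s(state, sym):
    """Next-state component: a keeps the state, b flips the parity."""
    if sym == "a":
        return state
    return "odd" if state == "even" else "even"


def prob(sym, state):
    """Stipulated Prob(sym | state); here it does not depend on the state."""
    return 0.75 if sym == "a" else 0.25


def transition_prob(s2, s1, a="a"):
    """Prob(s2 | s1) by case analysis over the symbols a and ~a.

    A Python comparison yields True/False, which multiply as 1/0,
    matching the indicator terms in the formula.
    """
    na = complement(a)
    return ((phi_s(s1, a) == s2) * prob(a, s1)
            + (phi_s(s1, na) == s2) * prob(na, s1))


p_stay = transition_prob("even", "even")   # reached only by reading a
p_flip = transition_prob("odd", "even")    # reached only by reading b
```

For any fixed s1 the values over all s2 sum to 1, which is what makes the resulting table of transition probabilities a stochastic matrix.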
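This set of Markovian state transition probabilities can be assembled into a stochastic matrix M where Mij=Prob(sj|si). Because each row of M sums to 1, the asymptotic state probabilities P(s) can be calculated as the elements of the left eigenvector of M (equivalently, the right eigenvector of the transpose of M) corresponding to the largest eigenvalue 1, normalized so that the elements sum to 1.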
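As a minimal numerical sketch, the asymptotic probabilities can be approximated by power iteration: starting from a uniform row vector p and repeatedly replacing p with pM until it settles on a vector satisfying p=pM. Power iteration is a technique substituted here for a full eigenvector computation; it assumes the chain is irreducible and aperiodic so that the iteration converges. The function name stationary_distribution and the 2x2 matrix are hypothetical.

```python
def stationary_distribution(M, iterations=1000):
    """Approximate the row vector p with p = p M and sum(p) = 1.

    M is a list of rows with M[i][j] = Prob(s_j | s_i), so each row sums to 1.
    """
    n = len(M)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # one step of p <- p M (row vector times matrix)
        p = [sum(p[i] * M[i][j] for i in range(n)) for j in range(n)]
    total = sum(p)     # renormalize to guard against rounding drift
    return [x / total for x in p]


# Hypothetical two-state transition matrix, rows indexed by the current state.
M = [[0.9, 0.1],
     [0.5, 0.5]]
P = stationary_distribution(M)
```

For this matrix the balance condition 0.1·P(s1)=0.5·P(s2) gives P(s1)=5/6 and P(s2)=1/6, which the iteration reproduces to within rounding error.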