This invention relates to a method of carrying out the Viterbi algorithm for decoding convolutionally coded data, and a circuit for practicing this method.
Convolutional codes are employed in mobile communication systems ranging from cellular telephone networks to earth satellite systems. Briefly, a convolutional coder generates output bits from the k most recent input bits, where k is an integer referred to as the constraint length. The most recent k-1 of these bits identify what is called a state, with input of each new bit causing a transition to a new state. At any given time there are 2^(k-1) possible states.
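The coder described above can be illustrated with a minimal sketch. The constraint length k = 3, the two generator polynomials, and the name `conv_encode` are assumptions chosen for illustration only; the text does not specify any particular code.

```python
def conv_encode(bits, k=3, generators=(0b111, 0b101)):
    """Encode a list of 0/1 input bits.

    k and generators are illustrative assumptions, not values from the text.
    Each generator selects which of the k most recent input bits are
    XOR-ed together to form one output bit.
    """
    state = 0                      # the k-1 most recent input bits, newest in the LSB
    state_mask = (1 << (k - 1)) - 1
    out = []
    for b in bits:
        window = (state << 1) | b  # the k most recent input bits
        for g in generators:
            out.append(bin(window & g).count("1") & 1)  # parity of selected bits
        state = window & state_mask  # drop the oldest bit: transition to a new state
    return out


def number_of_states(k):
    """There are 2^(k-1) possible states for constraint length k."""
    return 1 << (k - 1)
```

With these assumed generators, each input bit produces two output bits, and the state is simply the last k-1 input bits.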
The Viterbi decoding algorithm is a maximum-likelihood method that receives an encoded data sequence, then decides what inputs to the convolutional coder were most likely to have produced that sequence. More specifically, for each of the 2^(k-1) possible states at each time in the received sequence, one bit of information, called a comparison result bit, is stored to indicate which of two possible preceding states is more likely. This information can be used to trace a path from a given state to the most likely preceding state, then the most likely state preceding that state, and so on back in time. Each such path has an associated path metric value, which indicates how well the path matches the signal actually received. When an encoded sequence of a certain length has been received, the path metric values of paths leading back from all 2^(k-1) final states are compared, the most likely path is selected, and this path is retraced to obtain the decoded data.
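The procedure described above can be sketched in software as follows. This is an illustrative sketch under stated assumptions, not the circuit of the invention: the constraint length `K`, the generator polynomials, the use of Hamming distance as the path metric (smaller meaning a better match), the starting state of zero, and all function names are hypothetical choices.

```python
K = 3                          # constraint length (assumed)
GENERATORS = (0b111, 0b101)    # generator polynomials (assumed)
N_STATES = 1 << (K - 1)


def branch_output(prev_state, bit):
    """Coder output for a transition out of prev_state on input bit."""
    window = (prev_state << 1) | bit
    return [bin(window & g).count("1") & 1 for g in GENERATORS]


def encode(bits):
    state, out = 0, []
    for b in bits:
        out += branch_output(state, b)
        state = ((state << 1) | b) & (N_STATES - 1)
    return out


def viterbi_decode(received):
    n = len(GENERATORS)
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)   # coder assumed to start in state 0
    decisions = []                           # decisions[t][s]: comparison result bit
    for t in range(0, len(received), n):
        symbol = received[t:t + n]
        new_metrics, row = [], []
        for s in range(N_STATES):
            bit = s & 1                      # input bit that leads into state s
            candidates = []
            for c in (0, 1):                 # c: oldest bit of the preceding state
                prev = (s >> 1) | (c << (K - 2))
                dist = sum(x != y for x, y in zip(branch_output(prev, bit), symbol))
                candidates.append(metrics[prev] + dist)
            choice = 1 if candidates[1] < candidates[0] else 0
            row.append(choice)               # one comparison result bit per state
            new_metrics.append(candidates[choice])
        decisions.append(row)
        metrics = new_metrics
    # Compare the path metrics of all final states and retrace the best path.
    state = min(range(N_STATES), key=metrics.__getitem__)
    decoded = []
    for row in reversed(decisions):
        decoded.append(state & 1)
        state = (state >> 1) | (row[state] << (K - 2))
    decoded.reverse()
    return decoded
```

In this sketch the comparison result bit for a state is the oldest bit of the more likely preceding state, so the back-trace needs only one stored bit per state per time step.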
Viterbi decoding is commonly practiced with a large-scale integrated circuit such as a digital signal processor (DSP) or similar processor, using general-purpose random-access memory (RAM) to store the comparison result bits and path metric values. The processor communicates with the RAM by transferring words of data over a data bus having a certain width. With a sixteen-bit bus, for example, words consisting of sixteen bits each are transferred, one complete word at a time.
In some cases the Viterbi decoding algorithm is practiced with a processor having special hardware for expediting the necessary calculations, such as a comparator for comparing two path metric values, or a shift register for assembling comparison result bits into words to be written into RAM. Data are still transferred to and from the RAM over a conventional data bus as described above, however.
A problem is that the conventional data bus is not well suited for handling bit data. To store one bit in the RAM, for example, it is necessary to read an entire word, modify the appropriate bit, then write the word back at the same address. Thus if comparison result bits are stored one at a time, a great deal of unnecessary data transfer must take place. Even if a shift register is used to assemble complete words, so that they can be written to RAM without the reading and modification steps, extra processing is still required to determine when a complete word has been assembled, and to execute the transfer from the shift register to RAM.
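The two storage approaches described above can be compared with a small model. The list `ram` stands in for word-addressed memory, the sixteen-bit width follows the earlier example, and the function names and the least-significant-bit-first packing order are assumptions made for illustration.

```python
WORD_BITS = 16  # bus width taken from the sixteen-bit example in the text


def store_bit_rmw(ram, bit_index, value):
    """Store one bit by read-modify-write; return the bus transfers used."""
    addr, pos = divmod(bit_index, WORD_BITS)
    word = ram[addr]                               # transfer 1: read the whole word
    word = (word & ~(1 << pos)) | (value << pos)   # modify the one bit of interest
    ram[addr] = word                               # transfer 2: write the word back
    return 2


def store_bits_shift_register(ram, bits):
    """Assemble bits in a shift register and write only complete words.

    Bits left over in an incomplete final word are not written in this sketch.
    """
    shift_reg = count = addr = transfers = 0
    for b in bits:
        shift_reg |= b << count
        count += 1
        if count == WORD_BITS:      # extra processing: is the word complete?
            ram[addr] = shift_reg   # extra processing: register-to-RAM transfer
            transfers += 1
            addr += 1
            shift_reg = count = 0
    return transfers
```

For 32 bits, the read-modify-write approach in this model uses 64 bus transfers, while the shift-register approach uses 2 but performs a completeness check on every bit.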
When individual bits are read from RAM over the data bus the situation is even worse. The desired bit is obtained as the Nth bit of a word of data. To determine whether the bit is a one or a zero, the processor must, for example, logically AND the word with a special mask word having a one in the Nth bit position and zeros in the other positions. Mask words must either be generated as needed, which takes time, or stored in and read from a separate table, which requires extra memory space. Alternatively, the processor can perform an N-bit left or right shift on the word read from RAM, then determine whether the most significant bit or least significant bit is a one or zero. The shift operation takes time, however, and may have to be followed by a further logic operation to mask bits other than the most significant or least significant bit to zero.
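The two ways of extracting the Nth bit described above can be sketched as follows. The mask table, the bit numbering (bit 0 as the least significant bit), and the function names are illustrative assumptions.

```python
WORD_BITS = 16

# Pre-stored mask table: one word per bit position, which costs extra memory.
MASK_TABLE = [1 << n for n in range(WORD_BITS)]


def read_bit_with_mask(word, n):
    """AND the word with a mask having a one only in bit position n."""
    return 1 if word & MASK_TABLE[n] else 0


def read_bit_with_shift(word, n):
    """Shift right by n bits, then mask everything but the least significant bit."""
    return (word >> n) & 1
```

Both functions return the same value; they differ only in whether the cost is paid in table storage or in a shift followed by a further masking operation.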
Some processors have special bit-manipulation instructions that enable the value of an arbitrary bit in a word to be obtained in one step. Even if such instructions are used, however, the processor must still calculate the position of the bit in the word, which requires the execution of other instructions. In short, reading an individual bit from a conventional RAM over a conventional data bus involves a great deal of extra processing and consumes much unnecessary time.
One possible solution to this problem is to store each comparison result bit in a separate word, in the least significant or most significant bit position, for example. This solution wastes considerable memory space, however, because only one bit per word is used.
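The memory cost of this alternative can be estimated with a short calculation. The constraint length of 7 and the sequence length of 100 used below are hypothetical figures, as is the function name.

```python
def words_needed(k, n_steps, word_bits=16, packed=True):
    """Words of RAM needed to hold all comparison result bits.

    k, n_steps and word_bits are illustrative parameters.
    """
    total_bits = (1 << (k - 1)) * n_steps      # one bit per state per time step
    if packed:
        return -(-total_bits // word_bits)     # ceiling division
    return total_bits                          # one word per bit


packed_words = words_needed(7, 100, packed=True)
unpacked_words = words_needed(7, 100, packed=False)
```

Under these assumptions, storing one bit per word needs sixteen times as many sixteen-bit words as packed storage.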
When a path is being traced backwards as described above, starting from a given state at a given time, the processor must first calculate the address of the word storing the comparison result bit for that state and time, and the position of the bit in that word. Then it must read the value of the comparison result bit. From the comparison result bit, the processor must next execute several instructions to calculate the value of the preceding state on the path, then repeat the entire process of address calculation, bit position calculation, bit reading, and state calculation to get the next bit. The address and bit-position calculations are particularly troublesome if the data bus width is not equal to 2^(k-1). To deal with different cases, the calculations may require conditional branching instructions, which can take a particularly long time to execute.
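The per-bit work of a conventional back-trace can be sketched as below. The memory layout (comparison result bits packed consecutively, time step by time step, least significant bit first), the state numbering (newest input bit in the least significant position), and the names `locate` and `trace_back` are all assumptions made for illustration.

```python
WORD_BITS = 16


def locate(t, state, n_states):
    """Return (word address, bit position) of the bit for a given time and state."""
    bit_index = t * n_states + state
    return divmod(bit_index, WORD_BITS)


def trace_back(ram, final_state, n_steps, k):
    """Retrace a path from final_state, returning the decoded input bits."""
    n_states = 1 << (k - 1)
    state = final_state
    bits = []
    for t in range(n_steps - 1, -1, -1):
        addr, pos = locate(t, state, n_states)   # address and bit-position calculation
        c = (ram[addr] >> pos) & 1               # bit reading (shift, then mask)
        bits.append(state & 1)                   # decoded bit for this time step
        state = (state >> 1) | (c << (k - 2))    # calculation of the preceding state
    bits.reverse()
    return bits
```

Every decoded bit costs one address calculation, one bit-position calculation, one word read with bit extraction, and one state calculation, which is the repeated overhead the text describes.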
When encoded data are received continuously, the processor, after receiving a sequence of a certain length, must decode that sequence by executing the above back-tracing operations before it begins receiving the next sequence. The back-trace must therefore be completed quickly, but the many computations that are conventionally required, and the above-described problems associated with accessing bits in a conventional RAM over a conventional data bus, cause serious delays.