Reconfigurable arithmetic logic unit (ALU) arrays often use ALUs, registers, and memory interconnected by crossbar switches so that the array can be reconfigured for different applications. This approach requires a substantial amount of bus wiring. Another problem with these arrays is that they require significant time to reconfigure all of the crossbar switches when changing from one hardware configuration to another.
One of the most demanding processing tasks is the Viterbi decoding algorithm, an efficient implementation of a maximum likelihood sequence estimator for convolutional codes. In Viterbi decoders, the received convolutionally coded symbols, which may be corrupted by noise, are compared with all possible expected symbols using a specific metric (Hamming distance or Euclidean distance). The possible expected symbols depend upon the data to be decoded and the initial state of the convolutional encoder. The Viterbi decoder attempts to find the most probable sequence of “states” and thus the most probable input to the encoder. If K is the constraint length of the convolutional code, then the Viterbi decoder algorithm has 2^(K−1) states for each symbol. Two or more paths emanate from each state at time n to two or more states at time n+1. For each of those paths, a value has to be added to a metric accumulator. The value is a function of the received symbol and the expected symbol along the path. At each state at time n+1, two or more paths merge. The Viterbi decoder algorithm has to select the one with the higher metric, and it has to record this decision. The Viterbi decoder has to make such a decision for each received symbol for each of the 2^(K−1) states.
Often the implementation for these applications is dedicated hardware in the form of application specific integrated circuits (ASICs). These are not only expensive and area consuming, they are also able to process only one specific convolutional code.
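The per-state add, compare, and select operation described above can be illustrated with a short software sketch. This is only a minimal illustration under stated assumptions, not the hardware discussed here: it assumes a rate 1/2 code with K = 3 and the generator polynomials 111 and 101 (chosen for illustration, not taken from this text), two branches per state, an encoder that starts in state 0, and a branch value equal to the number of received bits that agree with the expected bits, so that the higher metric is the better one. All function names are hypothetical.

```python
POLYS = (0b111, 0b101)  # assumed generator polynomials, for illustration only


def num_states(K):
    """Number of trellis states for constraint length K: 2^(K-1)."""
    return 1 << (K - 1)


def expected_symbol(state, bit, K, polys):
    """Expected encoder output on the path leaving `state` with input `bit`."""
    reg = (bit << (K - 1)) | state  # newest input bit sits in the top position
    return tuple(bin(reg & p).count("1") & 1 for p in polys)


def encode(bits, K, polys):
    """Convolutional encoder starting in state 0 (used here to build test data)."""
    state, out = 0, []
    for b in bits:
        out.append(expected_symbol(state, b, K, polys))
        state = ((b << (K - 1)) | state) >> 1
    return out


def acs_step(metrics, received, K, polys):
    """One add-compare-select pass over all states for one received symbol."""
    n = num_states(K)
    new_metrics = [None] * n   # None marks a state not yet reachable
    decisions = [None] * n     # recorded decision: (previous state, input bit)
    for state in range(n):
        if metrics[state] is None:
            continue
        for bit in (0, 1):     # two paths emanate from each state
            nxt = ((bit << (K - 1)) | state) >> 1
            expected = expected_symbol(state, bit, K, polys)
            # Add: value is a function of received and expected symbol.
            cand = metrics[state] + sum(e == r for e, r in zip(expected, received))
            # Compare and select: keep the merging path with the higher metric.
            if new_metrics[nxt] is None or cand > new_metrics[nxt]:
                new_metrics[nxt] = cand
                decisions[nxt] = (state, bit)
    return new_metrics, decisions


def decode(symbols, K, polys):
    """Run the ACS pass per symbol, then trace the recorded decisions back."""
    metrics = [0] + [None] * (num_states(K) - 1)
    history = []
    for received in symbols:
        metrics, decisions = acs_step(metrics, received, K, polys)
        history.append(decisions)
    reachable = [s for s in range(len(metrics)) if metrics[s] is not None]
    state = max(reachable, key=lambda s: metrics[s])
    bits = []
    for decisions in reversed(history):
        state, bit = decisions[state]
        bits.append(bit)
    return bits[::-1]
```

In this sketch the inner loop body corresponds to the work of one processing cell, and it runs once per state per received symbol, which is why the amount of work grows with the number of states and with the number of branches per state.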
If the hardware is designed for a code with a two parallel transition per state algorithm, it cannot implement a cell with four parallel transitions (also sometimes referred to hereinafter as “branches”) per state. If it is designed for a two branch, thirty-two state Viterbi, it cannot operate on a two branch, sixty-four state Viterbi. Any versatility therefore comes at the cost of an additional ASIC for each different Viterbi algorithm. This approach also requires significant bus wiring. For example, a two branch, sixty-four state Viterbi will need sixty-four processing cells and 64×2 buses, with each bus being 8, 16, 32, . . . lines wide. Thus, a thirty-two line implementation requires a bus line capacity of 2×32×64, or 64 times the number of cells. If a four branch Viterbi is implemented, the number of lines needed is 4×32×64, or one hundred and twenty-eight times the number of cells. If a two branch, one hundred and twenty-eight state Viterbi were to be implemented, not only would the number of bus lines increase, but the number of cells would also have to be increased to one hundred and twenty-eight. This increases the power consumption as well. See the Elixent reconfigurable ALU array (RAA) at www.elixent.com. See also the XPP architecture at www.PACTCORP.com.
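The bus wiring figures above follow from simple multiplication, which the following sketch reproduces. It assumes, as in the example above, one processing cell per state and one bus per branch per state; the function names are hypothetical and serve only to check the arithmetic.

```python
def bus_lines(states, branches, bus_width):
    """Total bus lines: one bus per branch per state, each bus_width lines wide."""
    return branches * bus_width * states


def lines_per_cell(states, branches, bus_width):
    """Bus lines per processing cell, assuming one cell per state."""
    return bus_lines(states, branches, bus_width) // states


# Two branch, sixty-four state, thirty-two line buses: 2 x 32 x 64 lines.
two_branch_64 = bus_lines(states=64, branches=2, bus_width=32)

# Four branch, sixty-four state, thirty-two line buses: 4 x 32 x 64 lines.
four_branch_64 = bus_lines(states=64, branches=4, bus_width=32)

# Two branch, one hundred and twenty-eight state: more lines and more cells.
two_branch_128 = bus_lines(states=128, branches=2, bus_width=32)
```

Under these assumptions the two branch, sixty-four state case gives 4096 lines (64 per cell), and both the four branch, sixty-four state case and the two branch, one hundred and twenty-eight state case give 8192 lines, with the latter also doubling the number of cells.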