I. Field of the Invention
The present invention relates generally to decoding in communication systems and, more particularly, to add-compare-select (ACS) processing.
II. Description of the Related Art
A major portion of the processing power for third generation wireless communications revolves around trellis-based (“butterfly”) algorithms, such as the log domain maximum a posteriori (log MAP) algorithm or the Viterbi algorithm (VA).
A trellis butterfly calculation defines the interconnectivity between two states in a trellis at a present time and two states in the trellis in a next time period. FIG. 1 shows a slice of a trellis that illustrates a single butterfly. Two input states 10 and 20, at time k, connect to a corresponding pair of states, 30 and 40, at time k+1, via opposing pairs of paths 12, 14 and 22, 24, respectively. Input state 10 corresponds to a path metric for state a at time k, and has two branch metrics 12 and 14. The branch metrics 12, 14 are dependent on parity data, extrinsic information and respective input symbols, 0 and 1. A path metric is a measure of the probability of a particular state based on past received symbols, whilst each branch metric reflects the probability that a current path between two states is correct.
The branch metrics 12, 14 connect the input state 10 to possible states in the trellis at time k+1. Branch metric 12 terminates at next state 30, being the path metric for state m at time k+1. Branch metric 14 terminates at next state 40, being the path metric for state n at time k+1. Similarly, input state 20 corresponds to a path metric for state b at time k and has branch metrics 22 and 24, which are dependent on parity data, extrinsic information and respective input symbols of 0 and 1. Branch metric 22 terminates at next state 30 and branch metric 24 terminates at next state 40, at time k+1. Thus, for any given path metric at time k, there are two possible branch metrics, corresponding to input symbols of 0 and 1, leading to two possible new states at time k+1. Moreover, pairs of input states at time k are connected to corresponding pairs of states at time k+1 by opposing branch metrics, demonstrating the symmetry of the trellis.
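The butterfly connectivity described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `butterfly`, the use of a plain maximum as the selection rule (no correction term), and the numeric values are assumptions and are not taken from FIG. 1.

```python
def butterfly(pm_a, pm_b, bm_12, bm_14, bm_22, bm_24):
    """Return (pm_m, pm_n) at time k+1 for one trellis butterfly.

    pm_a, pm_b   : path metrics of input states a and b at time k
    bm_12, bm_14 : branch metrics leaving state a toward states m and n
    bm_22, bm_24 : branch metrics leaving state b toward states m and n
    Selection is a plain maximum (an assumption made for illustration).
    """
    # Next state m receives one competing path from each input state.
    pm_m = max(pm_a + bm_12, pm_b + bm_22)
    # Next state n receives the opposing pair of branches.
    pm_n = max(pm_a + bm_14, pm_b + bm_24)
    return pm_m, pm_n


# Hypothetical metric values chosen only to exercise the function.
print(butterfly(5, 3, 1, -1, 4, -4))  # (7, 4)
```

Each of the two `max` expressions corresponds to one ACS unit, which is why a butterfly is computed by two interconnected ACS units.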
A core component for implementing such trellis-based algorithms is an add-compare-select (ACS) unit, which approximates trellis state probability calculations in the log domain. The trellis butterfly calculation is performed using two interconnected ACS units, each ACS unit being fed two competing path metrics computed using previous path metrics and current branch metrics. The ACS unit selects the greater of the two competing path metrics as a maximum path metric, which is then normalized and corrected to produce a new path metric. The same technique may be used to select a minimum path metric to produce a new path metric.
As the ACS unit performs a log approximation, hardware implementations of the log MAP algorithm use a lookup table to add a corrective factor, based on the difference of the incoming path metrics, to compensate for the maximum approximation. The operation can be summarized as follows, where $PM_{sx}$ represents the path metric (PM) for state x and $BM_y$ represents the branch metric (BM) for path y (either path 0 or 1) at time index k:
$$x_1 = PM_{s0}[k] + BM_0[k] \quad \text{and} \quad x_2 = PM_{s1}[k] + BM_1[k]$$

$$PM_{sx}[k+1] = \max[x_1, x_2] + f[|x_1 - x_2|]$$
A traditional method of implementing an add-compare-select (ACS) unit for butterfly processing follows directly from the equation specified above. FIG. 2 demonstrates a typical block diagram for a prior art ACS unit. Initially, two competing paths are computed from the previous path metrics and the current branch metrics using an adder circuit. There are many techniques to accelerate the addition process such as carry-look-ahead adders and prefix adders, but the propagation delay still depends on fully propagating the carry to compute the final result.
For a given time period, path metric-0 201 and a corresponding branch metric-0 202 are presented to a first adder 210 to produce a first competing path 211. Path metric-1 203 and corresponding branch metric-1 204 are presented to a second adder 212 to produce a second competing path 213. The competing paths 211 and 213 are presented to each of a multiplexer 214 and a subtracter 216. The two competing paths 211, 213 are subtracted to determine which of the two competing paths is the maximum path metric value. Accordingly, the subtracter 216 produces a most significant bit 217, which represents the sign of the difference between the two competing path metrics 211, 213 and, thus, which of the two competing path metrics is greater. The most significant bit 217 is presented as a select bit for the multiplexer 214, and the greater of the two competing path metrics is output from the multiplexer 214 as the maximum path metric 215. Alternate embodiments utilize the most significant bit 217 to select a minimum path metric. The subtracter 216 also produces the difference 219, which is presented to a lookup table 218.
The lookup table (LUT) 218 uses the difference 219 to produce a corrective factor 223. The LUT approximates the correction factor, which is a function of the absolute value of the difference between the two competing paths: $f[|x_1 - x_2|] = \ln(1 + e^{-|x_1 - x_2|})$.
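The role of the correction factor can be checked numerically. The sketch below relies on the standard identity ln(e^x1 + e^x2) = max(x1, x2) + ln(1 + e^(-|x1 - x2|)) and compares the exact correction with a small table. The table size `ENTRIES`, the quantization step `STEP`, and all function names are hypothetical choices for illustration, not values from the described LUT 218.

```python
import math


def correction(diff):
    """Exact correction term ln(1 + e^(-|diff|))."""
    return math.log1p(math.exp(-abs(diff)))


STEP = 0.5     # hypothetical quantization step of the difference
ENTRIES = 8    # hypothetical number of table entries
LUT = [correction(i * STEP) for i in range(ENTRIES)]


def correction_lut(diff):
    """Table approximation; differences beyond the table map to zero."""
    index = int(abs(diff) / STEP)
    return LUT[index] if index < ENTRIES else 0.0


def max_star(x1, x2, f=correction):
    """Maximum of the two competing paths plus a correction term."""
    return max(x1, x2) + f(x1 - x2)


x1, x2 = 1.5, 0.25
exact = math.log(math.exp(x1) + math.exp(x2))
print(exact, max_star(x1, x2), max_star(x1, x2, correction_lut))
```

With the exact correction the result matches the log of the summed exponentials. The table version differs only by its quantization error, which shrinks as the step is reduced.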
The maximum path metric 215 is presented to a third adder 220, which also receives an external normalization factor 222. The normalization factor 222, which is typically a negative value, and the maximum path metric 215 are added to ensure that the maximum value 215 remains within the dynamic range of the ACS unit. Path metric values tend to grow continuously with recursive ACS processing, and the dynamic range of the path metric variables can grow quite large, even for moderate size blocks. Fortunately, the values of path metrics only have meaning relative to the other states within the same time index, so a normalization term is applied to prevent the path metrics from growing too large. The dynamic range of the path metric values can therefore be quantized to handle only a small block of the trellis, provided that the normalization factor is applied equally to all states to periodically reduce the magnitudes of the path metric values.
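A short sketch shows why applying the same normalization factor to every state is harmless: the differences between states, which are the only meaningful quantities, are unchanged. Using the negated largest metric as the factor is an assumption made here for illustration; the text above does not specify how the factor 222 is derived.

```python
def normalize(path_metrics):
    """Add one common factor to the path metrics of all states."""
    # Hypothetical rule: the factor is the negated largest metric, which is
    # negative whenever the metrics have grown above zero.
    factor = -max(path_metrics)
    return [pm + factor for pm in path_metrics]


metrics = [1012, 1007, 1010, 998]   # made-up metrics at one time index
normalized = normalize(metrics)
print(normalized)  # [0, -5, -2, -14]

# Pairwise differences between states survive normalization unchanged.
assert [m - metrics[0] for m in metrics] == [n - normalized[0] for n in normalized]
```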
The third adder 220 produces a normalized output 221, which is added to the corrective factor 223 using a fourth adder 224. The fourth adder 224 produces an output 226, which is the new path metric for the next time period. The critical calculation pipeline for such an algorithm, i.e., the pipeline path that limits the calculation speed, is formed from at least 4 adders in series, or 3 adders and a look-up table (LUT), depending on the propagation delay of the LUT.
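The data path of FIG. 2 can be modelled step by step, as sketched below. The integer `LUT` contents, the function name `acs`, and the test values are hypothetical. The comments map each statement to the reference numerals used in the description.

```python
# Hypothetical integer correction table indexed by |difference|.
LUT = [3, 2, 1, 1]


def acs(pm0, bm0, pm1, bm1, norm):
    """Model of the prior art add-compare-select data path."""
    path0 = pm0 + bm0                        # first adder 210 -> path 211
    path1 = pm1 + bm1                        # second adder 212 -> path 213
    diff = path0 - path1                     # subtracter 216 -> difference 219
    select = diff < 0                        # sign bit 217
    max_path = path1 if select else path0    # multiplexer 214 -> 215
    magnitude = abs(diff)
    # Lookup table 218; differences beyond the table get no correction.
    corr = LUT[magnitude] if magnitude < len(LUT) else 0
    normalized = max_path + norm             # third adder 220 -> 221
    return normalized + corr                 # fourth adder 224 -> 226


print(acs(10, 2, 9, 1, -8))  # 5
```

Reading the function top to bottom makes the serial dependency visible: the final result cannot be formed until the adder, the subtracter, and then either the table or the normalization adder have all completed.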
Carry-save arithmetic is a known technique in which a result is presented as separate carry and sum components, rather than the more conventional single number resolved output. FIG. 3a shows a known implementation of a 3:2 compressor 300 using a full adder. The 3:2 compressor 300 receives three inputs A-302, B-304 and C-306 and produces a sum 308 and a carry 310. FIG. 3b shows the truth table for the 3:2 compressor 300 of FIG. 3a. It is evident from the truth table 315 that the sum 308 plus twice the carry 310 provides the sum A+B+C.
For example, if one of the three inputs A, B, C is equal to 1, with the other inputs being 0, the carry is 0 and the sum is 1, representing a result of 1. Similarly, if two of the inputs are 1 with the remaining input being 0, the carry is 1 and the sum is 0, yielding a result of 2. Finally, if each of the inputs is 1, the carry is 1 and the sum is 1, representing a result of 3. Thus, the 3:2 compressor 300 is able to represent the values of the three inputs 302, 304 and 306 in the carry-save format using two components 308 and 310.
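The truth table property can be verified exhaustively with a small sketch. The function name `compress_3_2` is hypothetical, and the logic shown is the usual full adder: XOR for the sum and majority for the carry.

```python
from itertools import product


def compress_3_2(a, b, c):
    """Full adder used as a 3:2 compressor; returns (sum, carry)."""
    s = a ^ b ^ c                          # sum 308
    carry = (a & b) | (a & c) | (b & c)    # carry 310
    return s, carry


# Check every row of the truth table: sum + 2 * carry equals A + B + C.
for a, b, c in product((0, 1), repeat=3):
    s, carry = compress_3_2(a, b, c)
    assert s + 2 * carry == a + b + c

# The same bitwise logic applied to whole non-negative words gives a
# carry-save result without any carry propagating between bit positions.
s, carry = compress_3_2(5, 3, 6)
print(s, carry, s + 2 * carry)  # 0 7 14
```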
FIG. 3c shows a known implementation of a 4:2 compressor 320 using two full adders 316, 318 that have been cascaded. The 4:2 compressor 320 receives inputs 322, 324, 326 and 328, along with a carry-in 329. The 4:2 compressor 320 produces sum and carry outputs 330 and 332, respectively, and a carry-out 327. Three inputs 322, 324 and 326 are presented to the first full adder 316. The first full adder 316 produces a sum 325 and the carry-out 327. The sum 325 and the fourth input 328 are presented to the second full adder 318, along with the carry-in 329. The carry-out 327 is decoupled from the carry chain and is presented as an output of the 4:2 compressor 320. The carry-out 327 may be used as a carry-in for a cascaded 4:2 compressor. The second full adder 318 adds the sum 325 and the fourth input 328, utilizing the carry-in 329, to produce the sum 330 and the carry 332. The sum 330 and carry 332, together with the carry-out 327, represent the sum of the four inputs 322, 324, 326 and 328 and the carry-in 329. Decoupling the carry chain results in the carry-out 327 of the 4:2 compressor 320 being independent of the carry-in 329. Thus, the carry-out 327 is dependent only on the three inputs 322, 324 and 326, resulting in a faster embodiment of a 4:2 compressor 320.
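The cascaded arrangement can be sketched as two calls to a full adder. The function names `full_adder` and `compress_4_2` are hypothetical. The loop checks both properties described above: the outputs account for all inputs, and the carry-out does not depend on the carry-in.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compress_4_2(i1, i2, i3, i4, carry_in):
    """Two cascaded full adders; returns (sum, carry, carry_out)."""
    s1, carry_out = full_adder(i1, i2, i3)     # first full adder 316
    s, carry = full_adder(s1, i4, carry_in)    # second full adder 318
    return s, carry, carry_out


for i1, i2, i3, i4, cin in product((0, 1), repeat=5):
    s, carry, cout = compress_4_2(i1, i2, i3, i4, cin)
    # The carry and the carry-out both carry twice the weight of the sum.
    assert s + 2 * (carry + cout) == i1 + i2 + i3 + i4 + cin
    # Flipping the carry-in never changes the carry-out.
    assert cout == compress_4_2(i1, i2, i3, i4, 1 - cin)[2]
```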
Since the trellis butterfly calculation sits in the tight inner loop of trellis algorithms, it forms the critical path, i.e., the path that limits the calculation speed. Consequently, every effort spent on optimizing the ACS unit translates directly into performance gains in the trellis processing algorithm. For example, the correction factor term in log MAP is an essential component of the ACS unit because it has a significant impact on algorithm performance, and in the case of turbo decoding, the correction factor contributes a 0.3 dB performance gain over the max-log MAP algorithm.