1. Field of the Invention
The invention relates to turbo decoding methods and to systems adapted for execution of these turbo decoding methods.
2. Description of the Related Technology
Turbo encoding is characterized in that the signal to be encoded, uk, is encoded by a first encoder ENC1, resulting in a first encoded signal, and an interleaved version of uk is encoded by a second encoder ENC2, resulting in a second encoded signal. The original signal uk and the first and second encoded signals are then transmitted. In the transmitted signal one can thus distinguish a sequence of blocks, being said signal uk and said first and second encoded signals. FIG. 4(a) shows such an encoder set-up. One can state that the complete encoder transforms the original input bits uk into output symbols ck and consists of a concatenation of convolutional component codes, separated by a pseudo random interleaver. It is important to note that in turbo coding reinitialisation of the state of the encoder is essential for the corresponding decoding procedure. Often part of the input sequence to such an encoder is also adapted such that the end state of the encoder is forced to be a particular known state. Because the original signal is also transmitted uncoded, such a coding technique is also denoted a systematic code. Note that a pseudo random interleaver has a mapping function having a substantially irregular pattern.
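The encoder set-up described above can be illustrated with a minimal sketch. The 2-state accumulator used as component code, the function names and the example permutation are assumptions made for illustration only; the text does not specify the component codes or the interleaver, and forcing a known end state is not shown.

```python
from typing import List, Sequence, Tuple


def rsc_encode(bits: Sequence[int]) -> List[int]:
    """Parity bits of a toy 2-state recursive convolutional encoder (assumed code).

    The encoder state is re-initialised to 0 at the start of every block.
    """
    state = 0
    parity = []
    for u in bits:
        state ^= u            # next state = state XOR input bit
        parity.append(state)  # the parity output equals the new state
    return parity


def turbo_encode(u: Sequence[int], perm: Sequence[int]) -> Tuple[List[int], List[int], List[int]]:
    """Return (systematic bits, first encoded signal, second encoded signal).

    perm is a hypothetical interleaver: position i of the interleaved
    sequence holds u[perm[i]].
    """
    if sorted(perm) != list(range(len(u))):
        raise ValueError("perm must be a permutation of 0..len(u)-1")
    interleaved = [u[p] for p in perm]
    return list(u), rsc_encode(u), rsc_encode(interleaved)
```

For example, `turbo_encode([1, 0, 1, 1], [2, 0, 3, 1])` yields the uncoded block followed by the two parity blocks, which corresponds to the three blocks distinguished in the transmitted signal.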
Turbo decoding is characterized in that the received transmitted code is decoded by a first decoder, resulting in a first decoded signal. Turbo decoding does not stop however after determining said first decoded signal. Instead a second decoder performs another decoding step, resulting in a second decoded signal, exploiting the received transmitted code and an interleaved version of the first decoded signal. Afterwards said first decoder performs a further decoding step, exploiting said received transmitted code and an interleaved version of said second decoded signal. This procedure is repeated iteratively and then the final decoded signal is determined. Said first and second decoded signals can be denoted intermediate data elements, as these are not the final decoded signal. Turbo decoding can be described as iteratively performing the decoding methods embedded in the decoders. The decoding method of a first decoder exploits in an iteration data generated by another decoder in the previous iteration. FIG. 4(b) shows such a turbo decoding set-up. Said intermediate data elements are also denoted extrinsic information or a posteriori information. The data element determined in a previous iteration is used as intrinsic information in a next iteration. It is expected that performing said iteration results in intermediate data elements being approximations of a maximum likelihood estimate of the original signal uk. Because said first and second decoder exploit signals from each other, only one of said decoders is active at a time, meaning that half of the hardware required in a turbo decoding architecture is idle while performing turbo decoding. A known approach for solving this is pipelining, meaning that a sequence of hardware blocks is used for turbo decoding. A first block performs first decoding, a second block performs second decoding, a third block performs again said first decoding and so on. While said third block performs its first decoding step said first block can already start executing its first decoding step on a new sequence. Naturally such an approach requires a lot of hardware.
References to turbo decoding and decoding methods used therein are found in: [D. Garrett, M. Stan, “Low Power Architecture of the Soft-Output Viterbi Algorithm”, Proceedings International Symposium on Low Power Electronics and Design (ISLPED'98), Monterey, Calif., Aug. 10-12, 1998, pp. 262-267]; [O. J. Joeressen, M. Vaupel, H. Meyr, “High-Speed VLSI Architectures for Soft-Output Viterbi Decoding”, Journal of VLSI Signal Processing, 1-12, 1998]; [C. Berrou, A. Glavieux, P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo-codes,” Proc. ICC'93, Geneva, Switzerland, May 1993, pp. 1064-1070]; [S. S. Pietrobon, “Efficient Implementation of Continuous MAP Decoders and a Synchronisation Technique for Turbo Decoders”, Int. Symp. on Inform. Theory and its Applications, Victoria, BC, Canada, Sep. 1996, pp. 586-589].
The decoding methods used by the decoders within said turbo decoding set-up are now described briefly. In particular Maximum A Posteriori approaches are discussed. The log-SISO algorithm is chosen as specific algorithm for the description although the invention is not limited hereto. E.g. also Soft-output Viterbi Algorithms can be used. By operating in the logarithmic domain expensive multiplications are avoided. Instead the E-operation is introduced, which can easily be implemented using table look up or approximated by taking the maximum. The extrinsic information $\lambda_k^{ext}$ is calculated based on $\alpha$ and $\beta$ state metrics as indicated in formula 1, in which $c_1$ and $c_2$ are the output bits for an encoder state transition from $s$ to $s'$ (FIG. 17).

$$\lambda_k^{ext} = \mathop{E}_{x_i = 1,\; s \to s'}\left[\delta_k(s, s')\right] - \mathop{E}_{x_i = 0,\; s \to s'}\left[\delta_k(s, s')\right] \qquad (1)$$

with

$$\delta_k(s, s') = \alpha_k(s) + \beta_k(s') + c_1 \cdot \lambda_k^1 + c_2 \cdot \lambda_k^2$$
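The text does not define the E-operation explicitly. A minimal sketch, assuming it is the usual log-domain addition E(a, b) = log(e^a + e^b), shows the three implementation options mentioned above: exact evaluation, the maximum approximation, and a table look-up of the correction term. The table step and size (`_STEP`, 32 entries) are illustrative assumptions.

```python
import math
from functools import reduce
from typing import Callable, Iterable


def e_exact(a: float, b: float) -> float:
    """Assumed definition: log(exp(a) + exp(b)), written in a numerically stable form."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def e_max(a: float, b: float) -> float:
    """Approximation by taking the maximum (the correction term is dropped)."""
    return max(a, b)


_STEP = 0.25  # hypothetical quantisation step of |a - b|
_TABLE = [math.log1p(math.exp(-i * _STEP)) for i in range(32)]


def e_lut(a: float, b: float) -> float:
    """Table look-up of the correction term log(1 + exp(-|a - b|))."""
    idx = int(abs(a - b) / _STEP)
    correction = _TABLE[idx] if idx < len(_TABLE) else 0.0
    return max(a, b) + correction


def e_over(values: Iterable[float], op: Callable[[float, float], float] = e_exact) -> float:
    """Apply the E-operation over several terms, e.g. all transitions s -> s'."""
    return reduce(op, values)
```

Under this assumption only additions, comparisons and a small table are needed, which is consistent with the statement that multiplications are avoided in the logarithmic domain.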
The log likelihood ratios $\lambda_k^i$ (for i=1 . . . 2) of the channel symbols $y_k^i$ are defined as:

$$\lambda_k^i = \log\left[\frac{P(c_k^i = 1 \mid y_k^i)}{P(c_k^i = 0 \mid y_k^i)}\right]$$
After some iterations the decoded bits $\hat{u}_k$ are calculated as ($\lambda_k^{int}$ is the intrinsic information):

$$\hat{u}_k = \mathrm{sign}\left[\lambda_k^{int} + \lambda_k^{ext} + \lambda_k^1\right]$$
The $\alpha$ and $\beta$ metrics are obtained through formula 3 and 4 based on a forward recursion and a backward recursion respectively. They both start in a known initial state at the beginning (for $\alpha$) or end (for $\beta$) of the block.

$$\alpha_{k+1}(s') = \mathop{E}_{s \to s'}\left[\alpha_k(s) + c_1 \cdot \lambda_k^{int} + c_1 \cdot \lambda_k^1 + c_2 \cdot \lambda_k^2\right] \qquad (3)$$

$$\beta_{k-1}(s) = \mathop{E}_{s \to s'}\left[\beta_k(s') + c_1 \cdot \lambda_k^{int} + c_1 \cdot \lambda_k^1 + c_2 \cdot \lambda_k^2\right] \qquad (4)$$
In general in these MAP algorithms a compute step and a determining step can be distinguished. Said compute step is characterized by the computation of two vector sequences or state metrics. Said vector sequences are computed via recursions. A forward recursion for determining said first state metrics and a backward recursion for determining said second state metrics are distinguished. Said state metric determination exploits the encoded signal (via $\lambda_k^1$, $\lambda_k^2$) and intermediate data elements $\lambda^{int}$, produced by another decoder. Said decoded signal uk is determined by combining said encoded signal (via $\lambda_k^1$), said first state metrics and said second state metrics (via $\lambda^{ext}$). Note that the coding process can be seen as a transition in a finite state machine, wherein the register content of the convolutional coder denotes the state of the encoder, which completely determines the behaviour of the coder for the next input bit. One often represents this with a trellis graph, showing state transitions. The state metrics exploited in the decoding process refer in principle back to these encoder states.
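The compute step and the determining step can be illustrated with a minimal sketch. It assumes a toy 2-state recursive code (the `TRELLIS` table), a known end state, an exact E-operation of the form log(e^a + e^b), and a time indexing in which `alpha[k]` and `beta[k + 1]` bracket transition k; none of these choices is taken from the text, and `NEG` merely stands in for log(0). The branch terms follow the formulas for the extrinsic information and for the two recursions as written above.

```python
import math
from typing import List, Sequence

NEG = -1.0e9  # stand-in for log(0); assumption to keep the arithmetic finite

# Hypothetical toy trellis: (state s, next state s', input x, output c1, output c2)
TRELLIS = [(0, 0, 0, 0, 0), (0, 1, 1, 1, 1), (1, 1, 0, 0, 1), (1, 0, 1, 1, 0)]
N_STATES = 2


def e_op(a: float, b: float) -> float:
    """Assumed E-operation: log(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def siso(lam1: Sequence[float], lam2: Sequence[float],
         lam_int: Sequence[float], end_state: int) -> List[float]:
    """Return the extrinsic information for every time step k."""
    n = len(lam1)
    alpha = [[NEG] * N_STATES for _ in range(n + 1)]
    beta = [[NEG] * N_STATES for _ in range(n + 1)]
    alpha[0][0] = 0.0          # known initial state at the beginning of the block
    beta[n][end_state] = 0.0   # known state at the end of the block

    def branch(k: int, c1: int, c2: int) -> float:
        return c1 * lam_int[k] + c1 * lam1[k] + c2 * lam2[k]

    # Compute step: forward recursion, then backward recursion.
    for k in range(n):
        for s, s2, _x, c1, c2 in TRELLIS:
            alpha[k + 1][s2] = e_op(alpha[k + 1][s2], alpha[k][s] + branch(k, c1, c2))
    for k in range(n - 1, -1, -1):
        for s, s2, _x, c1, c2 in TRELLIS:
            beta[k][s] = e_op(beta[k][s], beta[k + 1][s2] + branch(k, c1, c2))

    # Determining step: combine both state metrics with the channel values.
    ext = []
    for k in range(n):
        acc = [NEG, NEG]  # E over transitions with x = 0 and with x = 1
        for s, s2, x, c1, c2 in TRELLIS:
            delta = alpha[k][s] + beta[k + 1][s2] + c1 * lam1[k] + c2 * lam2[k]
            acc[x] = e_op(acc[x], delta)
        ext.append(acc[1] - acc[0])
    return ext


def decide(lam1: Sequence[float], lam_int: Sequence[float],
           lam_ext: Sequence[float]) -> List[int]:
    """Hard decision: bit 1 when the sum of the three terms is positive."""
    return [1 if (li + le + l1) > 0 else 0 for l1, li, le in zip(lam1, lam_int, lam_ext)]
```

Note that this sketch stores all `alpha` and `beta` vectors for the whole block, which is exactly the storage cost that windowing and partial storage aim to reduce.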
The $\alpha$ metrics need to be stored however since the first $\lambda^{ext}$ can only be calculated once the entire forward recursion is finished. This results in a storage of N metrics for all the states, which is unacceptable for most practical interleaver sizes N. A solution to the storage requirement problem of the normal SISO algorithm presented above is the introduction of sliding windows [S. S. Pietrobon, “Efficient Implementation of Continuous MAP Decoders and a Synchronisation Technique for Turbo Decoders”, Int. Symp. on Inform. Theory and its Applications, Victoria, BC, Canada, Sep. 1996, pp. 586-589]. The $\beta$ state metrics are not initialized at the end of the block, but at some point k (see FIG. 19). After the backward recursion over window size L time steps the metrics provide an accurate approximation at time k-L. The next metrics $\lambda_{k-L}$ through $\lambda_{k-2L}$ are calculated and used to produce the extrinsic values. The window is then shifted by a value L. This algorithm requires the storage of only L $\alpha$ metrics.
The use of overlapping windows, also denoted sliding windows, comprises the computation of one of said state metrics, with its corresponding recursion being validly initialized, while the other state metric is then determined a plurality of times, but each time only part of said state metrics is determined and the recursions used therefor are not validly initialized. Recursions wherein only part of said state metrics is determined and which are not validly initialized are further denoted restricted recursions. The overlapping window approach can then be described as a method wherein one of said state metrics is determined completely with a validly initialized recursion while the other state metric is determined a plurality of times with restricted recursions, each determining only part of these state metrics. It should be emphasized that although so-called invalid initializations are used, the turbo decoding schemes show the property that after some recursion steps, said computed state metrics converge towards the state metrics expected when valid initializations were exploited. In the so-called overlapping window approaches described above, one of said state metrics (either one) is determined completely and with a validly initialized recursion.
The sliding windows approach cures the memory requirement problems only partially. An important problem with turbo decoding, either in a standard way or via overlapping windows, is the long latency, due to the intrinsic iterations in combination with the long recursions.
In the invention, aspects related to the overall turbo decoding approach and aspects related to the particular decoding approach used within such a turbo decoding approach can be distinguished. It should be emphasized that, although the decoding approach aspects of the invention are situated within the overall turbo decoding approach in the description, the contribution of each of these aspects should be recognized in general.
Turbo decoding schemes can be characterized as methods for determining a decoded signal from an encoded signal, being encoded by a turbo encoding scheme. In such a turbo decoding scheme, besides a step of inputting or entering said encoded signal, a compute step and a determining step can be distinguished. Said determining step can be performed simultaneously with, partly overlapping with, or after said compute step. In turbo decoding schemes said compute step is characterized by the computation of two vector sequences. One vector of such a sequence is denoted a state metric. Therefore in said compute step state metrics are determined. With first state metrics is meant a first vector sequence. With second state metrics is meant a second vector sequence. Said vector sequences are computed via recursions. With recursion is meant that the following vector in such a vector sequence is determined by at least the previous vector in said sequence. In turbo decoding schemes a forward recursion for determining said first state metrics and a backward recursion for determining said second state metrics are distinguished. The terms forward and backward refer to the order in which said encoded signal is inputted. Turbo decoding schemes are characterized by the fact that said decoded signal is determined by combining said encoded signal, said first state metrics and said second state metrics.
In a first aspect of the invention particular ways of storing said vector sequences or state metrics in memories are presented. Indeed when one wants to determine said decoded signal from said encoded signal and said state metrics, and when said state metrics are not produced or computed at the same time as these state metrics are needed for consumption, storage of already computed state metrics is a possible way to go. As said vector sequences, exploited in turbo decoding schemes, are typically long, large memories, which consume much power and have long access times, are then needed. As low power implementation and low latency of turbo decoding schemes are aimed at in the invention, an alternative approach is presented. After inputting said encoded signal, said first state metrics are determined by a forward recursion. Said forward recursion is properly initialized. Said forward recursion exploits said inputted encoded signal. However not all said computed first state metrics or vectors are stored in a memory. Note that all said first state metrics should be computed one after another due to the forward recursion approach. In the invented approach however only part of said necessarily computed first state metrics is stored in a memory, denoted a first memory. With storing part of said computed first state metrics is meant that the amount of stored values is less than the total size or length of said vector sequence or first state metrics. In practice it is meant that an amount of stored values being substantially less than the total length is stored. After computing said first state metrics and storing part of them, said second state metrics are computed with a backward recursion. When a particular state metric of said backward determined state metrics becomes available, it can be almost directly exploited for determining said decoded signal from said encoded signal, said second state metrics and said computed first state metrics. Said second state metrics thus do not need a large buffer memory as their consumption is scheduled near in time to their production. Said invented approach can be characterized in that only a part of said computed first state metrics is stored in a first memory, more in particular, in a memory being substantially smaller than the size or length of said first state metric sequence.
In an embodiment of this first aspect of the invention said decoded signal is determined by exploiting a calculation step, wherein said decoded signal is determined directly from said encoded signal, said second state metrics and said computed first state metrics, being stored in said first memory.
In another embodiment of this first aspect of the invention said decoded signal is determined from said encoded signal, said second state metrics and the needed first state metrics, themselves being determined or recomputed. With said recomputation is not meant that said first state metrics are determined all over again, starting from the initialization. With recomputation is meant that first state metrics, being computed before but not stored, are recomputed from first state metrics, being computed and stored. More in particular, first state metrics that are not stored and lie in between stored first state metrics are determined from the stored first state metrics bounding the sequence of not-stored ones. One can then state that in said decoded signal determining step recomputed first state metrics are explicitly used.
In a further embodiment of this first aspect of the invention wherein said non-stored first state metrics are recomputed, one does not necessarily consume said recomputed values directly when producing or recomputing them. Further, it is not necessary that said non-stored first state metrics are recomputed several times when needed. Indeed such unnecessary recomputation or direct production-consumption restriction can be circumvented by at least partly storing said recomputed values in a second memory. In practice said recomputed values will be stored only temporarily in said second memory. The size of said second memory will be substantially less than the size of said first state metric sequence. In an embodiment of the invention said second memory size will be equal to or even less than the amount of non-stored first state metrics in between said stored ones. Said second memory can thus contain at most said intermediate first state metrics. The size constraint on said second memory results in overwriting said stored recomputed values.
Alternatively one can state that in the invention, instead of storing the $\beta$ metrics, being determined with a backward recursion, for all time steps k, only some are stored and the missing ones are recalculated when they are needed to compute $\lambda^{out}$. When only 1/$\theta$ of the backward state metrics is stored, this means that only $\{\beta_i(S_0), \beta_i(S_1), \ldots, \beta_i(S_n)\}$, $\{\beta_{i+\theta}(S_0), \beta_{i+\theta}(S_1), \ldots, \beta_{i+\theta}(S_n)\}$, $\{\beta_{i+2\theta}(S_0), \beta_{i+2\theta}(S_1), \ldots, \beta_{i+2\theta}(S_n)\}$, . . . are stored in memory. The parameter $\theta$ is determined by simulations, taking into account the architecture on which the algorithm should be implemented, the appropriate power and area models and the cost criterion, being area, energy consumption and latency. Note that alternatively the same approach can be used for the state metric being determined with a forward recursion.
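The partial storage and recomputation idea can be sketched as follows for a state metric determined with a forward recursion; the backward case is symmetric. The recursion is abstracted as a caller-supplied `step` function, and the function names, the dictionary used as first memory and the list used as second memory are assumptions for illustration only.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Metric = List[float]
Step = Callable[[Metric, int], Metric]  # maps the metric at time k to the metric at time k + 1


def forward_with_checkpoints(init: Metric, step: Step, n: int, theta: int) -> Dict[int, Metric]:
    """Run the complete forward recursion but keep only every theta-th metric (first memory)."""
    stored: Dict[int, Metric] = {}
    metric = init
    for k in range(n + 1):
        if k % theta == 0:
            stored[k] = metric
        if k < n:
            metric = step(metric, k)
    return stored


def metrics_in_reverse(stored: Dict[int, Metric], step: Step,
                       n: int, theta: int) -> Iterator[Tuple[int, Metric]]:
    """Yield (k, metric) for k = n down to 0, as a backward recursion would consume them.

    Non-stored metrics are recomputed once from the stored metric that
    precedes them and kept in a small second memory of at most theta entries,
    which is overwritten for every segment.
    """
    k = n
    while k >= 0:
        base = (k // theta) * theta
        segment = [stored[base]]                 # second memory
        for j in range(base, k):
            segment.append(step(segment[-1], j))
        for j in range(k, base - 1, -1):
            yield j, segment[j - base]
        k = base - 1
```

With n = 10 and theta = 4, only the metrics at time steps 0, 4 and 8 are kept in the first memory, and each missing metric is recomputed exactly once.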
In a second aspect of the invention, particular ways of executing said state metric recursions and said decoded signal determination step are presented, aimed at providing methods wherein a trade-off between latency and memory occupation can be made. In a traditional execution of a turbo decoding scheme one computes via a forward recursion said first state metrics, being validly or correctly initialized, and then one computes said second state metrics, being validly initialized, with a backward recursion. Simultaneously with or after said second state metric computation, one determines said decoded signal. Executing turbo decoding schemes in such a way results in a long latency and huge memories for storage of said state metrics and said inputted encoded signal. Another approach, denoted also as the use of overlapping windows, comprises the computation of one of said state metrics, with its corresponding recursion being validly initialized, while the other state metric is then determined a plurality of times, but each time only part of said state metrics is determined and the recursions used therefor are not validly initialized. Recursions wherein only part of said state metrics is determined and which are not validly initialized are further denoted restricted recursions. The overlapping window approach can then be described as a method wherein one of said state metrics is determined completely with a validly initialized recursion while the other state metric is determined a plurality of times with restricted recursions, each determining only part of these state metrics. It should be emphasized that although so-called invalid initializations are used, the turbo decoding schemes show the property that after some recursion steps, said computed state metrics converge towards the state metrics expected when valid initializations were exploited. In the so-called overlapping window approaches described above, one of said state metrics (either one) is determined completely and with a validly initialized recursion. Such approaches still show a long latency. In the invention a method for turbo decoding is presented wherein for both state metrics (first and second) only part of these state metrics is determined with a validly initialized recursion, while the other parts, thus the ones not determined by a validly initialized recursion, are determined by restricted recursions. More in particular a plurality of said restricted recursions is needed. Part of said restricted recursions are executed at the same time. The invented approach is further denoted a double flow approach.
In an embodiment of this second aspect of the invention, the decoded signal determining step is performed while executing said validly initialized recursions. An example is given now, but it should be clear that the role of said first and said second state metric can be reversed. In such an approach, one starts a first restricted recursion for determining, with a backward recursion, part of said second state metric. After some recursion steps, valid values of said second state metric are obtained. When the second state metrics to be determined by said restricted recursion are all determined, one starts computing said first state metric with said validly initialized forward recursion, and one consumes substantially simultaneously the computed first state metric, the already determined second state metric, if valid, and the encoded signal, in order to determine the decoded signal. After a while no valid second state metrics from said first restricted recursion are available. Therefore a second restricted recursion, having been started already, now provides said valid second state metrics. The first state metric is still provided by said validly initialized forward recursion. The same approach is used for said second state metric. A validly initialized backward recursion is started, supported by restricted recursions of said first state metrics. It should be emphasized that said validly initialized recursions, with their supporting restricted recursions, are essentially dealing with other parts of said state metrics. Said validly initialized recursions stop when they reach the same point in the vector sequence of state metrics, indicating that the full range of state metrics is covered.
In another embodiment of the second aspect of the invention said decoded signal determining step is performed while executing said non-validly initialized recursions. An example is given now, but it should be clear that the role of said first and said second state metric can be reversed. In such an approach one starts a first restricted recursion for determining, with a backward recursion, part of said second state metric. Simultaneously one starts computing said first state metric with said validly initialized forward recursion, but the computed first state metric values are not directly consumed for determining said decoded signal. After some recursion steps valid values of said second state metric are obtained. Then one further determines second state metrics with said restricted recursion and one consumes substantially simultaneously the already computed first state metric, the further determined second state metric and the encoded signal, in order to determine the decoded signal. The first state metrics are also further determined with said forward recursion. After a while the part of the decoded signal that can be determined from said second and first state metrics is found. Therefore a second restricted recursion, having been started already, now provides said valid second state metrics, while the first state metrics were already determined by said continuing forward recursion. The same approach is used for said second state metric. A validly initialized backward recursion is started, supported by restricted recursions of said first state metrics. It should be emphasized that said validly initialized recursions, with their supporting restricted recursions, are essentially dealing with other parts of said state metrics. Said validly initialized recursions stop in this case not when they reach the same point in the vector sequence but soon thereafter. Indeed some part of the decoded signal still has to be determined. In this last phase the validly initialized recursions deliver the state metrics to be consumed for decoded signal determination.
In another embodiment of the second aspect of the invention said decoded signal determining step is performed partly while executing part of said non-validly initialized recursions and partly while executing said validly initialized recursions. This embodiment can be seen as a mixed form of the two embodiments described above. Again two validly initialized recursions, one for each state metric, are executed. Each of said validly initialized recursions is supported by so-called restricted recursions of the complementary state metric. Determining a value of the decoded signal is possible when both corresponding state metrics are available and of course valid. In this embodiment the first restricted recursions are started a substantial time before said validly initialized recursions begin.
In a third aspect of the invention particular methods for turbo decoding, essentially comprising turbo decoding method steps of smaller sizes, are presented. A turbo decoding method determines in principle a decoded signal by combining an encoded signal, first state metrics and second state metrics. Said metrics are computed via a forward or a backward recursion. Said recursions in principle need to be validly initialized. However, because even invalidly initialized recursions converge after some dummy state metric computations towards valid values of state metrics, one can work with invalidly initialized recursions also, as pointed out in the embodiments discussed above. When invalid initializations are used for both recursions, one can split up the turbo decoding algorithm into a plurality of turbo decoding method steps of smaller sizes, meaning that the vector sequences or state metrics in such methods are of smaller length. In principle said state metrics of such methods of smaller size together define the full state metrics, except that also some dummy metrics are computed. It can be stated that the turbo decoding method comprises executing a plurality of compute and determining steps, each of said compute and determining steps comprising computing part of said first state metrics with a forward recursion and part of said second state metrics with a backward recursion, and determining part of said decoded signal by combining part of said encoded signal, part of said first state metric and part of said second state metric. Although said compute and determining steps can be seen as totally separate methods, in the invention particular schedulings of these compute and determining steps are proposed such that these methods support each other by providing valid initializations to each other.
In an embodiment of this third aspect of the invention at least two of these compute and determining steps are executed or scheduled such that these are performed partially overlapping in time.
In another embodiment of this third aspect of the invention at least two of said compute and determining steps are being scheduled such that initialization of one of said recursions for said computing state metrics of one of said compute and determining steps is being based on a computed value of said recursion of said state metrics in the other compute and determining step.
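The splitting into compute and determining steps of smaller size can be sketched as a simple schedule. The sketch assumes fixed-size windows and a fixed number of dummy (warm-up) recursion steps before an invalidly initialized recursion is trusted; the names `Window`, `size` and `warmup` are hypothetical, and the hand-over of computed metrics as initializations between steps is not modelled.

```python
from typing import List, NamedTuple


class Window(NamedTuple):
    start: int         # first time step decoded by this compute and determining step
    end: int           # one past the last time step decoded by it
    alpha_from: int    # time step where its forward recursion starts
    beta_from: int     # time step where its backward recursion starts
    alpha_valid: bool  # True if the forward recursion starts at the block start
    beta_valid: bool   # True if the backward recursion starts at the block end


def split_into_windows(n: int, size: int, warmup: int) -> List[Window]:
    """Split a block of n time steps into smaller steps with dummy metric computations."""
    if n < 0 or size <= 0 or warmup < 0:
        raise ValueError("need n >= 0, size > 0 and warmup >= 0")
    windows = []
    for start in range(0, n, size):
        end = min(start + size, n)
        alpha_from = max(0, start - warmup)
        beta_from = min(n, end + warmup)
        windows.append(Window(start, end, alpha_from, beta_from,
                              alpha_from == 0, beta_from == n))
    return windows


def dummy_steps(windows: List[Window]) -> int:
    """Total number of recursion steps spent outside the windows' own ranges."""
    return sum((w.start - w.alpha_from) + (w.beta_from - w.end) for w in windows)
```

Since the windows do not depend on each other in this sketch, they could be executed partially overlapping in time; the price is the extra recursion steps counted by `dummy_steps`.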
In the fourth aspect of the invention a method for iterative decoding is presented. Note that iterative decoding is typically used in turbo decoding. In principle the presented iterative decoding method can also be exploited in a context different from turbo decoding. Turbo decoding is characterized by performing a first decoding step followed by a second decoding step, and iteratively performing these two decoding steps. After loading or inputting the encoded signal, a first decoding step is performed, which produces or determines a first intermediate data element. The second decoding step exploits an interleaved version of said first intermediate data element in order to produce or determine a second intermediate data element. With interleaved version is meant that essentially the same values are stored in said data element but the ordering within said data element, being a vector, is changed. After this, both said first and said second decoding step are performed iteratively, but never simultaneously, until said decoded signal can be deduced from one of said data elements. Note that this procedure implies that the hardware performing said first decoding step and the hardware performing said second decoding step are not active simultaneously, thus an inefficient use of the overall hardware is obtained. Note that a standard turbo decoding set-up, as depicted in FIG. 4(b), requires two memories as interleavers. In the invention an iterative decoding procedure, which can be used in turbo decoding but is not limited thereto, enabling more efficient use of hardware, is presented. The invented method is based on performing said first and second decoding step on the same hardware, said first and second decoding step exploiting the same single-port memory. Said hardware and said single-port memory are used in a feedback configuration. Note that the determining of the decoded signal by the decoding methods exploited in said iterative procedure can be denoted in this context soft decisions, as in fact only an intermediate data element is determined. It is in the final step of the iterative procedure, when actually determining the best estimate of the decoded signal, that a hard final decision is taken.
In an embodiment of this aspect of the invention one recognizes that by considering said iterative performing of said first decoding step and second decoding step together as a method step and applying said method step only on part of said encoded signal, this method step will be smaller in size, thus needing smaller memories. The turbo decoding procedure can then be done by performing a plurality of such smaller size method steps, each comprising iteratively performing first and second decoding steps, substantially simultaneously. Each of such smaller size method steps then has a memory assigned to it, used for storage of the intermediate data elements produced by it. Said memories are of the single-port type.
In a further embodiment of this aspect one recognizes the use of a flexible address generator. Indeed, as said single-port type memory or storage unit should show interleaver functionality, flexible generation of addresses for storage in and retrieval from said memory is needed.
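How an address generator can give a single memory interleaver functionality may be sketched as follows. The sketch assumes that each decoding step writes every result back to the address it was read from, so that reading in linear order in one step and in permuted order in the next yields interleaving and de-interleaving without a second memory; the function names, the `decode` callback and the list used as memory are illustrative assumptions, and port timing is not modelled.

```python
from typing import Callable, Iterator, List, Optional, Sequence


def addresses(n: int, perm: Optional[Sequence[int]] = None) -> Iterator[int]:
    """Hypothetical address generator: linear order, or interleaved order if perm is given."""
    return iter(range(n)) if perm is None else iter(perm)


def decoding_step(memory: List[float], perm: Optional[Sequence[int]],
                  decode: Callable[[List[float]], List[float]]) -> None:
    """One decoding step in the feedback configuration.

    Intermediate data elements are read in the generated address order,
    passed through decode, and each result overwrites the location it came from.
    """
    order = list(addresses(len(memory), perm))
    inputs = [memory[a] for a in order]
    outputs = decode(inputs)
    for address, value in zip(order, outputs):
        memory[address] = value
```

Calling `decoding_step` alternately with `perm=None` and with the interleaver permutation models the first and second decoding step sharing the same hardware and the same memory.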
In a further embodiment of this aspect of the invention one considers further iterative performing of a plurality of decoding steps instead of said first decoding step and second decoding step only.
In another embodiment of this fourth aspect of the invention one recognizes that by considering said iterative performing of said first decoding step and second decoding step together as a method step and by executing a plurality of such methods substantially simultaneously, one can decode more than one sequence of blocks, related to said encoded signal, at a time, which again increases the efficiency of the hardware used. Naturally such an approach is not limited to two decoding steps but can be extended to a plurality of decoding steps.
In a fifth aspect of the invention essentially digital devices, being capable of executing the turbo decoding methods described above, are presented. Said digital devices are adapted such that they can use said partial storage and recomputation based method and/or implement said so-called double flow methods and/or exploit parallelism and/or comprise the hardware-memory feedback configurations discussed before.
In a sixth aspect of the invention parametric optimization of turbo decoding methods, being particularly adapted for providing sufficient degrees of freedom, is presented.
One embodiment of the invention provides methods for turbo decoding which consume little power, have reduced memory requirements and show enhanced performance with respect to the latency problem. Further embodiments are also shown for execution of such low power consuming, less memory requiring methods with lower latency.