1. Field of the Invention
The present invention relates generally to an iterative multi-input multi-output (MIMO) receiver, and more specifically, to a MIMO receiver that uses group-wise demapping.
2. Description of Related Art
For multiple-input multiple-output (MIMO) systems, space-time bit-interleaved coded modulation (STBICM) has been recognized as a way to achieve high-rate wireless communications with near-capacity performance. Referring to FIG. 1A, there is shown a functional architecture of an STBICM MIMO system 100 of the prior art. The system includes a transmitter 110 that encodes an information/data bit sequence u from a binary source 111 and transmits the bit sequence from a plurality (two or more) of transmit elements Nt over a wireless channel 120 to a receiver 130. Receiver 130 includes a plurality of receive elements Nr (where Nr may or may not equal Nt) that receive the transmitted information from transmitter 110. Thereafter, receiver 130 recovers/decodes the bit sequence u and transfers the bit sequence to a binary sink 142.
Referring to FIG. 1B, there is shown a functional architecture of a transmitter 110 of the prior art. Transmitter 110 includes an outer encoder 112, a bit interleaver 114, a demultiplexer 116, and a mapper (inner encoder) 118. In operation, the information bit sequence u having a length L is first forwarded to outer encoder 112, where the bits are encoded using an error correcting code of rate R to yield a coded bit sequence c2 of length L/R. Encoder 112 may be, for example, a Turbo encoder.
The coded bit sequence c2 is next forwarded to bit interleaver 114, which bit interleaves c2, thereby resulting in an interleaved bit sequence c1. Thereafter, the interleaved coded bit sequence is forwarded to demultiplexer 116, which splits the bit sequence c1 into Nt parallel and independent bit streams d1 . . . dNt, each of which is assigned, for example, to a unique transmit element from among the Nt transmit elements for transmission. One skilled in the art will recognize that it is not necessary that each bit stream be assigned to a unique transmit element, this simplification being assumed only for ease of description. For example, each transmit element may transmit some linear combination of multiple streams, as is the case when using space-time codes.
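By way of illustration only, the interleaving and demultiplexing steps described above may be sketched as follows. The pseudo-random permutation, the round-robin stream assignment, and the function names are assumptions made for this sketch; the description above does not specify a particular interleaver design or demultiplexing rule.

```python
import random

def make_interleaver(length, seed=0):
    """Return a pseudo-random permutation of range(length) (assumed design)."""
    perm = list(range(length))
    random.Random(seed).shuffle(perm)
    return perm

def interleave(values, perm):
    """Output position k carries input position perm[k]."""
    return [values[p] for p in perm]

def deinterleave(values, perm):
    """Invert interleave(): return position k of the input to position perm[k]."""
    out = [None] * len(values)
    for k, p in enumerate(perm):
        out[p] = values[k]
    return out

def demultiplex(bits, n_t):
    """Round-robin split into n_t parallel streams (one possible assignment)."""
    return [bits[i::n_t] for i in range(n_t)]

c2 = [1, 0, 1, 1, 0, 0, 1, 0]            # coded bits from the outer encoder
perm = make_interleaver(len(c2), seed=1)
c1 = interleave(c2, perm)                 # interleaved sequence
streams = demultiplex(c1, n_t=2)          # d1 and d2
```

The same permutation is reused at the receiver, where deinterleave() and interleave() would operate on LLR values rather than on bits.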
Bit streams d1 . . . dNt are next forwarded to mapper 118. For a given channel use, mapper 118 splits each bit stream into a block of M bits, maps each block to a complex symbol, and then simultaneously transmits each symbol over channel 120. More specifically, for each channel use, the bit streams d1 . . . dNt can be denoted as a bit vector x=[x1, . . . , xNt]^T of size NtM×1 with xi=[xi,1, . . . , xi,M] for i=1 to Nt. Each block of M bits for each stream is mapped onto a symbol si=map(xi) for i=1 to Nt, where the symbols si are chosen from a complex constellation of size 2^M and alphabet A={a1, . . . , a_(2^M−1)}. Thereafter, each transmit element simultaneously transmits a corresponding symbol over channel 120 towards receiver 130. The collection of all Nt simultaneously transmitted symbols can be denoted by the vector s=[s1, . . . , sNt]^T.
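By way of illustration only, the per-channel-use mapping si=map(xi) may be sketched as follows for M=2. The Gray-labelled QPSK alphabet, its unit-energy normalization, the 0/1 bit labels, and the names ALPHABET and map_channel_use are assumptions chosen for this sketch and are not taken from the description above.

```python
import math

M = 2  # bits per symbol, so the constellation has 2**M points

# Hypothetical Gray-labelled QPSK alphabet with unit average energy.
_A = 1.0 / math.sqrt(2.0)
ALPHABET = {
    (0, 0): complex(+_A, +_A),
    (0, 1): complex(+_A, -_A),
    (1, 1): complex(-_A, -_A),
    (1, 0): complex(-_A, +_A),
}

def map_channel_use(streams, t):
    """Return the symbol vector s = [s1, ..., sNt] for channel use t.

    streams[i] is the bit stream assigned to transmit element i; the block
    of M bits starting at index t*M is mapped to one complex symbol.
    """
    s = []
    for d in streams:
        block = tuple(d[t * M:(t + 1) * M])
        s.append(ALPHABET[block])
    return s

d1 = [0, 0, 1, 1]
d2 = [1, 0, 0, 1]
s_first = map_channel_use([d1, d2], t=0)   # symbols for blocks (0,0) and (1,0)
s_second = map_channel_use([d1, d2], t=1)  # symbols for blocks (1,1) and (0,1)
```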
At receiver 130, each of the Nr receive elements receives the symbol stream radiated by each of the Nt transmit elements. During each channel use, the symbol streams received at the receive elements may be denoted as a signal vector y of size Nr×1. As is known in the art, channel 120 may be represented as an Nr×Nt channel matrix H where the ijth element of the matrix represents the channel gain between the jth transmit element and the ith receive element. For ease of description, channel 120 is assumed to be flat (frequency non-selective) with Rician-fading and unity gain for each channel coefficient. Nonetheless, one skilled in the art will recognize that when channel 120 is a frequency selective channel, an effective flat channel may be realized by incorporating an orthogonal frequency division multiplexing (OFDM) modulator and demodulator into transmitter 110 and receiver 130, respectively. One skilled in the art will also recognize that receiver 130 may use standard channel estimation methods to determine channel matrix H. For ease of description, it is assumed that channel matrix H is perfectly known by receiver 130.
Accordingly, vector y at receiver 130 may be given as
$$y = Hs + n \qquad (1)$$
where n represents an additive white noise vector whose elements are complex Gaussian with zero mean and variance σn^2=N0/2 per real dimension. The average symbol energy per stream E{|si|^2} may be denoted by Es. Accordingly, it follows that the average signal-to-noise ratio per receive element is SNR=NtEs/(2σn^2).
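By way of illustration only, the channel model of equation (1) and the stated signal-to-noise ratio may be sketched as follows. The fixed example matrix H, the BPSK-like symbol values, and the function names are assumptions made for this sketch; no fading distribution is simulated.

```python
import random

def matvec(H, s):
    """Multiply an Nr x Nt matrix (list of rows) by an Nt-vector."""
    return [sum(h * x for h, x in zip(row, s)) for row in H]

def receive(H, s, sigma_n, rng):
    """Return y = H s + n, with noise of variance sigma_n**2 per real dimension."""
    clean = matvec(H, s)
    return [v + complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
            for v in clean]

def average_snr(n_t, e_s, sigma_n):
    """SNR per receive element: Nt * Es / (2 * sigma_n**2)."""
    return n_t * e_s / (2.0 * sigma_n ** 2)

# Hypothetical 2x2 example values.
H = [[1 + 0j, 0.5j],
     [0.5 + 0j, 1 + 0j]]
s = [1 + 0j, -1 + 0j]
y = receive(H, s, sigma_n=0.1, rng=random.Random(0))
snr = average_snr(n_t=2, e_s=1.0, sigma_n=0.1)
```

The factor of 2 in the denominator reflects that each complex noise sample has two real dimensions, so its total variance is 2σn^2=N0.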
Referring now to FIG. 1C, there is shown a functional architecture of a receiver 130 of the prior art. Receiver 130 includes a demapper (inner decoder) 132, an outer soft-input soft-output (SISO) decoder 136, a deinterleaver 134, and an interleaver 138. As illustrated, the demapper 132 and decoder 136 are interconnected in a loop and function in an iterative fashion to reconstruct from signal vector y the information bit sequence u transmitted by transmitter 110. Specifically, during the first pass through receiver 130, demapper 132 takes the observation y and knowledge of the channel H and demaps the Nt received complex symbol streams back to the constituent NtM coded bits by determining soft information for each of the coded bits. In particular, demapper 132 computes the a posteriori probability (APP) log-likelihood ratio (LLR) values for the coded bits. The collection of these LLR values for the coded bits is represented by LD1 in FIG. 1C.
Next, the soft information LD1 is forwarded to deinterleaver 134, which deinterleaves the LLR values, thereby resulting in a sequence of LLR values that correspond to the coded bit sequence c2 (here it is assumed that sufficient symbols have been received and demapped to produce a sequence of L/R LLR values). These deinterleaved LLR values become an a priori input LA2 to SISO decoder 136.
SISO decoder 136 further refines the LLR values given its knowledge of the temporal coupling of the bits and produces soft information for the information bit sequence u and the coded bit sequence c2 by computing a posteriori information of the information bits (represented as L′D2 in FIG. 1C) and the coded bits (represented as LD2 in FIG. 1C). As an example, SISO decoder 136 may be implemented using the BCJR or log-MAP algorithm, as described by P. Robertson et al., in “A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain,” Proc. Int. Conf. Communications, June 1995, pp. 1009-1013.
The a posteriori information L′D2 from SISO decoder 136 is forwarded to hard decision module 140, which uses the LLR values to determine the information bit sequence u. In turn, the a priori information LA2 is subtracted (through module 141) from the a posteriori information LD2 to produce new (and hence, extrinsic) information LE2. Note that the removal of the a priori part LA2 minimizes the correlation from previously computed values.
The extrinsic information LE2 is next forwarded to interleaver 138, which interleaves the LLR values, thereby resulting in a sequence of LLR values that correspond to the coded bit sequence c1. These interleaved LLR values become a priori information LA1 to demapper 132 (with the demapper operating on NtM of the LLR values). This cycle of detection, decoding, and feedback constitutes the first iteration through receiver 130. Note that in subsequent iterations, the a priori information LA1 is subtracted (through module 142) from the a posteriori information LD1 from demapper 132 to produce new/extrinsic information LE1, which is subsequently forwarded to deinterleaver 134/SISO decoder 136.
In general, each iteration through receiver 130 improves the reliability of the soft-information produced by demapper 132 and SISO decoder 136. The exchange of soft-information between these modules continues until a desired bit-error-rate (BER) performance is achieved. At this point, a final decision is made by hard decision module 140, which uses the a posteriori information L′D2 to determine information bit sequence u, with the module deciding a “1” if the LLR value L′D2≧0 and a “0” otherwise.
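By way of illustration only, the two elementary operations used in the loop described above, namely the subtraction of a priori information to obtain extrinsic information and the final hard decision, may be sketched as follows. The function names and the example LLR values are hypothetical.

```python
def extrinsic(l_posterior, l_prior):
    """Element-wise L_E = L_D - L_A, removing what the other module already knew."""
    return [d - a for d, a in zip(l_posterior, l_prior)]

def hard_decision(llrs):
    """Decide 1 when the LLR is >= 0 and 0 otherwise."""
    return [1 if value >= 0 else 0 for value in llrs]

l_d2 = [2.5, -1.0, 0.5]     # a posteriori LLRs of coded bits (example values)
l_a2 = [1.0, -0.25, 0.5]    # a priori LLRs supplied to the decoder
l_e2 = extrinsic(l_d2, l_a2)

u_hat = hard_decision([3.2, -0.1, 0.0])
```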
Referring now more specifically to demapper 132, prior systems have implemented this demapper as a joint-stream demapper that computes the LLR values for the NtM coded bits transmitted in a given channel use over all Nt transmitted symbol streams. Specifically, given the observation y, prior systems have defined the LLR value of xn,m, which is the mth (m=1, . . . ,M) bit of the nth (n=1, . . . ,Nt) stream, as
$$L(x_{n,m}) = \ln\frac{P(x_{n,m}=+1 \mid y)}{P(x_{n,m}=-1 \mid y)} \qquad (2)$$
Using standard LLR manipulations and the max-log approximation, these systems have computed the extrinsic LLR value of xn,m as
$$L_{E1}(x_{n,m}) \approx \max_{x \in X_{n,m,+1}} \frac{1}{2}\left\{ -\frac{\lVert y - Hs(x) \rVert^{2}}{\sigma_n^{2}} + x_{[n,m]}^{T} \cdot L_{A1,[n,m]} \right\} - \max_{x \in X_{n,m,-1}} \frac{1}{2}\left\{ -\frac{\lVert y - Hs(x) \rVert^{2}}{\sigma_n^{2}} + x_{[n,m]}^{T} \cdot L_{A1,[n,m]} \right\} \qquad (3)$$
where Xn,m,b denotes the set of bit vectors x whose mth bit value of the nth stream equals b (i.e., +1 or −1), x[n,m] is the subvector of x omitting the element corresponding to the mth bit of the nth stream, and LA1,[n,m] is a vector containing the a priori information corresponding to the entries in x[n,m]. In equation (3), s(x) denotes the mapping from the NtM×1 bit vector x to an Nt×1 symbol vector.
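By way of illustration only, a brute-force evaluation of equation (3) may be sketched as follows. The function name joint_maxlog_demap, the map_bits callback, and the BPSK example (M=1, identity channel) are assumptions made for this sketch; it is intended only for very small NtM because it enumerates every bit vector.

```python
import itertools

def joint_maxlog_demap(y, H, sigma_n2, l_a1, n_t, m, map_bits):
    """Extrinsic max-log LLRs for all n_t*m bits, with bits valued +1/-1.

    l_a1 holds one a priori LLR per bit; map_bits turns a tuple of m bits
    into one complex symbol (a hypothetical mapping supplied by the caller).
    """
    n_bits = n_t * m
    neg_inf = float("-inf")
    best = [[neg_inf, neg_inf] for _ in range(n_bits)]  # [bit = -1, bit = +1]
    for x in itertools.product((+1, -1), repeat=n_bits):
        s = [map_bits(x[i * m:(i + 1) * m]) for i in range(n_t)]
        dist2 = sum(abs(y_i - sum(h * s_j for h, s_j in zip(row, s))) ** 2
                    for y_i, row in zip(y, H))
        prior_all = sum(x_k * l_k for x_k, l_k in zip(x, l_a1))
        for k in range(n_bits):
            # Drop bit k's own a priori term, leaving x_[n,m]^T . L_A1,[n,m].
            metric = 0.5 * (-dist2 / sigma_n2 + prior_all - x[k] * l_a1[k])
            column = 1 if x[k] > 0 else 0
            if metric > best[k][column]:
                best[k][column] = metric
    return [plus - minus for minus, plus in best]

def bpsk(bits):
    """Hypothetical M=1 mapping: +1 -> 1+0j, -1 -> -1+0j."""
    return complex(bits[0], 0.0)

H = [[1, 0], [0, 1]]
y = [1 + 0j, -1 + 0j]
llr_no_prior = joint_maxlog_demap(y, H, 0.5, [0.0, 0.0], 2, 1, bpsk)
llr_with_prior = joint_maxlog_demap(y, H, 0.5, [10.0, 0.0], 2, 1, bpsk)
```

In this example the a priori value placed on the first bit leaves that bit's extrinsic LLR unchanged, because a bit's own a priori term is excluded from its metric.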
Significantly, as can be seen from equation (3), the per-bit LLR values are computed by considering all possible realizations of the Nt simultaneously transmitted symbols. Consequently, the complexity of the computation is exponential in the product of the number of simultaneously transmitted streams Nt and the bits per symbol M. In other words, for each bit position, the LLR computation requires hypothesizing over 2^(MNt) bit vectors. This exponential complexity makes demapper 132 impractical to implement for high spectral efficiency MIMO systems. For example, in a MIMO system transmitting eight parallel symbol streams using a 16-QAM constellation (M=4), computation of the per-bit LLR values requires evaluation of 2^32 (≈4×10^9) possible symbol vectors, which is prohibitive using current silicon technology.
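By way of illustration only, the hypothesis count stated above may be checked numerically as follows. The function name is hypothetical.

```python
def joint_hypotheses(n_t, m):
    """Number of bit vectors of size n_t*m that a full search must consider."""
    return 2 ** (m * n_t)

# 16-QAM carries m = 4 bits per symbol.
for n_t in (2, 4, 8):
    print(n_t, joint_hypotheses(n_t, m=4))
```

For eight streams of 16-QAM the count is 4,294,967,296, whereas two streams require only 256 hypotheses.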
To manage this complexity, others have proposed implementing demapper 132 as an approximate joint-stream demapper using sphere detection (e.g., see Hochwald et al., “Achieving near-capacity on a multiple-element channel,” IEEE Trans. Commun., vol. 51, no. 3, pp. 389-399, March 2003). The sphere detector reduces complexity by limiting the hypothesis testing to candidates within a hyper-sphere of a certain radius about the received signal. Specifically, the number of NtM×1 bit vectors considered is limited to a specified number of points that are within a certain radius R of the received signal vector y in the maximum-likelihood sense. In other words, only those points that are within the radius R are considered in the evaluation of equation (3). Accordingly, the radius R of the hyper-sphere controls the complexity and performance of the sphere detector.
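By way of illustration only, the effect of restricting hypotheses to a radius may be sketched as follows. This sketch is not a sphere detector: it enumerates all bit vectors and then filters them by distance, whereas an actual sphere detector avoids the full enumeration. The function names and the BPSK example are assumptions made for this sketch.

```python
import itertools

def candidates_within_radius(y, H, radius, n_t, m, map_bits):
    """Return (squared distance, bit vector) pairs with ||y - H s(x)|| <= radius.

    Exhaustive filtering is used purely to show which hypotheses would
    remain available for evaluating the max-log metric.
    """
    kept = []
    for x in itertools.product((+1, -1), repeat=n_t * m):
        s = [map_bits(x[i * m:(i + 1) * m]) for i in range(n_t)]
        dist2 = sum(abs(y_i - sum(h * s_j for h, s_j in zip(row, s))) ** 2
                    for y_i, row in zip(y, H))
        if dist2 <= radius ** 2:
            kept.append((dist2, x))
    return sorted(kept)

def bpsk(bits):
    """Hypothetical M=1 mapping: +1 -> 1+0j, -1 -> -1+0j."""
    return complex(bits[0], 0.0)

H = [[1, 0], [0, 1]]
y = [1 + 0j, -1 + 0j]
small = candidates_within_radius(y, H, radius=1.0, n_t=2, m=1, map_bits=bpsk)
large = candidates_within_radius(y, H, radius=2.0, n_t=2, m=1, map_bits=bpsk)
```

A larger radius retains more hypotheses, which trades additional computation for a closer approximation to the full search.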
However, notwithstanding the complexity reduction relative to the full-search demapper described above, the complexity of the sphere detector is still exponential. Further, its complexity is sensitive to the signal-to-noise ratio and, when Nr is less than Nt, to the number of receive elements Nr. Specifically, the complexity increases as either of these quantities decreases, and the increase is especially significant in regimes where the number of receive elements is less than the number of transmitted streams.
To further reduce the complexity of demapper 132 in order to address high-rate near-capacity performing MIMO systems, others have proposed implementing the demapper as a set of Nt single stream demappers, each of which demaps one of the Nt symbol streams. In general, each single stream demapper exploits soft-information to perform cancellation and spatial-filtering to remove from the received signal vector y contributions of all streams other than the stream of interest, and then demaps this stream. As a result, the complexity of demapper 132 is polynomial in the number of streams Nt.
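By way of illustration only, the cancellation and spatial-filtering step of one single stream demapper may be sketched as follows. The sketch assumes BPSK symbols (so that the soft symbol is tanh(L/2)) and uses a simple matched filter as the spatial filter; both are simplifying assumptions, the function names are hypothetical, and practical designs may use other filters.

```python
import math

def soft_bpsk_symbol(llr):
    """Expected value of a +1/-1 symbol given its a priori LLR."""
    return math.tanh(llr / 2.0)

def cancel_and_filter(y, H, l_a1, n):
    """Estimate stream n after soft cancellation of all other streams.

    Subtracts each interfering column of H weighted by its soft symbol,
    then applies a matched filter normalized by the energy of column n.
    """
    n_r, n_t = len(H), len(H[0])
    soft = [soft_bpsk_symbol(value) for value in l_a1]
    residual = [y[i] - sum(H[i][j] * soft[j] for j in range(n_t) if j != n)
                for i in range(n_r)]
    h_n = [H[i][n] for i in range(n_r)]
    energy = sum(abs(h) ** 2 for h in h_n)
    return sum(h.conjugate() * r for h, r in zip(h_n, residual)) / energy

H = [[1 + 0j, 0.5 + 0j],
     [0.5 + 0j, 1 + 0j]]
y = [0.5 + 0j, -0.5 + 0j]            # noise-free H s for s = [+1, -1]
z_no_prior = cancel_and_filter(y, H, [0.0, 0.0], n=0)
z_strong_prior = cancel_and_filter(y, H, [0.0, -50.0], n=0)
```

With no a priori information the estimate of the first stream is pulled toward zero by the uncancelled interference, while reliable a priori information about the second stream restores the estimate to the transmitted value.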
Notably, from a performance perspective, for a critically loaded MIMO configuration (i.e., the number of transmitted streams equals the number of receive elements) operating in a low-correlation channel, the single stream demappers have been found to be comparable to the joint-stream demappers. However, as the channel becomes more correlated and/or as the number of receive elements used for stream separation becomes less than the number of transmitted streams, performance of the single stream demappers begins to depart from that of the joint-stream demappers. Nonetheless, for high rate systems under these conditions, the joint-stream demappers are not practically feasible, as described above.