As shown in FIG. 1, a wireless communication system 10 comprises elements such as a client terminal or mobile station 12 and base stations 14. Other network devices which may be employed, such as a mobile switching center, are not shown. In some wireless communication systems there may be only one base station and many client terminals, while in other communication systems, such as cellular wireless communication systems, there are multiple base stations and a large number of client terminals communicating with each base station.
As illustrated, the communication path from the base station (BS) to the client terminal is referred to herein as the downlink (DL), and the communication path from the client terminal to the base station is referred to herein as the uplink (UL). In some wireless communication systems the client terminal or mobile station (MS) communicates with the BS in both DL and UL directions. For instance, this is the case in cellular telephone systems. In other wireless communication systems the client terminal communicates with the base stations in only one direction, usually the DL. This may occur in applications such as paging.
The base station with which the client terminal is communicating is referred to as the serving base station. In some wireless communication systems the serving base station is normally referred to as the serving cell. The terms base station and cell may be used interchangeably herein. In general, the cells that are in the vicinity of the serving cell are called neighbor cells. Similarly, in some wireless communication systems a neighbor base station is normally referred to as a neighbor cell.
Multiple transmit and/or receive chains are commonly used in many wireless communication systems for different purposes. Multiple transmit and/or receive chains in wireless communication systems offer a spatial dimension that can be exploited in the design of wireless communication systems. Communication systems with multiple transmit and/or receive chains offer improved performance. The performance improvement can be in terms of better coverage, higher data rates, reduced Signal to Noise Ratio (SNR) requirements, multiplexing of multiple users on the same channel at the same time, or some combination of the above. Different techniques using multiple receive and/or transmit chains are often referred to with different names such as diversity combining (maximum ratio combining, equal gain combining, selection combining, etc.), space-time coding (STC) or space-time block coding (STBC), spatial multiplexing (SM), beamforming and multiple input multiple output (MIMO). Normally, wireless communication systems with multiple transmit chains at the transmit entity and multiple receive chains at the receive entity are referred to as MIMO systems. The aspects of the present invention apply to Spatial Multiplexing MIMO systems, i.e., wireless communication systems that use the Spatial Multiplexing technique using multiple transmit chains at the transmit entity and multiple receive chains at the receive entity.
In Spatial Multiplexing, a high data rate signal is split into multiple lower data rate streams and all the lower data rate streams are transmitted, with suitable precoding, simultaneously from all the available transmit antennas on the same frequency at the same time. Alternatively, data from two different users or applications may be transmitted simultaneously from all the available transmit antennas on the same frequency at the same time. If signals from different transmit antennas arrive at the receiver antennas through sufficiently different spatial propagation paths, the receiver may be able to separate these streams of data, creating parallel channels on the same frequency at the same time. SM is a powerful technique for increasing channel capacity at higher SNR. The maximum number of spatially multiplexed data streams is limited by the minimum of the number of antennas at the transmit entity and the number of antennas at the receive entity. For example, if the number of transmit antennas at the transmit entity is four and the number of receive antennas at the receive entity is two, the maximum number of spatially separable data streams is two.
FIG. 2 illustrates an example of an SM-MIMO wireless communication system with four transmit chains at the transmit entity, for example the base station, and four receive chains at the receive entity, for example the client terminal.
The signal from a transmit chain arrives at all four receive chains through different propagation paths as shown in FIG. 2. The received signal at each receive chain may be a combination of signals transmitted from all four transmit chains and the noise as shown in FIG. 2.
The following notation is used in describing various signals in the remainder of the document. A subscript to a signal name denotes the transmit or receive chain number with which the signal is associated. When there are two subscripts to a signal name, the first subscript refers to the transmit chain and the second subscript refers to the receive chain with which the signal is associated. Let Nt denote the number of transmit chains and Nr denote the number of receive chains. For SM the number of parallel data streams that can be supported is equal to the minimum of the number of transmit antennas Nt and the number of receive antennas Nr. Normally a wireless communication system with Nt transmit chains at the transmit entity and Nr receive chains at the receive entity is referred to as an Nt×Nr MIMO communication system.
Wireless communication systems use different modulation techniques such as Quadrature Phase Shift Keying (QPSK), 16-Quadrature Amplitude Modulation (QAM), 64-QAM, etc. FIG. 3 illustrates a 16-QAM constellation and FIG. 4 illustrates a 64-QAM constellation. The set of all symbols in a given modulation technique is referred to as the constellation or alphabet. Let the total number of symbols in a constellation be denoted by L and the set of all symbols ak of a constellation be denoted by A={ak, ∀k=0, 1, 2, . . . , L−1}. At a given instant, one symbol that represents the input data at the modulator is selected from the constellation for transmission.
Let the transmitted symbol at a given instant of time from the ith transmit chain be denoted by si for i=0, 1, . . . , (Nt−1). Let the received symbol at a given instant of time at the jth receive chain be denoted by xj for j=0, 1, . . . , (Nr−1). Let the noise at a given instant of time at the jth receive chain be denoted by nj for j=0, 1, . . . , (Nr−1). The symbols si used for transmission may each be one of the symbols from the constellation of a selected modulation technique at the transmit entity.
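The alphabet A can be made concrete with a short sketch. The following is a minimal illustration, assuming unnormalized odd-integer amplitude levels on each axis; the helper name square_qam is hypothetical, and bit mapping and power normalization are left out.

```python
import itertools

def square_qam(levels_per_axis):
    """Return the alphabet A of a square QAM constellation as complex symbols.

    Assumption: unnormalized odd-integer levels (..., -3, -1, 1, 3, ...)
    on the in-phase and quadrature axes.
    """
    axis = [2 * k - (levels_per_axis - 1) for k in range(levels_per_axis)]
    return [complex(i, q) for i, q in itertools.product(axis, axis)]

A16 = square_qam(4)   # 16-QAM alphabet
A64 = square_qam(8)   # 64-QAM alphabet
L = len(A16)          # constellation size L for 16-QAM
```

With this construction, A16 holds L=16 distinct symbols and A64 holds 64.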
Let the channel conditions between transmit antenna i and receive antenna j be denoted by hi,j for i=0, 1, . . . , (Nt−1) and j=0, 1, . . . , (Nr−1). Mathematically, the relationship between the transmitted symbols, the channel conditions, the noise and the received symbols can be expressed as follows for the case of a wireless communication system with four transmit chains and four receive chains:

$$x_0 = h_{0,0}s_0 + h_{1,0}s_1 + h_{2,0}s_2 + h_{3,0}s_3 + n_0 \qquad (1)$$
$$x_1 = h_{0,1}s_0 + h_{1,1}s_1 + h_{2,1}s_2 + h_{3,1}s_3 + n_1 \qquad (2)$$
$$x_2 = h_{0,2}s_0 + h_{1,2}s_1 + h_{2,2}s_2 + h_{3,2}s_3 + n_2 \qquad (3)$$
$$x_3 = h_{0,3}s_0 + h_{1,3}s_1 + h_{2,3}s_2 + h_{3,3}s_3 + n_3 \qquad (4)$$

In matrix notation, for the case of Nt transmit chains and Nr receive chains:

$$s = [s_0, s_1, \ldots, s_{N_t-1}]^T \qquad (5)$$
$$x = [x_0, x_1, \ldots, x_{N_r-1}]^T \qquad (6)$$
$$n = [n_0, n_1, \ldots, n_{N_r-1}]^T \qquad (7)$$
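$$H = \begin{bmatrix} h_{0,0} & h_{1,0} & \cdots & h_{N_t-1,0} \\ h_{0,1} & h_{1,1} & \cdots & h_{N_t-1,1} \\ \vdots & \vdots & \ddots & \vdots \\ h_{0,N_r-1} & h_{1,N_r-1} & \cdots & h_{N_t-1,N_r-1} \end{bmatrix} \qquad (8)$$

$$x = Hs + n \qquad (9)$$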
In EQ. (9), s is the transmitted symbols vector, H is the channel matrix, n is the noise vector and x is the received signal vector.
Normally, the receiver of the wireless communication system needs to estimate the channel conditions to process the received signals. It is understood that the receiver obtains the required estimates of the channel conditions through techniques known in the literature or through some other techniques. Let the estimated channel conditions between transmit antenna i and receive antenna j be denoted by ĥi,j, for i=0, 1, . . . , (Nt−1) and j=0, 1, . . . , (Nr−1) and let Ĥ denote the matrix of estimated channel conditions.
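The received-signal model x=Hs+n can be sketched numerically. The following is a minimal illustration, assuming complex Gaussian channel coefficients and noise, which the text does not specify; the helper names matvec and simulate_received are hypothetical. The matrix is stored row by row, so H[j][i] holds the coefficient hi,j.

```python
import random

def matvec(H, s):
    """Multiply matrix H (list of rows) by vector s."""
    return [sum(h * v for h, v in zip(row, s)) for row in H]

def simulate_received(H, s, noise_std, rng):
    """Return x = H s + n with independent complex Gaussian noise per receive chain."""
    return [v + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for v in matvec(H, s)]

rng = random.Random(0)
Nt, Nr = 4, 4
# Assumed channel model: each coefficient drawn from a complex Gaussian.
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(Nt)]
     for _ in range(Nr)]
s = [complex(1, 1), complex(-3, 1), complex(3, -3), complex(-1, -1)]  # 16-QAM symbols
x = simulate_received(H, s, 0.1, rng)   # one received vector of length Nr
```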
At the receive entity, the received symbols vector x is known. The channel conditions matrix H may be approximated by the estimated channel conditions matrix Ĥ. Based on these two known matrices, the transmitted symbols vector s may be estimated as ŝ by solving the linear system of equations in EQ. 9.
The system of equations represented by EQ. 9 needs to be solved at a rate proportional to the data rate of the wireless communication system. Normally SM-MIMO is used to achieve a high data rate in wireless communication systems. Hence the system of equations represented in EQ. 9 needs to be solved at a correspondingly high rate. For example, in a broadband wireless communication system that offers a data rate of 16 megabits per second over the air using 4×4 SM-MIMO with 16-QAM, EQ. 9 needs to be solved about one million times per second. Therefore, in general the complexity of the SM decoder is high. Further, the complexity of the SM decoder normally grows exponentially as a function of the number of transmit chains and receive chains. Therefore, it is crucial to solve the system of equations represented by EQ. 9 in an efficient manner so that the wireless communication system can operate in real time with fewer processing resources and lower power consumption.
Different optimal and sub-optimal decoders are described in the literature to solve the system of equations represented by EQ. 9. The Maximum Likelihood Decoder (MLD) is an optimal decoder for SM. Although MLD provides, theoretically, the best achievable decoding performance, its complexity and processing requirements are normally very high even for the common MIMO wireless communication systems such as 2×2 or 4×4 SM MIMO with 16-QAM or 64-QAM.
QR Decomposition (QRD) in conjunction with the M-algorithm, referred to as the QRD-M decoder and also called the QRD-M method, is one of the commonly used sub-optimal SM decoders. The QRD-M sub-optimal SM decoder provides decoding performance close to that of an optimal SM decoder such as MLD, but with reduced complexity and processing requirements. The reduced complexity and reduced processing requirements of the QRD-M sub-optimal SM decoder make it better suited for practical implementation. The QRD-M decoder used for SM is referred to herein as the QRD-M SM decoder.
Although the sub-optimal decoders are less complex and require less processing when compared to the optimal decoders, the complexity of the sub-optimal decoders still remains high. Therefore, it is desirable to further reduce the complexity of the sub-optimal decoders. Reduction in complexity results in lower resource requirements and reduced power consumption. Since the decoding operations are performed at a very high rate, such as millions of times per second, any reduction in processing requirements leads to a significant reduction in power consumption and latency and/or an increase in throughput.
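The figure of about one million solves per second can be checked with a short calculation. It assumes that each solve of EQ. 9 recovers Nt=4 symbols of log2(16)=4 bits each, and that the 16 megabits per second counts those raw bits without coding or pilot overhead:

$$N_t \log_2 L = 4 \times 4 = 16 \text{ bits per solve}, \qquad \frac{16 \times 10^{6} \text{ bits/s}}{16 \text{ bits per solve}} = 10^{6} \text{ solves per second}$$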
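To give a sense of why MLD is costly, the following minimal sketch performs an exhaustive search over every candidate symbol vector. It assumes that the maximum likelihood decision reduces to minimizing the squared Euclidean distance between x and Hs, which holds for Gaussian noise; the function names ml_decode and ml_candidate_count are hypothetical.

```python
import itertools

def ml_decode(H, x, alphabet):
    """Exhaustive search: minimize ||x - H s||^2 over all L**Nt candidate vectors."""
    Nt = len(H[0])
    best, best_metric = None, float("inf")
    for cand in itertools.product(alphabet, repeat=Nt):
        metric = sum(abs(xj - sum(h * c for h, c in zip(row, cand))) ** 2
                     for xj, row in zip(x, H))
        if metric < best_metric:
            best, best_metric = cand, metric
    return list(best)

def ml_candidate_count(L, Nt):
    """Number of candidate vectors an exhaustive search must examine."""
    return L ** Nt

print(ml_candidate_count(16, 4))   # 65536 candidates for 4x4 SM with 16-QAM
print(ml_candidate_count(64, 4))   # 16777216 candidates for 4x4 SM with 64-QAM
```

The candidate count grows as L raised to the power Nt, which is the exponential growth that motivates sub-optimal decoders.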
The conventional QRD-M SM decoder consists of two main processing blocks as shown in FIG. 5. The first main processing block is the QR decomposition followed by matrix multiplication and the second main processing block is the M-algorithm. The QR decomposition block decomposes the channel matrix H into a right triangular matrix R and a unitary matrix Q using the QR matrix decomposition method. Specifically,

$$H = QR \qquad (10)$$
Since R is a right triangular matrix, all its elements below the main diagonal are zero. A property of a unitary matrix is that its inverse can be obtained by its Hermitian transpose. Specifically,

$$Q^{-1} = Q^{H} \qquad (11)$$

Therefore,

$$Q^{H}Q = I \qquad (12)$$

where I is an identity matrix. The Hermitian transpose of a unitary matrix is also a unitary matrix. Also, when a vector is multiplied by a unitary matrix, the magnitude of the vector does not change. The unitary matrix Q is in general an Nr×Nr matrix. A discussion of the fundamentals of matrix computations may be found in the text entitled Matrix Computations, The Johns Hopkins University Press, 2nd Ed., 1989, by G. H. Golub and C. F. Van Loan, the entire disclosure of which is hereby expressly incorporated by reference herein.
Substituting H from EQ. 10 in the expression for the received signal vector represented by EQ. 9:

$$x = QRs + n \qquad (13)$$

Pre-multiplying both sides with Q^H,

$$Q^{H}x = y = Q^{H}QRs + Q^{H}n = Rs + w \qquad (14)$$

where y is the rotated received signal vector x and w is the rotated noise vector n. Then EQ. 14 becomes

$$y = Rs + w \qquad (15)$$
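The rotation from x to y can be checked numerically. The sketch below is an illustration only: it uses a modified Gram-Schmidt procedure, chosen here for brevity and not necessarily the decomposition method a real decoder would use, and all helper names are hypothetical. It returns the thin form, in which Q has Nt orthonormal columns and R is Nt×Nt. The code stores R row by row, so R[j][i] corresponds to the element written as ri,j in this document.

```python
def conj_transpose(M):
    """Hermitian transpose of a matrix stored as a list of rows."""
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def matmul(A, B):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def qr_gram_schmidt(H):
    """Modified Gram-Schmidt QR; assumes H has full column rank."""
    Nr, Nt = len(H), len(H[0])
    cols = [[H[r][c] for r in range(Nr)] for c in range(Nt)]
    Q_cols = []
    R = [[0j] * Nt for _ in range(Nt)]
    for c in range(Nt):
        v = cols[c][:]
        for k in range(c):
            # Project out the component along each earlier orthonormal column.
            R[k][c] = sum(q.conjugate() * a for q, a in zip(Q_cols[k], v))
            v = [a - R[k][c] * q for a, q in zip(v, Q_cols[k])]
        norm = sum(abs(a) ** 2 for a in v) ** 0.5
        R[c][c] = complex(norm)
        Q_cols.append([a / norm for a in v])
    Q = [[Q_cols[c][r] for c in range(Nt)] for r in range(Nr)]
    return Q, R

def rotate(Q, x):
    """Compute y = Q^H x for a vector x given as a plain list."""
    return [sum(q * v for q, v in zip(row, x)) for row in conj_transpose(Q)]

H = [[2 + 1j, 1 - 1j, 0.5j],
     [1j, 3 + 0j, 1 + 1j],
     [1 - 2j, 0.5 + 0j, 2 - 1j]]
Q, R = qr_gram_schmidt(H)
s = [1 + 1j, -1 + 3j, 3 - 1j]
x = [sum(h * v for h, v in zip(row, s)) for row in H]   # noise-free x = H s
y = rotate(Q, x)                                         # matches R s up to rounding
```

In the noise-free case w is zero, so y agrees with Rs to within floating-point rounding, and R has zeros below its main diagonal.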
For the case of 4×4 SM-MIMO, the expanded version of EQ. 15 is as follows:
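$$\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} r_{0,0} & r_{1,0} & r_{2,0} & r_{3,0} \\ 0 & r_{1,1} & r_{2,1} & r_{3,1} \\ 0 & 0 & r_{2,2} & r_{3,2} \\ 0 & 0 & 0 & r_{3,3} \end{bmatrix} \begin{bmatrix} s_0 \\ s_1 \\ s_2 \\ s_3 \end{bmatrix} + \begin{bmatrix} w_0 \\ w_1 \\ w_2 \\ w_3 \end{bmatrix} \qquad (16)$$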
In case the number of receive chains at the receive entity is greater than the number of transmit chains at the transmit entity, all the elements in the bottom Nr−Nt rows of the right triangular matrix R are zero and the bottom Nr−Nt rows of the column vector y are also zero after QR decomposition. Therefore, the system of equations represented by EQ. 15 is simplified to an Nt×Nt system of linear equations. In the remainder of this disclosure, the R matrix is considered to be an Nt×Nt matrix.
The second main processing block of the QRD-M SM decoder, namely the M-algorithm, is described next. The solution of the system of equations represented in EQ. 15 using the M-algorithm may be obtained in several stages. The number of stages in the M-algorithm corresponds to the number of rows in the system of equations and the M-algorithm is applied sequentially to each stage. The value of M in the M-algorithm refers to the number of best symbol sequences used for further consideration in a sequential decoding process. The best symbol sequences are the symbol sequences from the constellation selected based on minimum distance metrics. The M-algorithm for each stage includes two major processing steps. First, it computes all the distance metrics for a given stage. Next, it selects the M best symbol sequences for the next stage of processing. The selected M best symbol sequences are referred to as surviving symbol sequences for the next stage. This process continues for all stages and at the last stage one best symbol sequence is selected as the decoded symbols vector ŝ. A 4×4 SM-MIMO wireless communication system, as represented in EQ. 16, using 16-QAM is chosen to illustrate the M-algorithm. For the chosen example, as represented in EQ. 16, the number of stages for the M-algorithm is four. In the QRD-M SM decoder, the M-algorithm starts by first operating on the bottom-most row corresponding to a single non-zero element in the R matrix. For the chosen example, as represented in EQ. 16, the M-algorithm starts with the fourth row containing the single non-zero element r3,3 in matrix R.
To solve the equation represented by the bottom-most row containing a single non zero element, all possible values for s(Nt-1) from the constellation alphabet A used by the transmit entity may be multiplied with element r(Nt-1),(Nt-1) of matrix R and subtracted from element y(Nt-1) of vector y to compute the distance metrics d(Nt-1) for all possible values of s(Nt-1). For the chosen example, as represented in EQ. 16, to solve the equation represented by the fourth row containing a single non zero element r3,3, all possible values for s3 from the constellation alphabet A used by the transmit entity may be multiplied with r3,3 and subtracted from y3 to compute the distance metrics d3 for all possible values of s3. For the chosen example, as represented in EQ. 16, with 16-QAM used by the transmit entity, the number of distance metric computations at the receive entity for the fourth row is 16, corresponding to 16 possible values for s3.
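As a worked form of this first stage, and assuming the squared Euclidean distance as the distance metric (the text does not fix a particular metric), the 16 first-stage metrics for the chosen example may be written as:

$$d_3(a_k) = \left| y_3 - r_{3,3}\, a_k \right|^2, \qquad k = 0, 1, \ldots, 15$$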
For the chosen example, as represented in EQ. 16, M=8 is used for the M-algorithm. For the chosen example, as represented in EQ. 16, this results in the selection of the 8 best symbol sequences with minimum distance metrics from the total of 16 distance metrics corresponding to L=16 symbol sequences. These selected 8 (M=8) symbol sequences are referred to as surviving symbol sequences. At the first stage, the symbol sequences are of length one, and at the subsequent stages the symbol sequences grow by one symbol in length at each stage as the stages progress.
Next, the M-algorithm enters the second stage of processing. In the second stage of processing, the M-algorithm operates on row (Nt−2). For the chosen example, as represented in EQ. 16, the M-algorithm operates on the third row which is immediately above the fourth row. At the second stage of the M-algorithm, there are 16 possible values for s2 and 8 selected surviving symbol sequences from the previous stage. This requires a total of 16×8=128 distance metric computations corresponding to 128 different combinations of s2 and s3. The distance metrics computed in the second stage are cumulative distance metrics corresponding to the distance metric of a symbol sequence (s2, s3) and the distance metric of the selected surviving symbol sequences for s3 during the first stage. The M-algorithm then selects the 8 best surviving symbol sequences corresponding to the minimum cumulative distance metrics. The surviving symbol sequences are of length two at this stage.
Next, the M-algorithm enters the third stage of processing. In the third stage of processing, the M-algorithm operates on row (Nt−3). For the chosen example, as represented in EQ. 16, the M-algorithm operates on the second row which is immediately above the third row. At the third stage of the M-algorithm, there are 16 possible values for s1 and 8 selected surviving symbol sequences from the previous stage. This requires a total of 16×8=128 distance metric computations corresponding to 128 different combinations of s1, s2 and s3. The distance metrics computed in the third stage are the cumulative distance metrics corresponding to the distance metric of a symbol sequence (s1, s2, s3) and the distance metric of the selected surviving symbol sequence for (s2, s3) during the second stage. Next, the M-algorithm selects the 8 best surviving symbol sequences corresponding to the minimum cumulative distance metrics. The surviving symbol sequences are of length three at this stage.
This process continues for each stage until the last stage, which corresponds to the first row of EQ. 16, is reached. After computing the cumulative distance metrics for the last stage, one best surviving symbol sequence is selected as the decoded symbols vector ŝ. In case the decoding is successful, the decoded symbols vector is equal to the transmitted symbols vector, i.e., ŝ=s. For the chosen example, as represented in EQ. 16, at the last stage the M-algorithm operates on the first row. Therefore, at the last stage of the M-algorithm, there are 16 possible values for s0 and 8 selected surviving symbol sequences from the previous stage. This requires a total of 16×8=128 distance metric computations corresponding to 128 different combinations of s0, s1, s2 and s3. The distance metrics computed in the last stage are the cumulative distance metrics corresponding to the distance metric of a symbol sequence (s0, s1, s2, s3) and the distance metric of the selected surviving symbol sequence (s1, s2, s3) during the third stage. Next, the M-algorithm selects one best surviving symbol sequence ŝ=[ŝ0, ŝ1, ŝ2, ŝ3]T corresponding to the minimum cumulative distance metric. FIG. 6 shows the general processing flow diagram of the M-algorithm for Nt stages.
The value of M may be chosen according to the required tradeoff between decoding performance and processing complexity. The smaller the value of M, the lower the complexity and processing requirements, which may lead to a reduction in power consumption.
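Written out for the chosen example, and again assuming the squared Euclidean distance as the metric, the second-stage cumulative metric for a candidate s2 appended to a surviving s3 may be sketched as the first-stage metric of that survivor plus the contribution of the third row of EQ. 16:

$$d_2(s_2, s_3) = \left| y_3 - r_{3,3}\, s_3 \right|^2 + \left| y_2 - r_{2,2}\, s_2 - r_{3,2}\, s_3 \right|^2$$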
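The staged procedure described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular decoder: it assumes the squared Euclidean distance as the metric, takes the rotated vector y and the triangular matrix R as inputs, and uses the hypothetical function name m_algorithm. The code stores R row by row, so R[j][i] corresponds to ri,j in EQ. 16.

```python
def m_algorithm(R, y, alphabet, M):
    """Sketch of the M-algorithm on y = R s + w with R upper triangular.

    Returns (decoded_symbols, number_of_distance_metric_computations).
    Assumption: the distance metric is the squared Euclidean distance.
    """
    Nt = len(R)
    # Each survivor is (partial sequence for s[row+1..Nt-1], cumulative metric).
    survivors = [((), 0.0)]
    computations = 0
    for row in range(Nt - 1, -1, -1):            # start at the bottom-most row
        candidates = []
        for seq, metric in survivors:
            # Contribution of the symbols already hypothesized in this survivor.
            known = sum(R[row][row + 1 + k] * sym for k, sym in enumerate(seq))
            for a in alphabet:
                branch = abs(y[row] - R[row][row] * a - known) ** 2
                candidates.append(((a,) + seq, metric + branch))
                computations += 1
        keep = M if row > 0 else 1               # a single winner at the last stage
        candidates.sort(key=lambda c: c[1])
        survivors = candidates[:keep]
    return list(survivors[0][0]), computations

# Noise-free usage example with 16-QAM and M = 8 (all values are illustrative).
A16 = [complex(a, b) for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)]
R = [[2 + 0j, 0.5 + 0.5j, -0.25j, 1 + 0j],
     [0j, 1.5 + 0j, 0.5 - 0.5j, 0.25 + 0j],
     [0j, 0j, 1 + 0j, -0.5 + 0.25j],
     [0j, 0j, 0j, 0.75 + 0j]]
s = [3 - 1j, -1 + 1j, 1 + 3j, -3 - 3j]
y = [sum(R[r][c] * s[c] for c in range(4)) for r in range(4)]
decoded, count = m_algorithm(R, y, A16, 8)
```

A full sort is used for survivor selection only to keep the sketch short; a practical decoder would likely use a cheaper partial selection of the M smallest metrics.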
However, a smaller value of M also reduces the decoding performance.
Two major areas of complexity in the M-algorithm for each stage are: the computation of distance metrics and selection of best surviving symbol sequences corresponding to the minimum distance metrics. The computation of distance metrics in general may require complex multiplications. Since there may be hundreds of distance metric computations for one pass of QRD-M SM decoder, the number of required complex multiplications is generally high.
In general, when using an Nt×Nr SM, there will be Nt processing stages in the M-algorithm of the QRD-M SM decoder. If a modulation scheme with constellation size L is used by the transmit entity, then the following distance metric computations may be performed by a traditional M-algorithm:
For the first stage: L distance metric computations over symbol sequences of length one.
For the second stage: M×L distance metric computations over symbol sequences of length two.
For the third stage: M×L distance metric computations over symbol sequences of length three.
For the Nt-th stage: M×L distance metric computations over symbol sequences of length Nt.
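Summing the per-stage counts listed above gives the total number of distance metric computations for one pass of the decoder. For the 4×4 example with 16-QAM (L=16) and M=8:

$$L + (N_t - 1)\, M L = 16 + 3 \times 8 \times 16 = 400$$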
In addition to the distance metric computations, the selection operations may be performed based on minimum distance metrics at each stage.
Precomputation logic is a sequential logic optimization method used to reduce power consumption at the logic level. The key optimization step is the synthesis of the precomputation logic, which computes the output values of a logic circuit at least one clock cycle before they are required.
If the output values of the original logic circuit can be precomputed using the precomputation logic for a subset of input conditions, the original logic circuit may be turned off and thus may not have any internal switching activity in the succeeding clock cycle.
A precomputation architecture is shown in FIGS. 7(a) and 7(b). FIG. 7(a) shows the original logic without any precomputation where the inputs are registered using R1, the block implements logic A, and the output is registered using R2. For illustration purposes it is assumed that there are n inputs, namely x1, x2, . . . , xn, to the logic block and a single output f. FIG. 7(b) shows the same logic as in FIG. 7(a) but enhanced with the precomputation logic. It defines two Boolean predictor functions g1 and g2 satisfying the following conditions:

$$g_1 = 1 \Rightarrow f = 1 \qquad (17)$$
$$g_2 = 1 \Rightarrow f = 0 \qquad (18)$$
During a clock cycle t if either g1 or g2 evaluates to a 1, the Latch Enable (LE) signal of the register R1 is set to be 0. This means that in clock cycle t+1 the inputs to the combinational logic block A do not change. If g1 evaluates to a 1 in clock cycle t, the input to register R2 is a 1 in clock cycle t+1, and if g2 evaluates to a 1, then the input to register R2 is a 0. Note that g1 and g2 cannot both be 1 during the same clock cycle due to the conditions imposed by EQ. 17 and EQ. 18.
For the subset of input conditions corresponding to (g1+g2), the inputs to block A do not change, thereby implying zero switching activity and resulting in power reduction. The power reduction is achieved at the cost of additional logic corresponding to functions g1 and g2 and a few additional gates. The precomputation logic functions g1 and g2 add to the delay of the critical path that ends at register R1. In general, the functions g1 and g2 may be chosen such that the increase in additional logic is acceptable and the additional delay does not affect the critical paths of the design. A power reduction in logic block A is obtained because, for the subset of input conditions corresponding to g1+g2, the inputs to logic block A do not change, which may lead to no switching activity in the subsequent part of the circuit.
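The conditions of EQ. 17 and EQ. 18 can be illustrated with a small behavioral model. The example below is an assumption introduced purely for illustration and is not taken from this document: logic block A is modeled as an unsigned comparator f=(C>D), and the predictor functions inspect only the most significant bits of the two operands. All function names are hypothetical.

```python
import itertools

def comparator(c, d):
    """Stand-in for logic block A: f = 1 when unsigned c is greater than d."""
    return int(c > d)

def predictors(c, d, n_bits):
    """Hypothetical predictor functions that look only at the most significant bits."""
    c_msb = (c >> (n_bits - 1)) & 1
    d_msb = (d >> (n_bits - 1)) & 1
    g1 = c_msb & (1 - d_msb)   # c has MSB 1, d has MSB 0 -> c > d, so f = 1
    g2 = (1 - c_msb) & d_msb   # c has MSB 0, d has MSB 1 -> c < d, so f = 0
    return g1, g2

def check_and_measure(n_bits):
    """Verify EQ. 17 and EQ. 18 exhaustively; return the fraction of input
    pairs for which the inputs to block A could be held (g1 or g2 is 1)."""
    disabled = 0
    total = 0
    for c, d in itertools.product(range(2 ** n_bits), repeat=2):
        g1, g2 = predictors(c, d, n_bits)
        f = comparator(c, d)
        assert not (g1 and g2)      # g1 and g2 are never 1 together
        if g1:
            assert f == 1           # EQ. 17
        if g2:
            assert f == 0           # EQ. 18
        disabled += g1 | g2
        total += 1
    return disabled / total

print(check_and_measure(4))   # 0.5
```

For this comparator, the two predictors cover half of all input pairs, so under uniformly distributed inputs block A would see unchanged inputs in roughly half of the clock cycles.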