(Note: This application references various publications as indicated throughout the specification by reference numbers enclosed in brackets, e.g., [x]. A list of these publications ordered according to these reference numbers can be found below in the section entitled “References.” Each of these publications is incorporated in its entirety by reference herein.)
Wireless transmission through multiple antennas, also referred to as MIMO (Multiple-Input Multiple-Output) [1]-[2], currently enjoys great popularity because of the demand for high data rate communication from multimedia services. Many applications are considering the use of MIMO to enhance the data rate and/or the robustness of the link; a significant example is the next generation of wireless LAN networks, whose standard is currently under definition (IEEE 802.11n) [3]. Another candidate application is represented by mobile “WiMax” systems for fixed wireless access (FWA) [4]-[5]. Also, fourth generation (4G) mobile terminals will likely adopt MIMO technology and as such represent a very important commercial application for embodiments of the present disclosure.
An embodiment of the present disclosure is concerned with the problem of detecting multiple sources corrupted by noise in MIMO fading channels. The linear complex baseband equation representative of a narrowband MIMO system is:

$$Y = HX + N \qquad (1)$$

where R and T are the number of receive and transmit antennas respectively, $Y=[Y_1\ Y_2\ \ldots\ Y_R]^T$ is the received vector (size R×1), $X=[X_1\ X_2\ \ldots\ X_T]^T$ is the transmitted vector (size T×1), and H is the R×T channel matrix, whose entries are the complex path gains from transmitter to receiver, samples of zero mean Gaussian random variables (RVs) with variance $\sigma^2=0.5$ per dimension. N is the noise vector of size R×1, whose elements are samples of independent circularly symmetric zero-mean complex Gaussian RVs with variance $\sigma_N^2=N_0/2$ per dimension. Equation (1) is considered valid per subcarrier for wideband orthogonal frequency division multiplexing (OFDM) systems.
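By way of illustration only, one realization of equation (1) can be generated as in the following minimal sketch. The function names `crandn` and `simulate_mimo`, the unit-energy QPSK constellation, and the values R=4, T=2, N0=0.1 are assumptions chosen for the example and are not part of the specification.

```python
import random

def crandn(sigma):
    """Complex Gaussian sample with standard deviation sigma per real dimension."""
    return complex(random.gauss(0.0, sigma), random.gauss(0.0, sigma))

def simulate_mimo(R, T, constellation, N0, seed=None):
    """Draw one realization of Y = H X + N (hypothetical helper)."""
    if seed is not None:
        random.seed(seed)
    # Channel entries: zero mean, variance 0.5 per dimension (Rayleigh fading).
    H = [[crandn(0.5 ** 0.5) for _ in range(T)] for _ in range(R)]
    # Transmitted vector: one constellation symbol per transmit antenna.
    X = [random.choice(constellation) for _ in range(T)]
    # Noise entries: zero mean, variance N0/2 per dimension.
    N = [crandn((N0 / 2) ** 0.5) for _ in range(R)]
    Y = [sum(H[r][t] * X[t] for t in range(T)) + N[r] for r in range(R)]
    return Y, H, X, N

# Unit-energy QPSK, assumed here purely as an example constellation.
QPSK = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]
Y, H, X, N = simulate_mimo(R=4, T=2, constellation=QPSK, N0=0.1, seed=0)
```

For an OFDM system the same sketch would be applied independently per subcarrier.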
Maximum-Likelihood (ML) detection is desirable to achieve high performance, as it is the optimal detection technique in the presence of additive white Gaussian noise (AWGN) [6]. It corresponds to finding the transmitted vector X that minimizes the squared norm, $\|\cdot\|^2$, of the error vector:
$$X_D = \arg\min_{X} \|Y - HX\|^2 \qquad (2)$$

where the notation corresponds to the commonly used linear MIMO channel with i.i.d. Rayleigh fading, and ideal channel state information (CSI) at the receiver is assumed. ML detection involves an exhaustive search over all the possible $S^T$ sequences of digitally modulated symbols, where S is a Quadrature Amplitude Modulation (QAM) or Phase Shift Keying (PSK) constellation size, and T is the number of transmit antennas; this means it becomes increasingly infeasible as the spectral efficiency grows.
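The exhaustive search behind ML detection can be sketched as follows, taking H to be the channel matrix of equation (1). The function name `ml_detect`, the QPSK constellation, and the numerical channel values are illustrative assumptions; the example is noiseless so that the transmitted vector is recovered exactly.

```python
from itertools import product

def ml_detect(Y, H, constellation):
    """Brute-force ML: test every one of the S**T candidate vectors (hypothetical helper)."""
    R, T = len(H), len(H[0])
    best, best_metric = None, float("inf")
    for cand in product(constellation, repeat=T):
        # Squared Euclidean distance between Y and H * cand.
        metric = sum(
            abs(Y[r] - sum(H[r][t] * cand[t] for t in range(T))) ** 2
            for r in range(R)
        )
        if metric < best_metric:
            best, best_metric = cand, metric
    return list(best), best_metric

# Example values, assumed for illustration only.
QPSK = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]
H = [[1 + 0.5j, 0.2 - 0.3j], [-0.4 + 0.1j, 0.9 + 0.7j]]
X = [QPSK[0], QPSK[3]]
Y = [sum(H[r][t] * X[t] for t in range(2)) for r in range(2)]  # noiseless

X_hat, metric = ml_detect(Y, H, QPSK)
```

The loop visits S^T candidates (16 here, but 16^4 = 65536 for 16QAM with four transmit antennas), which is the growth in complexity referred to above.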
Because of their reduced complexity, sub-optimal linear detection algorithms like Zero-Forcing (ZF) or Minimum Mean Square Error (MMSE) [7] are widely employed in wireless communications. They belong to the class of linear combinatorial nulling detectors, i.e., the estimates of each modulated symbol are obtained considering the other symbols as interferers and performing a linear weighting of the signals received by all the receive antennas. ZF and MMSE schemes are highly sub-optimal, since they yield a low spatial diversity order: for a MIMO system with T transmit and R receive antennas, this is equal to R−T+1, as opposed to R for ML [20].
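For reference, the textbook forms of the two linear filters and a short numerical illustration of the diversity orders quoted above are given below. The symbols $G_{ZF}$, $G_{MMSE}$, $E_s$ (average symbol energy, assuming uncorrelated equal-energy symbols) and $I_T$ (the T×T identity) are notation introduced here for the example and do not appear in the specification.

```latex
% Linear filters applied to Y in equation (1); the estimate is G Y followed by slicing.
G_{ZF} = (H^{H} H)^{-1} H^{H}, \qquad
G_{MMSE} = \left( H^{H} H + \tfrac{N_0}{E_s} I_T \right)^{-1} H^{H}

% Diversity order: R - T + 1 for linear ZF/MMSE, R for ML.
R = T = 4:      \quad R - T + 1 = 1 \ \text{(ZF/MMSE)} \quad \text{vs.} \quad R = 4 \ \text{(ML)}
R = 4,\ T = 2:  \quad R - T + 1 = 3 \ \text{(ZF/MMSE)} \quad \text{vs.} \quad R = 4 \ \text{(ML)}
```

The gap is largest for square systems (R = T), where the linear detectors retain a diversity order of only 1.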
To improve their performance, nonlinear detectors based on the combination of linear detectors and spatially ordered decision-feedback equalization (O-DFE) were proposed in [8]-[9]. There, the principles of interference cancellation and layer ordering are established. In the remainder of this document the terms “layers” and “antennas” are used interchangeably.
First, a stage of ZF or MMSE linear detection, also called interference “nulling”, is applied to determine T symbol estimates. Based on the “post-detection” signal-to-noise ratio (SNR), the first layer is detected. Then, each sub-stream in turn is considered to be the desired signal and the others are considered as “interferers”; interference from the already detected signals is cancelled from the received signal, and nulling is performed on modified received vectors where, effectively, fewer interferers are present. This process is called “interference cancellation (IC) and nulling” or, equivalently, spatial DFE. In case of IC, the order in which the transmit signals are detected is critical for the performance. An optimal criterion has been established, corresponding to maximizing the minimum SNR (“maxi-min” criterion) over all possible orderings. Fortunately, for T transmit antennas, it can be demonstrated that only $T(T+1)/2$ dispositions of layers have to be considered to determine the optimal ordering, instead of all the possible $T!$. However, nonlinear ZF or MMSE-based O-DFE detectors have a limited performance improvement over linear ZF or MMSE, due to noise enhancements caused by nulling and error propagation caused by IC. In addition, they still suffer from ill-conditioned channel conditions, as do the linear detectors. Also, the complexity of the original version of this algorithm is very high, $O(T^4)$, as it involves the computation of multiple Moore-Penrose pseudo-inverse matrices of decreasing size sub-channel matrices. More recent efficient implementations exist [22], though, keeping an $O(T^3)$ complexity. Last, no strategy to compute the bit soft metrics has been proposed for O-DFE detectors.
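The nulling, ordering, and cancellation steps described above can be sketched as follows for the ZF case. This is a simplified illustration, not the implementation of [8]-[9] or [22]: at each stage it recomputes the pseudo-inverse of the remaining sub-channel and greedily picks the layer whose nulling vector has the smallest norm (highest ZF post-detection SNR). All function names and numerical values are assumptions made for the example.

```python
def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(A)
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def pinv(A):
    """Moore-Penrose pseudo-inverse of a full-column-rank matrix."""
    Ah = hermitian(A)
    return matmul(inverse(matmul(Ah, A)), Ah)

def zf_osic(Y, H, constellation):
    """ZF nulling with ordered successive interference cancellation (sketch)."""
    R, T = len(H), len(H[0])
    y, remaining = list(Y), list(range(T))
    X_hat, order = [None] * T, []
    while remaining:
        Hs = [[H[r][t] for t in remaining] for r in range(R)]
        G = pinv(Hs)  # one nulling row per remaining layer
        # ZF post-detection SNR is inversely proportional to the row norm.
        k = min(range(len(remaining)), key=lambda i: sum(abs(g) ** 2 for g in G[i]))
        t = remaining.pop(k)
        z = sum(G[k][r] * y[r] for r in range(R))              # nulling
        X_hat[t] = min(constellation, key=lambda s: abs(z - s))  # slicing
        y = [y[r] - H[r][t] * X_hat[t] for r in range(R)]      # cancellation
        order.append(t)
    return X_hat, order

# Example values, assumed for illustration only (noiseless transmission).
QPSK = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]
H = [[1 + 0.5j, 0.2 - 0.3j], [-0.4 + 0.1j, 0.9 + 0.7j]]
X = [QPSK[1], QPSK[2]]
Y = [sum(H[r][t] * X[t] for t in range(2)) for r in range(2)]
X_hat, order = zf_osic(Y, H, QPSK)
```

Recomputing a pseudo-inverse at every stage is what drives the high complexity of the original algorithm; a wrong decision at an early stage is subtracted from `y` and propagates to the later ones.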
A better performing class of detectors is represented by the list detectors [10]-[13], based on a combination of the ML and DFE principles. The common idea of the list detectors (LD) is to divide the streams to be detected into two groups: first, one or more reference transmit streams are selected and a corresponding list of candidate constellation symbols is determined; then, for each sequence in the list, interference is cancelled from the received signal and the remaining symbol estimates are determined by as many sub-detectors operating on reduced size sub-channels. Compared to O-DFE, the differences lie in the criterion adopted to order the layers, and in the fact that the symbol estimates for the first layer (i.e., prior to interference cancellation) are replaced by a list of candidates. The best performing variant corresponds to searching all possible S cases for a reference stream, or layer, and adopting spatial DFE for a properly selected set of the remaining T−1 sub-detectors. In this case, numerical results demonstrate that the LD is able to achieve full receive diversity and an SNR distance from ML in the order of fractions of a dB, provided that the layer order is properly selected. A notable property is that this can be accomplished through a parallel implementation, as the sub-detectors can operate independently. The optimal ordering criterion for LDs stems from the principle of maximizing the worst case post-detection SNR (“maxi-min”), as proposed for the O-DFE [9]. This was first proposed in [11] and then re-elaborated in [12]-[13], and results in computing the O-DFE ordering for T sub-channel matrices of size R×(T−1), thus entailing a complexity of $O(T^4)$. A simplified suboptimal ordering criterion is contained in both [13] and [14].
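The list detection principle can be sketched for the simplest case of T=2 transmit antennas: all S values of the reference layer are tried, interference is cancelled for each, and the single remaining layer is detected by matched filtering and slicing. The function name `list_detect_2tx`, the fixed choice of reference layer, and the numerical values are assumptions made for this example; the ordering criteria of [11]-[14] are not modeled.

```python
def list_detect_2tx(Y, H, constellation, ref=0):
    """List detection for T=2: enumerate the reference layer, slice the other (sketch)."""
    R = len(H)
    other = 1 - ref
    h_ref = [H[r][ref] for r in range(R)]
    h_oth = [H[r][other] for r in range(R)]
    energy = sum(abs(h) ** 2 for h in h_oth)
    best, best_metric = None, float("inf")
    for x_ref in constellation:  # S candidates; each iteration is independent
        # Cancel the contribution of the hypothesized reference symbol.
        y_ic = [Y[r] - h_ref[r] * x_ref for r in range(R)]
        # Matched filter on the remaining layer, then slice to the constellation.
        z = sum(h_oth[r].conjugate() * y_ic[r] for r in range(R)) / energy
        x_oth = min(constellation, key=lambda s: abs(z - s))
        metric = sum(abs(y_ic[r] - h_oth[r] * x_oth) ** 2 for r in range(R))
        if metric < best_metric:
            cand = [None, None]
            cand[ref], cand[other] = x_ref, x_oth
            best, best_metric = cand, metric
    return best, best_metric

# Example values, assumed for illustration only (Y includes some noise).
QPSK = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]
H = [[1 + 0.5j, 0.2 - 0.3j], [-0.4 + 0.1j, 0.9 + 0.7j]]
Y = [0.3 + 0.2j, -0.5 + 1.1j]
X_hat, metric = list_detect_2tx(Y, H, QPSK, ref=0)
```

In this two-antenna case the search reaches the same minimum metric as the exhaustive ML search while evaluating S rather than S^2 candidates, because for a fixed reference symbol the nearest constellation point to `z` minimizes the residual distance. The iterations of the loop do not depend on one another, which reflects the parallel structure noted above.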
The LDs may also suffer from some major drawbacks. In particular, we refer to the “parallel detection” (PD) algorithm [11] and the additional implementation details contained in [12]-[13]. They all suffer from a high computational complexity, as T O-DFE detectors acting on R×(T−1) sub-channel matrices have to be computed; this involves the computation of the related Moore-Penrose sub-channel pseudo-inverses. In [12]-[13] they are efficiently implemented through T complex “sorted” QR decompositions [23]-[24]; however, the overall complexity is still in the order of $O(T^4)$. As previously mentioned, a simplified suboptimal ordering method is included in [13] and [14]. In the case when all the possible constellation symbols are searched for a reference layer and the rest of the layers are detected through spatial DFE, such an ordering technique corresponds to selecting as reference layer the one characterized by the worst case post-detection SNR; then O-DFE is performed on the remaining layers. However, in [13] this criterion is only drafted as a possible simplification of the optimal layer selection algorithm, and neither its hardware complexity nor its performance is provided; [14] provides only one simulation plot for an uncoded 4×4 16QAM MIMO system, and its processing uses a complex-domain Cholesky decomposition of the channel matrix to compute its pseudo-inverse, which entails high complexity too. Finally, another major shortcoming in list based detection is, to the best of our knowledge, the absence of an algorithm to produce soft bit metrics for use in modern coding and decoding algorithms.
Finally, it shall be remarked that another important family of ML-approaching detectors is given by the lattice decoding algorithms, applicable if the received signal can be represented as a lattice [15]-[16], i.e., through a proper real-domain representation of discrete signals. The so-called Sphere Decoder (SD) [17]-[18] is the most widely known example for these detectors and can be utilized to attain hard-output ML performance with significantly reduced complexity.
However, the SD may suffer from some important disadvantages; most notably, it is not suitable for a parallel VLSI implementation, because it is an inherently serial detector: it spans the possible values for the I and Q PAM components of the QAM symbols successively. It should be noted that, in order to slightly increase the degree of achievable parallelism, the authors in [19] resort to a complex domain version of the SD algorithm.
A related issue is that the number of lattice points to be searched is non-deterministic, sensitive to the channel and noise realizations, and to the initial radius. This is not desirable for real-time high-data rate applications; an example is given by high-throughput Wireless LANs 802.11n, whose standard definition is ongoing [3].
Finally, generation of soft output metrics may not be easy with known lattice decoding procedures, because the need to reduce the size of the search before converging to the ML-approaching transmitted sequence is not always compatible with the need of finding a number of (selected) sequences in order to generate bit soft-output information.
Besides performance (the benchmarks being optimal ML detection at one extreme and linear MMSE and ZF at the other), at least four features are typically needed for a MIMO detection algorithm to be effective and implementable in next generation wireless communication systems:
    a reduced overall complexity;
    near optimal performance;
    the possibility to generate bit soft output values (or log-likelihood ratios, LLR, if in the logarithmic domain), as this yields a significant performance gain in wireless systems employing error correction codes (ECC);
    the capability of the architecture of the procedure to be parallelized, which is significant for an Application Specific Integrated Circuit (ASIC) implementation and also to yield the low latency often required by a real-time high-data rate transmission.