Any modern communication system, especially a wireless communication system, makes use of advanced forward error correction (FEC) schemes in order to provide acceptable performance in terms of packet error rate (PER) over fading communication channels. These schemes may include simple traditional convolutional codes and block codes, or a combination of both, or, more recently, turbo codes (either convolutional or block) and low-density parity-check (LDPC) codes. All of these coding schemes, though in different ways, process a bit-stream payload message (usually of some predefined length k, or one of a range of lengths {k_i}) to be transmitted and generate a new, longer message, or codeword, of length N, containing the original payload message and (N−k) additional parity bits computed as an encoding function of the original message. Accordingly, the total number of possible received words of length N is 2^N, whereas the number of different codewords that can be transmitted is only 2^k. Since 2^k<<2^N, the decoder exploits the additional knowledge provided by the redundant bits, indexed from k+1 to N, to improve the recovery of the sent payload.
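The gap between the number of valid codewords and the number of possible received words can be illustrated with a toy code. The sketch below assumes a single parity-check code with k=3 and N=4, which is chosen only for illustration and is not one of the schemes named above; the function name `encode` is likewise hypothetical.

```python
from itertools import product

K = 3          # payload length (toy value, assumed for illustration)
N = K + 1      # codeword length: payload plus one parity bit

def encode(payload):
    """Systematic encoding: the payload followed by one even-parity bit."""
    return tuple(payload) + (sum(payload) % 2,)

# Every payload maps to exactly one codeword, so there are 2^K valid codewords.
codewords = {encode(p) for p in product((0, 1), repeat=K)}

# Every binary word of length N could be received, so there are 2^N of them.
received_words = set(product((0, 1), repeat=N))

print(len(codewords), "valid codewords out of", len(received_words), "possible words")
```

A received word that falls outside the set of valid codewords reveals that an error occurred, which is the redundancy the decoder exploits.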
The aforementioned coding schemes differ in their encoding rules, decoding algorithms, and PER performance. One point common to all the algorithms implemented by modern decoders, regardless of the coding type, is that they are soft-input soft-output (SISO) based. In other words, each transmitted ‘hard’ bit of the codeword is represented in the receiver by a value with a bit width D>1, often called a ‘soft bit’. A ‘soft bit’ carries probabilistic information reflecting the likelihood that the corresponding ‘hard’ bit equals ‘1’ or ‘0’, and is usually computed as the logarithm of the likelihood ratio (LLR), ln [p(1)/p(0)], where ln represents the natural (base e) logarithmic function, and p(1) and p(0) are the probabilities for the bit to be ‘1’ or ‘0’, respectively. Any SISO FEC decoder receives an input stream of LLR values (soft bits) and produces a decoded stream of ‘hard’ bits that it believes to be equal to the transmitted payload message.
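The LLR definition above can be written out directly. The following is a minimal sketch; the function names `llr` and `hard_decision` are hypothetical, and a practical receiver would quantize the result to D bits rather than keep a floating-point value.

```python
import math

def llr(p1: float) -> float:
    """Return ln[p(1)/p(0)] for a bit whose probability of being '1' is p1."""
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    return math.log(p1 / (1.0 - p1))

def hard_decision(soft_bit: float) -> int:
    """Map a soft bit to a hard bit: a positive LLR favours '1'."""
    return 1 if soft_bit > 0 else 0

for p in (0.1, 0.5, 0.9):
    value = llr(p)
    print(f"p(1)={p}: LLR={value:+.3f}, hard bit={hard_decision(value)}")
```

The sign of the LLR gives the hard decision, and its magnitude expresses how reliable that decision is.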
Modern communication systems commonly employ Quadrature Amplitude Modulation (QAM). In this modulation, the bits to be transmitted are mapped to channel symbols in a modulation mapper, each group of bits to a distinct symbol. Each such symbol represents one of a preset number of possible states (hereinafter ‘M’), and is mapped onto a carrier signal. The number of bits included in each symbol equals the base-2 logarithm of the number of states in the constellation diagram of the modulation scheme (hereinafter k=log2(M)). For example, a symbol in the QPSK modulation scheme includes 2 bits, since QPSK allows for 4 states. Similarly, a symbol in the 16QAM scheme includes 4 bits, which is the base-2 logarithm of the 16 possible states, a symbol in the 64QAM scheme includes 6 bits, and a symbol in the 256QAM scheme includes 8 bits.
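The relation k=log2(M) for the schemes listed above can be checked with a short sketch; the function name `bits_per_symbol` is a hypothetical helper introduced only for this illustration.

```python
def bits_per_symbol(num_states: int) -> int:
    """Return log2(M) for a constellation with M states (M a power of two)."""
    if num_states < 2 or num_states & (num_states - 1):
        raise ValueError("number of states must be a power of two, at least 2")
    return num_states.bit_length() - 1

for name, m in [("QPSK", 4), ("16QAM", 16), ("64QAM", 64), ("256QAM", 256)]:
    print(f"{name}: {bits_per_symbol(m)} bits per symbol")
```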
Typically, mapping k=log2(M) bits to an M-QAM symbol is integrated with Bit-Interleaved Coded Modulation (BICM), where the k bits are interleaved in some way. Following the interleaving, half of the k bits are mapped onto the real component of the symbol, while the other half are mapped onto the imaginary component. In this way, each of the k bits modulates only one of the components, either in-phase or quadrature. In the receiver, the received symbol is converted to a total of k LLR values, each corresponding to one of the k transmitted bits. Most of the demodulation processing in the receiver runs at the symbol rate, which is k times slower than the bit rate. The situation changes at the point where the demodulated soft symbols are converted into soft bits, which must be produced at the bit rate. Thus, to deliver payload at high data rates, the receiver must have substantial computing capability at this stage. For example, if the required output payload is 600 Mbit per second, the receiver must compute at least 600 Mega LLR values per second. However, since FEC redundant bits are appended to each codeword, the actual rate of LLR values to be computed grows by a factor equal to the inverse of the coding rate. For example, if each block of payload bits is appended with twice as many redundant bits, the coding rate is R=⅓. Thus, if the net throughput is 600 Mbps, the gross bit rate becomes 1.8 Gbps if all the bits are transmitted over the air, and the receiver must produce LLR values at this rate. In practice, though, a very low coding rate usually implies bad link conditions, under which maximum throughput cannot be achieved. Yet even at moderate-to-high coding rates, for example ¾<R<1, a net throughput of 600 Mbps requires computing 600-800 Mega LLR values per second.
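The conversion of one received symbol into k LLR values, with each bit depending on only one component, can be sketched for 16QAM as follows. The sketch makes several assumptions that the text does not specify: a hypothetical Gray labelling of the four amplitude levels on each component, additive Gaussian noise with a known per-component variance `noise_var`, and the max-log approximation in place of the exact LLR. The function names are illustrative only.

```python
# Hypothetical Gray labelling of the four amplitude levels on one component.
# This mapping is an assumption for illustration; the text does not fix one.
PAM4_LABELS = {-3.0: (0, 0), -1.0: (0, 1), 1.0: (1, 1), 3.0: (1, 0)}

def pam4_llrs(y: float, noise_var: float) -> list:
    """Max-log LLRs, ln[p(1)/p(0)], for the two bits carried by one component."""
    llrs = []
    for bit_index in range(2):
        # Squared distance to the nearest level whose label has this bit = 0.
        d0 = min((y - s) ** 2 for s, b in PAM4_LABELS.items() if b[bit_index] == 0)
        # Squared distance to the nearest level whose label has this bit = 1.
        d1 = min((y - s) ** 2 for s, b in PAM4_LABELS.items() if b[bit_index] == 1)
        llrs.append((d0 - d1) / (2.0 * noise_var))
    return llrs

def qam16_llrs(symbol: complex, noise_var: float) -> list:
    """Convert one received 16QAM symbol into k = 4 LLR values."""
    return pam4_llrs(symbol.real, noise_var) + pam4_llrs(symbol.imag, noise_var)

print(qam16_llrs(complex(3.0, -3.0), noise_var=1.0))
```

Because the real and imaginary components are processed independently, each symbol costs two small searches rather than one search over all 16 states, yet the output is still k values per symbol, which is why this stage runs at the bit rate.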
Calculation of each LLR value involves a set of instructions, such as an arithmetic instruction, a logic instruction, a data instruction, or a control flow instruction, each of which is represented by a number or a sequence of numbers. Typically, computing each LLR value requires 10 instructions or more. At a rate of 600 to 800 Mega LLR values per second, the processor carrying out these instructions must therefore perform between 6 and 8 Giga instructions per second. Due to the very high bit rate of the data being transmitted, dedicated hardware is typically employed to implement these LLR calculations.
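The throughput figures quoted in the text follow from two multiplications, reproduced in the sketch below. The function names are hypothetical, and the figure of 10 instructions per LLR value is the lower bound stated in the text, not a measured value.

```python
from fractions import Fraction

def llr_rate(net_bps: int, coding_rate: Fraction) -> Fraction:
    """LLR values per second: the net bit rate divided by the coding rate R."""
    return net_bps / coding_rate

def instruction_rate(llrs_per_second, instructions_per_llr: int = 10):
    """Instructions per second needed to compute LLR values at the given rate."""
    return llrs_per_second * instructions_per_llr

NET_BPS = 600_000_000  # 600 Mbps net throughput, as in the text

for rate in (Fraction(1, 3), Fraction(3, 4), Fraction(1)):
    gross = llr_rate(NET_BPS, rate)
    print(f"R={rate}: {float(gross) / 1e6:.0f} Mega LLR/s, "
          f"{float(instruction_rate(gross)) / 1e9:.0f} Giga instructions/s")
```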
In order to improve the reliability of wireless links, hybrid automatic repeat request (HARQ) combining has recently been widely adopted in the industry. HARQ combining is a key technology in next generation wireless systems that spans both MAC and PHY layers, and exploits time/frequency diversity and coding gain. In the HARQ combining scheme, incorrectly received codewords are stored at the receiver rather than discarded, and when the retransmitted codeword is received, the two words are combined. While two given transmissions may each fail to decode error-free when decoded independently, the combination of all the erroneously received transmissions may provide enough information to decode the message correctly. There are two main methods of re-combining in HARQ:

Chase combining: every retransmission contains the same information (data and parity bits) and contributes more signal power;

Incremental redundancy: every retransmission contains some information that differs from the previous one. At every retransmission, the receiver gains knowledge of extra information.

However, HARQ combining requires an additional set of instructions, which increases the need for strong processing abilities in the processor.
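Both combining methods can be expressed as accumulation into a stored soft buffer, as in the minimal sketch below. It assumes that the transmissions experience independent noise, so that the LLR values of the same codeword bit add; the function name `harq_combine`, the buffer layout, and the example values are hypothetical.

```python
def harq_combine(soft_buffer: list, llrs: list, positions: list) -> list:
    """Add newly received LLR values into the stored buffer, in place.

    positions[i] is the codeword bit index that llrs[i] refers to.
    """
    if len(llrs) != len(positions):
        raise ValueError("one position is required per LLR value")
    for pos, value in zip(positions, llrs):
        soft_buffer[pos] += value
    return soft_buffer

# Codeword of 6 bits; an LLR of 0.0 means nothing is known about a bit yet.
soft_buffer = [0.0] * 6

# First transmission carries bits 0..3.
harq_combine(soft_buffer, [1.0, -0.5, 2.0, -1.0], [0, 1, 2, 3])

# Retransmission carries bits 2..5: bits 2 and 3 are repeated (as in Chase
# combining), bits 4 and 5 are new (as in incremental redundancy).
harq_combine(soft_buffer, [0.5, -1.0, 1.5, -2.0], [2, 3, 4, 5])

print(soft_buffer)
```

When every retransmission covers the same positions, the sketch reduces to pure Chase combining; when the positions do not overlap, it reduces to pure incremental redundancy.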
In addition, in modern communication protocols, such as, but not limited to, the 3GPP LTE standard, both HARQ combining approaches are applied dynamically. Receivers implementing such protocols must support a very flexible retransmission and rate matching algorithm, in which every retransmission version of a codeword can consist of both already transmitted bits and bits that are transmitted for the first time. A requirement for flexibility when utilizing HARQ combining thus arises, which would typically be met by a software implementation. On the other hand, since the LLR computation and HARQ combining are performed on soft bits rather than on soft symbols, the receiver must be configured to sustain high bit rates. Traditionally, the ability to sustain higher bit rates is provided by a hardware-oriented implementation, which, compared to a software-based solution, is much more efficient in terms of required silicon area and consumed power, at the expense of flexibility.
Accordingly, there is a long-felt need for a flexible solution for high rate LLR computation, and it would be highly desirable to have such a solution that also allows HARQ combining and additional calculations.