In WCDMA receivers channel estimations are usually performed by measuring one or more pilot channels, wherein known data is transmitted. Under certain channel conditions, the channel estimates that result from measuring pilot channels are no longer precise enough for channel demodulation. Additional techniques have been created to estimate the channels more accurately so as to create high performance receivers. One such way to increase the precision of channel estimates is to base the channel estimations on the covariance (or correlation) between different fingers (delayed signal components) that are received on the data channels. In HSDPA, up to 15 data channels are transmitted at the same time. As such, an opportunity to compute the covariance between the different channels in order to improve channel estimates is available. It is well understood that computing a so-called covariance matrix for 15 data channels is a computationally complex and intensive task that utilizes a significant amount of memory, read and write power as well as microprocessor computational power in a battery powered computation device, such as a mobile communication device.
In the wide band CDMA standard, multiple channels are sent in parallel. These channels are both control channels (including pilot channels) and data channels, but for now we are discussing the data channels. Each data channel is identified by its own spreading code. By despreading the incoming data stream with the correct spreading code, one can extract the particular data of the channel of interest. In an HSDPA (High Speed Downlink Packet Access) system, which is the high speed data extension of the original wide band CDMA standard, there are 15 data channels, wherein there is one spreading code for each of the 15 channels and all the channels are transmitted in parallel. An efficient technique for despreading all the coded channels from the received signals is to use a Hadamard despreader, also interchangeably referred to as a fast Hadamard transform (FHT). The FHT or Hadamard despreader is an efficient implementation of the process of despreading 16 symbols with 16 orthogonal codes of length 16. The Hadamard despreader or FHT generates 16 outputs, but it is understood that an FHT can have various sizes to produce more or fewer than 16 outputs. In HSDPA, only up to 15 spreading codes are used, thus only 15 of the 16 Hadamard despreader outputs are needed.
A spreading code in the context herein is a vector of length 16, where every element is a complex value from the set (−1−j, −1+j, 1−j, 1+j). Furthermore, for the spreading and despreading to work properly, the spreading codes that are used are mutually orthogonal. There are 16 of these orthogonal codes of length 16; 15 of them are used in HSDPA.
In order to compute a covariance matrix from despread HS-PDSCH (High Speed Physical Downlink Shared Channel) data, one current solution uses a Hadamard despreader that executes sequentially on every rake finger, wherein each rake finger is essentially an offset in the received signal's sample stream. An intermediate symbol buffer (ISB) is used to buffer the despread data, which will ultimately be provided to a correlator core that computes the correlation between rake-finger combinations. The rake finger is part of a rake receiver, which is a radio receiver designed to counter the effects of multipath fading. The rake receiver does this by using several subreceivers called “fingers”. When a signal is transmitted, it often reflects off of buildings and other obstacles, creating multipath fading. A rake receiver receives the original signal distorted by many slightly delayed copies of the original signal. The different delayed versions of the signal are referred to as “delayed paths”. While the main component of a signal starts at a certain position in a sample stream, the signal is also present at slightly later points in the sample stream. These later components are sometimes weaker and, when there is no line of sight, may be at about the same amplitude or stronger. These delayed paths carry some of the transmitted energy. In order to get the best reception quality possible, a receiver may be designed to capture as much of the transmitted energy as possible. A rake receiver is a type of receiver that tries to accomplish this in a very direct way. Essentially, the rake receiver rakes in all the signal energy that it can by processing (despreading) the signal at all the different offsets where a strong delayed path is present and then combining (using weighted addition) the results of these different offsets into one improved result. A “rake finger” is basically a part of a receiver that receives one of a plurality of strong delayed signal paths.
Referring to FIG. 1 a despreading core that includes a Hadamard despreader 100 is depicted. 16 samples are input at the top of the Hadamard despreader. These 16 samples may originate from an A to D converter connected to the receiving antenna. The 16 input samples may also be referred to as chips, which is a name used in the art for the basic unit of transmission in WCDMA. For each code that one wants to extract data from, there is a correlation performed to a code of length 16 in order to get out one symbol. If there is an interest in all 15 codes, then a normal despreader procedure is repeated 15 times. A faster technique may be to utilize a fast Hadamard transform (a FHT), which reuses partial expressions from the 16 input samples and performs multiplication, addition and subtraction on such samples in order to produce 15 output symbols at the bottom. Thus, a fast Hadamard transform is an efficient way of correlating all 15 codes for HSDPA in one procedure. The output 102 comprises 15 symbols. Thus, sets of 15 symbols (each associated with a different code) are sequentially output from a Hadamard despreader.
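The reuse of partial expressions described above can be illustrated with a minimal Python sketch of a fast Hadamard transform. The sketch assumes real-valued ±1 codes taken from the rows of a 16×16 Sylvester-type Hadamard matrix, which is a simplification of the complex-valued code elements described herein; the names fht, hadamard_row and despread_direct are hypothetical and are not taken from FIG. 1.

```python
def fht(samples):
    """Fast Hadamard transform in natural (Sylvester) order.

    Output k equals the correlation of the input with Hadamard row k.
    The input length must be a power of two (16 in the example herein).
    """
    out = list(samples)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Each stage combines pairs of partial sums (one add, one subtract).
        for start in range(0, n, 2 * h):
            for k in range(start, start + h):
                a, b = out[k], out[k + h]
                out[k], out[k + h] = a + b, a - b
        h *= 2
    return out


def hadamard_row(index, n=16):
    """Row `index` of the assumed n x n Sylvester Hadamard matrix (+1/-1 entries)."""
    return [(-1) ** bin(index & col).count("1") for col in range(n)]


def despread_direct(samples, code):
    """Conventional despreading: correlate the chips against one code."""
    return sum(s * c for s, c in zip(samples, code))


# One complex symbol spread with code 5 yields 16 chips.
symbol = 3 - 2j
chips = [symbol * c for c in hadamard_row(5)]
outputs = fht(chips)  # all 16 code correlations from one pass
```

In this sketch, outputs[5] carries 16 times the transmitted symbol and the other outputs are zero, which matches what despread_direct returns when it is repeated once per code. For 16 chips the transform uses 4 stages of 8 add/subtract pairs, whereas repeating the conventional despreader for all 16 codes takes 16×16 multiply-accumulate steps.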
The 15 symbol outputs 102 produced by the Hadamard despreader every clock cycle are written into an intermediate symbol buffer (ISB). Logically this is a very wide memory of (12+12)*15(codes)*2(symbols)*2(carriers)=1440 bits. For this example, the wide memory is thus 1440 bits wide in order to provide enough throughput to the correlator core(s). Other configurations of this type may require even wider memories or somewhat narrower memories depending on the size and number of codes, symbols and carriers to be correlated. FIG. 2 depicts an existing example of an ISB 30. The rake input 32 may be equivalent to the symbol output 102 of the Hadamard despreader. The rake input 32 provides 15 complex values (one value for each code, and every value may be 12+12 bits). The 12+12 bits are associated with the precision of the real and imaginary parts of the output symbols coming from the Hadamard despreader. The 12+12 number of bits can be lower or higher depending on the implementation (e.g., between about 8+8 and 16+16 bits). Thus, in the example shown, for each clock cycle a symbol that is 12+12 bits is provided for each of the 15 codes, which totals 360 bits that are output by the Hadamard despreader and provided to the intermediate symbol buffer 30 via the rake 32 every clock cycle.
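The width arithmetic above can be checked with a short Python sketch. The function name isb_word_bits and its parameter names are hypothetical and are used only for illustration; the default values are the ones given in this example.

```python
def isb_word_bits(bits_real=12, bits_imag=12, codes=15, symbols=2, carriers=2):
    """Logical ISB word width in bits for the configuration described above."""
    return (bits_real + bits_imag) * codes * symbols * carriers


# One despreader output: 15 codes, one symbol, one carrier.
despreader_output_bits = isb_word_bits(symbols=1, carriers=1)  # 360

# Full ISB word: 2 symbols for each of 2 carriers.
full_word_bits = isb_word_bits()  # 1440

# Width at the lower and upper precisions mentioned (8+8 and 16+16 bits).
narrow_word_bits = isb_word_bits(bits_real=8, bits_imag=8)    # 960
wide_word_bits = isb_word_bits(bits_real=16, bits_imag=16)    # 1920
```

Under these assumptions the ISB word is four times as wide as one 360 bit despreader output, and the word width scales linearly with the chosen symbol precision.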
A selector 34 is used to map the symbols received from the rake 32 into a very wide memory space 36. The memory space 36 is, for example, 4 times as wide as the 360 bit symbols, which equals 1440 bits wide. The selector 34 determines where in the 1440 bit memory space each of the 360 bit symbols should be written. In the depicted mapping of FIG. 2, for one rake finger, 2 subsequent symbols for a first carrier are stored next to each other in a first memory 38. The first memory has a bit width of 720 bits. Meanwhile, for another finger, 2 subsequent symbols of a second or different carrier are stored next to each other in a second memory 40 that is also 720 bits wide. Thus, in this example, two different carriers (carrier 0 and carrier 1) are being processed simultaneously (note that other similar configurations may operate with only one carrier or with more than two carriers). Here, with two carriers, there is a carrier 0 memory bank 38 and a carrier 1 memory bank 40 for writing the symbols for each of the two carriers in an organized manner or matrix. Thus, when the rake 32 provides an output for carrier 0, symbol n, finger m, that data is stored in a first 360 bit memory location 42. Then, when another output for carrier 0, symbol n+1, finger m is provided by the rake 32, the selector 34 directs the output to be stored in memory location 44. This organized storage process is also performed for carrier 1 data provided by the rake 32. Thus, the selector 34 directs all the symbol data provided by the rake 32 such that it is written into a designated memory location in the ISB buffer where the data waits to be correlated.
In this example there are six symbols (n to n+5) stored for each carrier, for each finger. Furthermore there are 48 fingers (m to m+47) for which six symbols are stored in an organized fashion within memory space 36. It is understood that for this example the number of fingers, number of carriers, and the size of the data are all implementation choices that are selected by one of ordinary skill in the art. Here, buffering 6 symbols is done because the correlator cores consume data at a faster rate from the ISB 30 than the Hadamard despreader 100 produces data for the ISB 30. Since the consumption of data by the correlator cores cannot overtake the production of data from the Hadamard despreader 100, some amount of data needs to be buffered into the ISB 30 before the correlator cores start to process it. In this example, it was determined that six symbols of data need to be buffered before the correlator cores started processing it. In this implementation, the number of codes, being 15, is directly related to the communication standard for HSDPA.
The memory space 36 operates as an integral part of the symbol buffer 30 such that the rake 32 provides data into the memory space at one rate while the correlator (not specifically shown) is emptying or reading the data from all the next memory locations of the memory space 36 at another rate. Writing and reading all the memory locations of this prior art ISB 30 configuration at a clock rate of about 208 megahertz is an enormous energy drain on a mobile device's energy source or battery used in, for example, a mobile communication device.
FIG. 3 provides a correlation formula that is the basis for correlating signals from a pair of rake fingers. Also referring back to FIG. 2, the first memory 38 and second memory 40 act as buffers wherein symbol pairs “effectively” move from the rake 32, through the selector 34, and are written next to each other in designated memory locations in the memory space 36. The pairs of symbols are then read in the order of being stored (i.e., FIFO) by the appropriate correlator core, which performs a covariance calculation using the correlation formula of FIG. 3 for each pair of symbols.
In the correlation formula of FIG. 3, Rd is the name given to a full correlation matrix that is indexed using f1 and f2, wherein f1 and f2 represent a first finger and a second finger that are being correlated with each other. The C is the set of active codes (wherein in this example there are 15 codes, but not all of them may be active); the c is a code number that is used in the summation of the active codes; gf1 is the output of the Hadamard computation for the first finger; the i is the symbol number; gf2 is the output of the Hadamard computation for the second finger, which is being correlated with gf1. The correlation formula is a double summation wherein one summation is done for each active code in C and the second summation is done 160 times, which reflects the number of symbols in each HSDPA slot. The correlation formula of FIG. 3 is performed for each element in the correlation matrix. This example matrix of FIG. 2 is a 48×48 matrix, relating to the 48 fingers of the rake receiver that are used to calculate the mutual correlations between all 48 individual fingers' symbol streams. Because the matrix need only be computed for each unordered pair of fingers, this means that there are (48×49)/2=1176 matrix elements. Thus, there are 15 codes*160 symbols=2,400 operations for each element in the matrix, which is an enormous number of computations (close to 3 million computations) necessary to compute the mutual correlation between all fingers, for all codes, for one carrier, for example to correlate all the symbols provided from the first memory 38 of the ISB to correlator core 0. The matrix elements are calculated every WCDMA slot. Thus, since the slot rate in WCDMA is 1500/sec, this amounts to a computational load of up to (48*49)/2*(15*160*2*1500)=about 8.5 Giga complex multiplications per second for 2 carriers. The same is also true for the second memory 40, which provides the intermediate symbol buffered information of a second carrier to correlator core 1.
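The double summation and the resulting computational load can be sketched in Python as follows. The sketch follows the textual description of the formula (the product of gf1 with the conjugate of gf2, summed over the active codes and over the symbols of a slot) rather than FIG. 3 itself; the function names and the nested-list data layout are assumptions made only for illustration.

```python
def correlate(g_f1, g_f2, active_codes):
    """One matrix element Rd(f1, f2).

    g_f1[i][c] and g_f2[i][c] are the despread symbols of the two fingers
    for symbol i and code c (assumed layout: list of per-symbol lists).
    """
    total = 0j
    for sym_f1, sym_f2 in zip(g_f1, g_f2):        # summation over symbols i
        for c in active_codes:                     # summation over active codes
            total += sym_f1[c] * sym_f2[c].conjugate()
    return total


def matrix_elements(fingers=48):
    """Number of unordered finger pairs, including each finger with itself."""
    return fingers * (fingers + 1) // 2


def multiplies_per_slot(fingers=48, codes=15, symbols=160):
    """Complex multiplications per slot for one carrier, all codes active."""
    return matrix_elements(fingers) * codes * symbols


def multiplies_per_second(fingers=48, codes=15, symbols=160,
                          carriers=2, slots_per_second=1500):
    """Complex multiplications per second for the stated number of carriers."""
    return multiplies_per_slot(fingers, codes, symbols) * carriers * slots_per_second


elements = matrix_elements()          # 1176
per_slot = multiplies_per_slot()      # 2,822,400
per_second = multiplies_per_second()  # 8,467,200,000
```

With the example values this gives 1176 matrix elements, about 2.8 million complex multiplications per slot for one carrier, and about 8.5 Giga complex multiplications per second for two carriers, consistent with the figures stated above.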
Referring now to FIG. 4, a block diagram of a correlation core having a first correlation block 50 is depicted. The first correlation block 50 performs the summation of the active codes in the correlation formula of FIG. 3, but only for one symbol. Therefore the accumulator 52 accumulates the summation at the addition point 62 for 160 symbols.
The g(i,c) is multiplied by a 0 or a 1 via the active indicator m(c)=1 or 0 56, which indicates whether the symbol g(i,c) 54 is associated with an active or inactive code. If the code is active, g(i,c) is multiplied by a 1. If the code is inactive, g(i,c) is multiplied by a 0. The result is then multiplied 58 by the conjugate of gf2(i, c) 60. This method is repeated and the results are added for the 15 codes (i.e., up to gfx(i, c+14)). Thus, at the bottom of the block, the addition element 62 accumulates 160 times into the accumulator 52.
Some correlation cores may include a second correlation block 64 as part of, for example, correlator core 0, so that the second correlation block correlates a second symbol for carrier 0. Correlation blocks 50 and 64 together correlate 2 symbols for carrier 0, so that it takes 80 iterations to correlate 160 symbols (instead of 160 iterations). The second correlation block 64 provides a result 66, which is added to the first correlation block's output at addition element 62. In this manner, the dual correlation core blocks 50 and 64 process two symbols in one clock cycle such that 160 symbols are accumulated using the addition element 62, which only accumulates 80 times (instead of 160 times or 160 clock cycles). The single or dual correlator core blocks process symbols in parallel while masking out codes (using the 0 or 1 multiplier in the m(c) to m(c+14) elements of a correlation core). The masked codes have already been written to and read from the ISB 30, but since they have been deemed inactive or unnecessary in the correlation calculation, these codes are masked from the correlation process in the correlation cores.
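The masking and dual-block accumulation described above can be sketched in Python as follows. This is a behavioral illustration only, not a description of the hardware of FIG. 4; the function names, the list-based data layout and the test values are hypothetical, and an even number of symbols is assumed.

```python
def correlation_block(sym_f1, sym_f2, mask):
    """One correlation block: masked sum over the codes for a single symbol.

    mask[c] is 1 for an active code and 0 for an inactive code, so inactive
    codes are still fetched but contribute nothing to the sum.
    """
    return sum(m * a * b.conjugate() for m, a, b in zip(mask, sym_f1, sym_f2))


def dual_core_accumulate(g_f1, g_f2, mask):
    """Accumulate one matrix element using two blocks per iteration."""
    acc = 0j
    iterations = 0
    for i in range(0, len(g_f1), 2):
        first = correlation_block(g_f1[i], g_f2[i], mask)            # block 50
        second = correlation_block(g_f1[i + 1], g_f2[i + 1], mask)   # block 64
        acc += first + second                                        # addition element
        iterations += 1
    return acc, iterations


# 160 symbols of 15 codes each for both fingers; 10 of the 15 codes are active.
slot_f1 = [[1 + 1j] * 15 for _ in range(160)]
slot_f2 = [[1 + 1j] * 15 for _ in range(160)]
active_mask = [1] * 10 + [0] * 5
result, cycles = dual_core_accumulate(slot_f1, slot_f2, active_mask)
```

In this sketch the accumulation completes in 80 iterations for 160 symbols, and all 15 code values of every symbol are still passed to each block even though 5 of them are multiplied by 0, which mirrors the point made above that masked codes are nevertheless written to and read from the ISB.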
Thus, for every symbol associated with a code and finger written to and read from the ISB 30 there are 15 versions from the 15 codes, but the prior existing system can only mask codes determined to be unnecessary for a correlation calculation in the correlation core blocks.
Again, in order to perform the correlation computation once for one carrier (using a dual correlation core comprising correlation blocks 50 and 64), correlation block 50 takes 15 inputs (1 symbol per code) of 12+12 bits, which equals 360 bits in total. Correlation block 64 accepts the same number of bits for a second symbol. Correlation blocks 50 and 64 together accept 30 symbols of input (2 symbols per code) totaling 720 bits.
If there are two carriers, then the dual correlation core blocks 50 and 64 are duplicated to establish two dual correlation cores, 50 and 64 and 50′ and 64′ (not specifically shown), wherein 60 symbols, or 1440 bits, are consumed as input from the intermediate symbol buffer 30. It is important to understand that the inputs 54 (finger 1) change every clock cycle, while inputs 60 (finger 2) stay constant. Referring to dual correlation blocks 50, 64, in a first clock cycle two symbols for finger combination (1,1) are correlated. In the next clock cycle, symbol 2 for finger combination (1,2) is correlated; then in the next clock cycle (1,3); then (1,4) and so on. Again, since the first finger stays constant, inputs 60 do not change. Only inputs 54 are read from the ISB, which is 15 symbols for correlation block 50 (360 bits) and 15 symbols for correlation block 64 (360 bits).
The drawbacks of the above existing technique for performing correlation or covariance computations are various. First, there is the problem of time and timing. That is, an entire matrix must be calculated within the time of one slot of 160 symbols. A slot in WCDMA consists of 2560 chips. With a spreading factor (SF) of 16, this means that a slot contains 2560÷16=160 symbols per code. To meet this deadline, two symbol pairs must be provided to the correlation cores (720 bits+720 bits=1440 bits, or 60 symbols) each clock cycle by the intermediate symbol buffer 30 just to keep up with the correlation cores' computation rate. The speed and movement of the data is very important here because the data is being used for channel estimation, wherein the result of the channel estimation computations is most valuable if it is determined before the channel changes. It is important to have the channel estimation in time, before the estimation becomes irrelevant due to channel changes, which occur regularly in mobile communication devices.
When, for example, the correlation core's clock frequency is about 208 MHz, then 1440 bits must be read from the very wide 1440 bit memory of the intermediate symbol buffer 30 every clock cycle, i.e., about every 4.8 nanoseconds, which is a large power drain on a mobile device's battery system. By calculating the amount of energy required to read 1440 bits from a memory and then multiplying that number by the clock rate of 208 MHz, it was determined that the process required too large of a mW drain on a portable communication device's battery for performing just this correlation computation operation. What is needed is an intermediate symbol buffer design and method for writing symbols associated with spreading codes, provided by a Hadamard despreader, into an intermediate symbol buffer and reading them from the same intermediate symbol buffer in a manner that consumes significantly less power than the pre-existing techniques.
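The timing figures above can be worked through in a short Python sketch. The function names are hypothetical; the 208 MHz clock, the 1440 bit word, the 2560 chips per slot and the spreading factor of 16 are the example values given herein. No energy-per-read value is assumed, since none is stated.

```python
def clock_period_ns(clock_hz):
    """Duration of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def read_bandwidth_bits_per_s(word_bits, clock_hz):
    """Bits read per second when one full word is read every clock cycle."""
    return word_bits * clock_hz


def symbols_per_slot(chips_per_slot=2560, spreading_factor=16):
    """Symbols per code in one WCDMA slot."""
    return chips_per_slot // spreading_factor


period_ns = clock_period_ns(208e6)                    # about 4.8 ns
bandwidth = read_bandwidth_bits_per_s(1440, 208e6)    # about 3.0e11 bits per second
slot_symbols = symbols_per_slot()                     # 160
```

Under these assumptions, reading one 1440 bit word on every cycle of a 208 MHz clock corresponds to roughly 300 Gbit per second of read traffic from the ISB, which is the quantity that the energy-per-read calculation described above would be multiplied against.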