Long term evolution (LTE) and other radio communications technologies can require significant infrastructure and configuration. Generally, network operators test various aspects of their network equipment to ensure reliable and efficient operation. Network operators typically simulate various conditions before equipment is deployed in a live network to decrease avoidable delays and/or other problems.
Various technical specifications, such as the 3rd Generation Partnership Project (3GPP) Technical Specifications 36.211, 36.212, 36.213, and 36.214, hereinafter respectively referred to as “TS 36.211”, “TS 36.212”, “TS 36.213”, and “TS 36.214”, define aspects of LTE communications. Generally, data from the network to a user device is referred to as downlink data and data from the user device to the network is referred to as uplink data. For example, user equipment (UE), such as a cellular mobile phone, a laptop, or another user device, may communicate with an enhanced or evolved Node B (eNode B) via the cellular radio transmission link. Data that is sent from the eNode B to the user device is downlink data, and data that is sent from the user device to the eNode B is uplink data.
FIG. 1 shows a conventional LTE system in which an enhanced Node B (ENB) 100 communicates with an LTE user equipment (UE) 102. UE 102 communicates with ENB 100 via a radio frequency input/output interface 104. The signals from interface 104 are decoded and processed by a common public radio interface (CPRI) processor 106, which processes both downlink data, i.e., data from ENB 100 to the UE, and uplink data, i.e., data from the UE to ENB 100. Downlink data undergoes downlink processing 108 on its way to a media access control (MAC) layer 110. Uplink data provided by MAC 110 undergoes uplink processing 112 on its way to CPRI 106.
Uplink and downlink data include separate channels defined in the physical layer of the protocol stack, herein referred to as “physical channels.” During both downlink processing 108 and uplink processing 112, the data transmitted via the physical channels may be processed by separate physical circuits, or may be processed by the same circuit but as distinct logical channels or entities.
Two of the physical channels processed during downlink processing 108 are the physical downlink shared channel (PDSCH) and a physical downlink control channel (PDCCH), which conveys downlink control information (DCI) to UE 102. ENB 100 uses PDCCH to indicate to each UE what scheduled radio resources for uplink and downlink are available to that UE. DCI data is used to specify the resources (e.g., frequencies, time slots, etc.) that ENB 100 is allowing the UE to use for uplink and downlink, which is referred to as “grant” information. Depending on how much data the UE wants to send or receive, how many other UEs are trying to access the same eNode B, and other factors, the terms of the grant can and usually do change at every transmit time interval, or TTI. Other physical channels and signals include the physical control format indicator channel (PCFICH), the physical broadcast channel (PBCH), the primary synchronization signal (PSS), the secondary synchronization signal (SSS), and at least one reference signal (RS).
Two of the physical channels processed during uplink processing 112 are the physical uplink shared channel (PUSCH) and a physical uplink control channel (PUCCH), which conveys uplink control information (UCI) to ENB 100. UCI data includes scheduling requests and acknowledgement responses or retransmission requests (ACK and NACK). However, PDCCH with a DCI format used to grant PUSCH transmissions, as given by DCI format 0, is referred to as the “uplink DCI” format when common behavior is addressed. Other physical channels and signals include the sounding reference signal (SRS) and the demodulation reference signal (DMRS). All of the physical channels are mapped on an orthogonal frequency-division multiplexing (OFDM) resource grid made up of resource elements (frequency) and OFDM symbols and slots (time).
FIGS. 2A and 2B illustrate two portions of conventional LTE uplink processing 112, referred to as “part 1” and “part 2”, respectively. Referring to FIG. 2A, uplink data is provided by MAC 110 in groups of data called transport blocks. The size of the transport block (TB) provided by MAC 110 is defined or determined by the grant information received from ENB 100. At step 200, a transport block cyclic redundancy check (TB CRC) is calculated and attached to the transport block. At step 202, the transport block and CRC are segmented into multiple code blocks and distributed for parallel processing. At step 204, a CRC value is calculated for and attached to each of the code blocks, which are then channel encoded (step 206), subjected to a sub-block interleave (step 208), and then rate matched (step 210). The steps of channel encoding, sub-block interleaving, and rate matching are of interest and are therefore logically grouped into a collection of steps 212. At step 214, the outputs of rate matchers 210 are concatenated and sent to step 216, where the data is multiplexed with uplink control information (UCI) that had been encoded in step 218. The multiplexed data then goes to a channel interleaving step 220. FIG. 2B shows a second portion of the process, which includes a scrambling step 222, a modulation mapping step 224, a transform pre-coding step 226, a resource element mapping step 228, and an SC-FDMA signal generation step 230.
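The segmentation at step 202 can be illustrated with a minimal Python sketch. It assumes the segmentation rule of TS 36.212, section 5.1.2, as commonly summarized: a 24-bit TB CRC, a maximum code block size of 6144 bits, and a 24-bit per-code-block CRC that is added only when more than one code block is needed. The function name and the 75376-bit transport block are illustrative choices, not values taken from the figures.

```python
import math

MAX_CODE_BLOCK_BITS = 6144   # maximum code block size
CRC_BITS = 24                # assumed length of both the TB CRC and the code block CRC

def num_code_blocks(tb_bits):
    """Number of code blocks produced for a transport block of tb_bits bits."""
    total = tb_bits + CRC_BITS              # step 200: TB CRC attached
    if total <= MAX_CODE_BLOCK_BITS:
        return 1                            # fits in one code block, no extra CRC
    # Each code block reserves CRC_BITS for its own CRC (step 204).
    return math.ceil(total / (MAX_CODE_BLOCK_BITS - CRC_BITS))

print(num_code_blocks(75376))
```

Under these assumptions, a transport block that just exceeds one code block is split into two, and larger blocks grow the count roughly in steps of 6120 payload bits.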
FIG. 3 illustrates the format of one LTE radio frame. Each downlink or uplink LTE radio frame may be 10 milliseconds (ms) long, comprising 10 sub-frames of 1 ms each. Each sub-frame may include 2 slots and/or 14 OFDM symbols. A slot may be 0.5 ms long and may include various amounts of LTE data. LTE data may be stored as modulated symbols in sub-carriers within an OFDM symbol. Each modulated symbol in a sub-carrier may typically represent two, four, or six bits. Sub-carriers may be data streams that are spaced 15 kilohertz apart from each other. A sub-carrier may typically carry data at a maximum rate of 15 kilo-symbols per second (ksps). In some embodiments, an LTE downlink sub-frame may typically include multiple resource blocks (RBs) of 12 sub-carriers, each sub-carrier with 14 OFDM symbols. The LTE downlink sub-frame may be partitioned into two equal downlink slots. Each downlink slot may include multiple blocks of 12 sub-carriers with 6 or 7 symbols per sub-carrier (e.g., depending on whether the frame uses an extended cyclic prefix or a normal cyclic prefix).
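The frame arithmetic above can be checked with a short sketch. The constant and function names are illustrative, the normal cyclic prefix (14 symbols per sub-frame) is assumed, and the result is a raw upper bound that ignores reference signals and control overhead.

```python
# Values taken from the description of FIG. 3; names are hypothetical.
SUBFRAMES_PER_FRAME = 10
SLOTS_PER_SUBFRAME = 2
SYMBOLS_PER_SUBFRAME = 14     # normal cyclic prefix assumed
SUBCARRIERS_PER_BLOCK = 12

def slots_per_frame():
    return SUBFRAMES_PER_FRAME * SLOTS_PER_SUBFRAME

def symbols_per_block_subframe():
    """Modulated symbols in one 12-sub-carrier block over one sub-frame."""
    return SUBCARRIERS_PER_BLOCK * SYMBOLS_PER_SUBFRAME

def raw_bits_per_block_subframe(bits_per_symbol):
    """Raw bits for 2, 4, or 6 bits per modulated symbol."""
    return symbols_per_block_subframe() * bits_per_symbol

for bits in (2, 4, 6):
    print(bits, raw_bits_per_block_subframe(bits))
```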
FIG. 3 depicts the timing difference between downlink and uplink data. In some embodiments, downlink DCI on sub-frame N is for PDSCH data in the same sub-frame. Uplink DCI on sub-frame N has scheduling or grant information for PUSCH in sub-frame N+4. Scheduling information may include grant information indicating certain RF components allocated for transmission or retransmissions for data associated with various UEs. This means that a user device may have about four sub-frames (˜4 ms) from the start of the downlink signal to the start of transmission of the uplink signal. Within this time period, the user device needs to perform downlink processing, decode the DCI, send the grant information to a higher layer, where a packet data unit (PDU) is segmented from the radio link control (RLC), get the PDU, also referred to herein as a TB, perform all physical layer uplink processing, and perform SC-FDMA modulation for RF transmission on the uplink.
Moreover, an eNode B may demand that the UE advance the timing of its transmitted uplink data, e.g., to account for the distance from the UE to the tower. For example, as depicted in FIG. 3, a timing advance may reduce the processing time somewhat, so that the user device has less than 4 milliseconds to perform its processes. For reasons that will be explained below, this timing constraint poses technical challenges not only to designers of user devices but also to designers of test equipment that simulates traffic from multiple user devices. Further, finite hardware and logic resources available for data communications may pose technical challenges for such test equipment when simulating multiple user devices.
Assuming for simplicity a zero timing advance, the time available from the start of the downlink sub-frame carrying the uplink grant to the start of PUSCH transmission is 4 ms, using the antenna port as the reference point for timing. Table 1, below, shows the steps involved in processing the downlink DCI and the timing budget for each step in the process in one example of a conventional implementation.
TABLE 1
Downlink DCI Processing

Signal path/processing step                                    Time
Time budget                                                    4.0 ms
RF Reception + Downlink processing                             −1.5 ms
MAC readies TB for uplink                                      −1.0 ms
Uplink control channel processing and uplink TB processing     −0.5 ms
Time remaining for uplink processing:                          1.0 ms
Uplink part2 + RF Transmission                                 −0.3 ms
Time remaining for uplink part1:                               0.7 ms

As shown in Table 1, above, the time available for uplink processing (part1 and part2 together) and RF transmission is 1 millisecond.
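The budget in Table 1 is a simple running subtraction, sketched below in integer microseconds. The names are illustrative, and the optional timing advance parameter is an assumption: it models the advance as a direct reduction of the available time, which is a simplification of the behavior described for FIG. 3.

```python
# Values from Table 1, converted to microseconds; names are hypothetical.
BUDGET_US = 4000
BEFORE_UPLINK_US = [
    ("RF reception + downlink processing", 1500),
    ("MAC readies TB for uplink", 1000),
    ("Uplink control channel processing and uplink TB processing", 500),
]
PART2_AND_RF_US = 300

def remaining_for_uplink(timing_advance_us=0):
    """Time left for uplink part1, part2, and RF transmission together."""
    spent = sum(t for _, t in BEFORE_UPLINK_US)
    return BUDGET_US - timing_advance_us - spent

def remaining_for_part1(timing_advance_us=0):
    """Time left for uplink part1 after reserving part2 and RF transmission."""
    return remaining_for_uplink(timing_advance_us) - PART2_AND_RF_US

print(remaining_for_uplink(), remaining_for_part1())
```

With a zero timing advance this reproduces the 1.0 ms and 0.7 ms rows of the table; any nonzero advance shrinks both figures by the same amount under the stated assumption.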
Uplink part2 processing may include scrambling, mapping data bits to modulation symbols, performing a Discrete Fourier transform (DFT) encoding for SC-FDMA, mapping data to an uplink resource grid, and SC-FDMA signal generation and modulation on to an RF carrier. In some embodiments, since part2 processing may be performed on a block of channel bits for all user devices, the computational complexity and processing time are fixed for various combinations of user devices and different resource allocations for each user device in a sub-frame. For example, the computational complexity and time for the first few steps of part2 processing, such as scrambling and DFT mapping, may be linearly or proportionally based on the aggregate block size for all user devices. Later steps of part2 processing, such as SC-FDMA signal generation, may be performed within a fixed amount of time. Hence, because part2 processing includes steps that are linear in time and/or performed in a fixed amount of time, various cases, including worst case scenarios, may be performed within a particular time constraint imposed by the system or LTE standard (e.g., a few symbol times, or around 0.3 ms). This leaves only 0.7 milliseconds, or 700 microseconds, for uplink part1 processing.
FIG. 4 illustrates in more detail a portion of the uplink part1 process, shown as the collection of steps 212 in FIG. 2A, according to a conventional implementation. These steps are defined in section 5 of TS 36.212, which defines a standard for channel encoding 206, interleaving 208, and rate matching 210 of data and control streams from/to a MAC layer that are encoded/decoded to offer transport and control services over the radio transmission link.
According to section 5.1.3 of TS 36.212, channel encoding 206 may be performed according to the Turbo encoding algorithm, which produces three output bits for every input bit. As defined in section 5.1.4.1 of TS 36.212, each of the three bit streams 400A, 400B, and 400C (herein collectively referred to as bit-streams 400) may or may not include leading NULL bits as padding. In the embodiment illustrated in FIG. 4, the bit streams are NULL padded. Each bit stream 400A, 400B, and 400C is stored into its respective data buffer 402A, 402B, and 402C. These data buffers are herein collectively referred to as pre-interleave buffers 402. Once these buffers are full, each bit stream can be processed by its respective sub-block interleaver 404A, 404B, or 404C, which are herein collectively referred to as interleavers 404. Each sub-block interleaver 404A, 404B, and 404C saves the interleaved bit stream into another data buffer 406A, 406B, or 406C, respectively. These second data buffers are herein collectively referred to as post-interleave buffers 406. The outputs from the sub-block interleavers are sent to rate matcher 210, which collects the interleaved bits and then selects or prunes the collected bits so as to produce a bit stream that is rate matched to the available radio resources to transmit the signal after subsequent steps in the uplink processing chain. Rate matcher 210 collects and selects or prunes bits according to information provided to rate matcher 210 via control signals 408, such as the redundancy value index and other information needed by rate matcher 210.
Each of the sub-block interleavers 404 operates according to the algorithm defined in Section 5.1.4.1.1 of TS 36.212, which involves, for each bit stream, padding the bit stream with leading nulls in order to fully fill a matrix having 32 columns and a variable number of rows, depending on the code block size. The maximum-size matrix has 32 columns and 192 rows (6144 bits total). First, the matrix is filled row by row. Next, the columns of the matrix are rearranged according to a predefined map. The matrix is then drained column by column. A simplified example of this operation is shown in FIG. 5.
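The matrix sizing and null padding described above can be sketched as follows. This is a minimal illustration with a hypothetical function name; it only computes the number of rows and leading nulls for a given stream length and does not perform the column rearrangement.

```python
COLUMNS = 32  # fixed column count of the sub-block interleaver matrix

def interleaver_matrix_shape(stream_len):
    """Return (rows, leading_nulls) needed to fill a 32-column matrix."""
    rows = -(-stream_len // COLUMNS)          # integer ceiling division
    leading_nulls = rows * COLUMNS - stream_len
    return rows, leading_nulls

print(interleaver_matrix_shape(6144))
print(interleaver_matrix_shape(100))
```

A 6144-bit stream fills the 32 by 192 matrix exactly, while a shorter stream whose length is not a multiple of 32 needs leading nulls to complete the first row.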
FIG. 5 illustrates an example of a conventional interleaving function, which for simplicity interleaves a block of 16 bits instead of the maximum block of 6144 bits as defined in TS 36.212. The non-interleaved bit stream data is shown occupying 16 contiguous addresses in a buffer memory 500. The bits of the bit stream are represented by variables A through P, and the relative address of each bit is shown to the left of the data. In the example illustrated in FIG. 5 bit A is located in relative address 0, bit B is located in relative address 1, and so on. Bits A through P are loaded 502 into a 4×4 matrix, i.e., the matrix is loaded row by row, from left to right and from top to bottom, resulting in the arrangement of data within the matrix shown as 504. The columns of the matrix are shuffled 506, resulting in the arrangement of data within the matrix shown as 508. The data is then unloaded 510, i.e., read out of the matrix column by column, top to bottom and left to right, and stored into another buffer memory 512. The relative order of the interleaved data is shown in 512: bit D now occupies relative address 0, bit H now occupies relative address 1, and so on. The interleaved data is then rate matched 514, which in this example reduces the number of bits from 16 to 12. The bits selected for output are shown as output 516.
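The load, shuffle, and unload sequence of FIG. 5 can be sketched in a few lines of Python. The text only states that bits D and H land at relative addresses 0 and 1, which implies that the last column is read first; the full column order `[3, 1, 2, 0]` and the choice to keep the first 12 interleaved bits during rate matching are assumptions made for illustration, not details taken from the figure.

```python
def interleave(bits, columns, column_order):
    """Load row by row, permute columns, then unload column by column."""
    # Assumes len(bits) is a multiple of columns (no null padding needed).
    rows = [bits[i:i + columns] for i in range(0, len(bits), columns)]
    # Visit columns in the permuted order, reading each top to bottom.
    return [row[c] for c in column_order for row in rows]

COLUMN_ORDER = [3, 1, 2, 0]          # assumed; only "column 3 first" is implied
bits = list("ABCDEFGHIJKLMNOP")      # buffer 500, relative addresses 0..15

interleaved = interleave(bits, 4, COLUMN_ORDER)   # contents of buffer 512
selected = interleaved[:12]                       # assumed rate-matching rule

print("".join(interleaved))
print("".join(selected))
```

Note that the sketch needs the full input block before the first output bit can be produced, and it materializes both an input list and an output list, which mirrors the two-buffer structure of the conventional implementation.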
There are disadvantages to the example implementation shown in FIG. 4. TS 36.212 requires that the whole block of data from turbo encoder 206 be ready and waiting in pre-interleave buffers 402 before starting sub-block interleaving. Sub-block interleavers 404 then select the bit sequence as defined by the interleaver function and write to post-interleave buffers 406. For a 6,144 bit code block, the maximum code block size, each of the sub-block interleavers 404 takes 6,144 clock cycles to complete writing the output to post-interleave buffers 406. Since rate matcher 210 can start only after completion of sub-block interleaving, every interleaving process introduces an N clock cycle delay in the uplink data path, where N is the size of the code block. Since there can be multiple code blocks from the transport block segmentation, the same time delay is introduced again for every additional code block.
Uplink part1 processing may be based on uplink control information, including scheduling information that can affect TB size, channel allocation, resource block allocation, type of modulation, and UCI data, among other things. Numerous combinations of these parameters may occur based on scheduling information, which may differ among sub-frames. As such, part1 processing time may vary significantly between sub-frames and transport blocks. For example, an uplink processing device running at a 125 MHz clock speed would take about 49 μs to perform the interleaving process for a 6144 bit code block. For 13 such blocks in a TTI (1 ms), which is a worst case with 1 UE at the maximum data rate, it would take about 639 μs, which is a significant amount of time for each TTI. While this timing constraint may be acceptable for a single UE, it poses a significant obstacle to the development of multi-UE emulators or simulators, intelligent traffic generators, eNode B simulators, and network test equipment. Furthermore, when simulating multiple user devices, another level of complexity may arise as each user device may be associated with independent scheduling information. For multiple UE simulators/emulators, such as traffic emulation systems or test equipment, for example, uplink part1 processing must be duplicated for each UE being emulated.
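The interleaving delay figures quoted above follow from one clock cycle per bit. The sketch below reproduces them; the names are illustrative, and the one-bit-per-cycle rate is the assumption stated for the conventional implementation.

```python
CLOCK_HZ = 125_000_000       # 125 MHz processing clock
CODE_BLOCK_BITS = 6144       # maximum code block size
CODE_BLOCKS_PER_TTI = 13     # worst case for one UE at the maximum data rate

def interleave_time_us(bits, clock_hz=CLOCK_HZ):
    """Interleaving delay in microseconds at one bit per clock cycle."""
    return bits / clock_hz * 1e6

per_block_us = interleave_time_us(CODE_BLOCK_BITS)
per_tti_us = per_block_us * CODE_BLOCKS_PER_TTI

print(f"{per_block_us:.3f} us per code block")
print(f"{per_tti_us:.3f} us per TTI")
```

The per-TTI figure already consumes most of the 700 microseconds available for uplink part1, before turbo encoding or rate matching is counted.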
Another disadvantage of the example implementation shown in FIG. 4 is that it requires both pre-interleave buffers 402 (corresponding to buffer 500 in FIG. 5) and post-interleave buffers 406 (corresponding to buffer 512 in FIG. 5). Rate matcher 210 then reads data from post-interleave buffers 406, which takes N clock cycles, where N is the number of channel bits. These serial operations are time consuming and require multiple memories on a single UE. These disadvantages are multiplied for multiple UE emulation systems.
For these reasons, it is difficult to meet the timing requirements for all configurations when using a conventional implementation such as the one shown in FIG. 4. Table 2, below, shows the time required for steps of uplink part1 processing in a multi-UE emulation system using the example conventional implementation shown in FIG. 4 for one specific case:
TABLE 2
Uplink part1, conventional implementation

Signal path/processing step                    Time
Turbo encoder processing                       500.00 μs
Memory processing                              0.01 μs
Sub-block interleaving                         300.00 μs
Sub-block memory processing                    0.01 μs
Rate matching                                  100.00 μs
Time required for Uplink part1 processing:     900.02 μs
As shown in Table 2, above, in some specific cases—depending on the sizes of the transport blocks, the number of UEs being emulated, and so on—the time required to perform uplink part1 processing using the conventional implementation may exceed the timing budget. In the specific case shown in Table 2, for example, uplink part1 processing took 900.02 microseconds, longer than the available 700 microseconds.
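The comparison between Table 2 and the part1 budget can be written out as a short check. Integer nanoseconds are used to avoid floating-point rounding; the names are illustrative.

```python
# Step times from Table 2, in nanoseconds; names are hypothetical.
STEPS_NS = {
    "Turbo encoder processing": 500_000,
    "Memory processing": 10,
    "Sub-block interleaving": 300_000,
    "Sub-block memory processing": 10,
    "Rate matching": 100_000,
}
PART1_BUDGET_NS = 700_000    # 0.7 ms remaining for uplink part1

total_ns = sum(STEPS_NS.values())
overrun_ns = total_ns - PART1_BUDGET_NS

print(f"total {total_ns / 1000:.2f} us, overrun {overrun_ns / 1000:.2f} us")
```

In this case the serial chain overruns the budget by roughly 200 microseconds, and sub-block interleaving alone accounts for 300 microseconds of the total.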
Accordingly, in light of these disadvantages associated with conventional architectures, there exists a need for methods, systems, and computer readable media for fast, reduced memory and integrated sub-block interleaving and rate matching.