A linear feedback shift register (LFSR) is commonly used to implement components such as scramblers, descramblers, and cyclic redundancy check (CRC) devices, and to assist with turbo encoding in communication systems. As communication systems become faster, however, traditional hardware implementations of LFSRs have become dated and require improvement. Hardware implementations are also inflexible, because each LFSR must be mapped to a different hardware implementation.
Software implementations of LFSRs have become increasingly important in the field of software-defined radio, where functions that were traditionally defined in hardware are now implemented in software running on a computing device. Processors, however, are often ill-equipped to deal with LFSRs. An LFSR is computed bit by bit, so many cycles are often needed to produce a single-step state transition corresponding to a single output bit. One solution to this problem is a table lookup approach, which provides a small increase in efficiency. However, this method is limited because the size of the lookup table, and with it the cost, grows exponentially with the number of bits processed per lookup.
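For illustration, the bit-by-bit computation described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation taken from any particular system: it assumes a hypothetical scrambler-style LFSR in which each output bit is the input bit XORed with the parity of the tapped state bits, and the names `lfsr_step`, `lfsr_run`, and the example tap mask are illustrative only.

```python
def lfsr_step(state, in_bit, taps, length):
    """Advance a hypothetical `length`-bit scrambler-style LFSR by one bit.

    state  : current state packed into an integer (bit 0 = newest bit)
    taps   : bitmask selecting the state bits that feed the XOR network
    Returns (next_state, out_bit).
    """
    feedback = bin(state & taps).count("1") & 1   # parity of the tapped bits
    out_bit = in_bit ^ feedback
    next_state = ((state << 1) | out_bit) & ((1 << length) - 1)
    return next_state, out_bit


def lfsr_run(state, in_bits, taps, length):
    """Process a sequence of input bits one at a time (one loop pass per bit)."""
    out_bits = []
    for bit in in_bits:
        state, out_bit = lfsr_step(state, bit, taps, length)
        out_bits.append(out_bit)
    return state, out_bits


# Assumed example parameters: 7 state bits, taps on state bits 3 and 6.
final_state, outputs = lfsr_run(0b1000000, [0, 0], 0b1001000, 7)
```

Each pass through the loop yields only one state transition and one output bit, which is why a general-purpose processor spends several instructions per bit with this approach.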
Another approach to improving the efficiency of LFSR computation is to pre-compute a k-step state transition matrix and a k-step output generating matrix. This allows multiple state transitions and multiple output bits to be generated in a single cycle. In general, the k-step state transition matrix and the k-step output generating matrix are combined to form a single matrix of size (L+k)×(L+k), wherein L is the number of state bits of the LFSR and k is the number of steps computed at once. The state bits and input bits are used to form a single state-input vector (SIV). The combined matrix is then multiplied by the SIV to produce the next state and output.
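The combined-matrix approach can be sketched in the same spirit. The example below is an illustration under assumptions, not the construction used by any specific prior system: it assumes a hypothetical scrambler-style LFSR, packs the SIV with the L state bits in the low positions and the k input bits above them, and derives the (L+k)×(L+k) matrix over GF(2) by pushing each unit vector through k bit-serial steps. All function names and the bit layout are illustrative.

```python
def parity(x):
    """Return the XOR of all bits of x (a sum over GF(2))."""
    return bin(x).count("1") & 1


def step(state, in_bit, taps, length):
    """One bit-serial step of a hypothetical scrambler-style LFSR."""
    out_bit = in_bit ^ parity(state & taps)
    return ((state << 1) | out_bit) & ((1 << length) - 1), out_bit


def k_steps(siv, taps, length, k):
    """Reference model: run k bit-serial steps on a packed SIV.

    Assumed layout: bits 0..L-1 hold the state, bit L+t holds input bit t.
    The result uses the same layout with the k output bits in place of inputs.
    """
    state = siv & ((1 << length) - 1)
    outputs = 0
    for t in range(k):
        state, out_bit = step(state, (siv >> (length + t)) & 1, taps, length)
        outputs |= out_bit << t
    return state | (outputs << length)


def build_combined_matrix(taps, length, k):
    """Pre-compute the (L+k) x (L+k) matrix; each row is stored as a bitmask."""
    n = length + k
    cols = [k_steps(1 << j, taps, length, k) for j in range(n)]  # image of each unit vector
    return [sum(((cols[j] >> i) & 1) << j for j in range(n)) for i in range(n)]


def apply_matrix(rows, siv):
    """Multiply the matrix by the SIV over GF(2): one AND plus parity per row."""
    return sum(parity(row & siv) << i for i, row in enumerate(rows))


# Assumed example parameters: L = 7, k = 4, taps on state bits 3 and 6.
ROWS = build_combined_matrix(0b1001000, 7, 4)
```

Because every step is linear over GF(2), a single call to `apply_matrix(ROWS, siv)` yields the same packed next state and output bits as k calls to `step`, so k bits are handled per matrix multiplication.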
While this approach can provide a significant improvement in efficiency, it is still highly limited by the fixed data width of the processor. Generally, it is desirable to keep L+k≦w, wherein w is the fixed data path width, in order to limit the need to redesign the processor's architecture and to ease the programming burden. As such, the potential speedup is limited, particularly when L (the number of state bits, or length, of the LFSR) approaches the data path width. In this case, only small efficiency improvements can be achieved by pre-computing the matrices. In addition, when the above matrix approach is implemented as a standalone hardware accelerator, matrix size is critical, as it directly relates to the complexity and cost, and therefore the computing efficiency, of the implementation.
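The effect of the constraint L+k≦w can be made concrete with a few lines of arithmetic. The sketch below simply evaluates the largest k permitted for a given L and w. The function name and the sample lengths are chosen for illustration, and a 32-bit data path is assumed.

```python
def max_steps_per_multiply(length, width):
    """Largest k satisfying L + k <= w (0 if the state alone fills the data path)."""
    return max(0, width - length)


# Assumed 32-bit data path; the LFSR lengths are arbitrary illustrative values.
for length in (7, 16, 24, 31):
    k = max_steps_per_multiply(length, 32)
    print(f"L={length:2d}, w=32 -> at most k={k} bits per matrix multiplication")
```

As L approaches w, the number of bits that can be processed per multiplication falls toward one, which is the limited speedup described above.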
It is therefore necessary to develop a method for reducing the size of the combined state transition matrix and output generating matrix to provide improved efficiency in LFSR computations.