Linear Feedback Shift Registers (LFSRs) are used in numerous applications. Among other things, LFSRs serve as encoders for BCH codes, which are used for error correction either by themselves or as component codes of larger error correction codes.
In many cases, an LFSR is required to be generic, with programmable taps. For example, a configurable BCH encoder may need programmable taps on the LFSR. The following describes an efficient method for constructing a programmable LFSR that can work at higher frequencies and still maintain full generality.
FIG. 1 shows an example of an LFSR 10 that is used as an encoder for a BCH code. The number of memory cells (D 13) in the LFSR is determined by the length of the code redundancy. Plain data enters at the “din” input and shifts through the memory cells 13, where feedback from the last cell is added to the data being shifted in. After the last bit has entered the LFSR, switches “A” 11 and “B” 18 are switched to the zero state so as to disable the feedback and prevent additional data from coming in. The remaining data in the shift register is then shifted out and forms the code redundancy.
The description of the encoding in FIG. 1 is not optimal but is only intended to demonstrate how an LFSR is used in an encoder.
FIG. 2 shows a generalized encoder scheme where the taps can be chosen arbitrarily. At the input to each memory cell 13, there is an adder 12 between the data from the previous memory cell (or new data) and possible feedback from the first memory cell. The value of a programmable memory cell g_i 15 determines whether feedback is added to LFSR memory cell i. By setting the memory cells g_i to different values, we can modify the LFSR function to support any polynomial whose degree does not exceed the LFSR length.
For example, we can set the g_i memory cells to hold the taps corresponding to a BCH code that corrects up to 3 errors, based on GF(2^9), or a BCH code that corrects up to 4 errors, based on GF(2^10). Each case has a different number of active taps (27 vs. 40). This number corresponds to the redundancy length. The length of the LFSR (the number of memory cells) is designed for the longest redundancy case we expect. When shorter redundancies are expected, the first taps g_0, g_1, . . . are zero and have no effect on the feedback. In effect, the first memory cells act only as a delay line ahead of a shorter LFSR.
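The programmable-tap behavior described above can be sketched in software. The following is a minimal Python model, not the circuit of FIG. 2. The names `lfsr_step` and `lfsr_run` are hypothetical, cell 0 is assumed to be the cell that receives new data, the feedback is assumed to come from the last cell, and the 3-cell taps (corresponding to x^3 + x + 1) and the data bits are chosen only for illustration. Trailing zero bits are shifted in so that the register ends up holding the redundancy.

```python
def lfsr_step(state, taps, bit_in):
    """One shift of the register.

    state[i] is memory cell i and taps[i] is the programmable bit g_i.
    Cell i receives cell i-1 (or the new data bit for cell 0), XORed
    with the last cell's output when g_i is set.
    """
    fb = state[-1]
    prev = [bit_in] + state[:-1]
    return [p ^ (g & fb) for p, g in zip(prev, taps)]


def lfsr_run(taps, bits):
    """Shift all bits through a register that starts at zero."""
    state = [0] * len(taps)
    for b in bits:
        state = lfsr_step(state, taps, b)
    return state


data = [1, 1, 0, 1]                 # illustrative data, first bit in first

# Assumed example: 3-cell register with taps g0=1, g1=1, g2=0.
short = lfsr_run([1, 1, 0], data + [0] * 3)

# Same taps programmed into a 5-cell register: g0 = g1 = 0, so cells 0 and 1
# only delay the data; two extra shifts flush that delay.
long_ = lfsr_run([0, 0, 1, 1, 0], data + [0] * 3 + [0] * 2)

assert long_[2:] == short           # upper cells behave as the shorter LFSR
print(short, long_)
```

In this sketch the two leading zero taps turn cells 0 and 1 into a pure delay, so the upper three cells of the longer register reproduce the state of the 3-cell register two shifts later.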
This LFSR can also be defined in a mathematical manner as an operation on a polynomial. We define the plain data by the following polynomial:
$$D(x) = \sum_{i=0}^{k-1} d_i \cdot x^{i+r}$$
where $d_{k-1}$ is the first data bit to enter the encoder and $d_0$ is the last bit to enter the encoder. The power $x^{i+r}$ of each term represents the position in time of the corresponding element $d_i$. In general, multiplying the polynomial $D(x)$ by $x$ represents a shift in time.
The operation of the LFSR can be represented by a modulo operation, taken with respect to the taps polynomial defined by
$$g(x) = \sum_{i=0}^{v-1} g_i \cdot x^i$$
Therefore, the redundancy can be described by the polynomial
$$r(x) = \sum_{i=0}^{v-1} r_i \cdot x^i = D(x) \bmod g(x)$$
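The modulo relation can be checked numerically with a small sketch. The code below is an illustration under stated assumptions, not the described hardware: polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i), the helper names `gf2_mod` and `redundancy` are hypothetical, the shift r is taken equal to the register length v, and a leading x^v term is added to the taps polynomial for the division (the sum in the text runs only up to v-1).

```python
def gf2_mod(a, m):
    """Remainder of a(x) divided by m(x) over GF(2).

    Bit i of each integer is the coefficient of x^i; m must be nonzero.
    """
    dm = m.bit_length() - 1                    # degree of m(x)
    while a.bit_length() - 1 >= dm:
        a ^= m << (a.bit_length() - 1 - dm)    # cancel the leading term of a(x)
    return a


def redundancy(data_bits, g_low, v):
    """r(x) = D(x) mod g(x), with D(x) = d(x) * x^v.

    data_bits lists d_{k-1} first and d_0 last.
    g_low holds the taps g_0..g_{v-1}; the x^v term is an assumption
    added here so that the remainder has v coefficients.
    """
    d = 0
    for b in data_bits:
        d = (d << 1) | b
    return gf2_mod(d << v, (1 << v) | g_low)


# Illustrative values only: taps g0=1, g1=1, g2=0 and four data bits.
r = redundancy([1, 1, 0, 1], 0b011, 3)
print(format(r, "03b"))
```

Bit i of the returned integer corresponds to the redundancy coefficient r_i.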
We can use the above equation to write a relation between the contents of the LFSR at time k and time k+1, as shown below:
$$y_i(k+1) = y_{i-1}(k) + g_i \cdot y_{v-1}(k), \quad i = 0, \ldots, v-1, \quad k \ge 0$$
where $y_i(j)$ is the value of memory cell i at time j in the LFSR and $y_{-1}(j) = d_{k-j}$.
The above equation can be applied recursively to obtain the values of the LFSR after 2, 3 or more shifts.
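As a worked instance of this recursion, the following sketch applies the relation twice to obtain the register contents after two shifts. It assumes arithmetic over GF(2) (addition is XOR) and v ≥ 2, and uses only the symbols already defined:

$$y_i(k+2) = y_{i-1}(k+1) + g_i \cdot y_{v-1}(k+1) = y_{i-2}(k) + g_{i-1} \cdot y_{v-1}(k) + g_i \cdot y_{v-1}(k+1), \quad i = 1, \ldots, v-1$$

$$y_{v-1}(k+1) = y_{v-2}(k) + g_{v-1} \cdot y_{v-1}(k)$$

The second line shows that the feedback value needed for the second shift itself depends on the feedback value of the first shift, so each further shift computed at once adds another term of this kind.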
It is noted that this recursive equation can be implemented by an LFSR that processes multiple bits in parallel. For example, the LFSR content may advance four shifts simultaneously at each clock.
FIG. 3 illustrates a prior art LFSR 20. LFSR 20 is configured to process four bits in parallel and is fed by four bits per cycle, Din[0]-Din[3]. It includes (N+1) stages, each of which includes logic 30 and multiple memory cells, and feedback logic 50. The first stage memory cells are denoted 40(0,1)-40(0,4), the second stage memory cells are denoted 40(1,1)-40(1,4), and the N'th stage memory cells are denoted 40(N,1)-40(N,4). The feedback logic 50 outputs four feedback signals F[0]-F[4] 80(1)-80(4) to the logic 30 of each of the (N+1) stages. In particular, these four feedback signals are fed to each logic subset (such as 30(1)) that is connected between a pair of memory cells of successive stages, such as between memory cell LFRS[n] 40(0,1) and memory cell LFRS[n+4] 40(1,1). Logic subset 30(1) includes four AND gates and a XOR gate that has five inputs: four inputs for receiving the output signals of the four AND gates and one for receiving the output signal of the memory cell LFRS[n] 40(0,1). The four AND gates perform four AND operations, each between one of the four feedback signals F[0]-F[4] 80(1)-80(4) and a corresponding one of the polynomial coefficients Poly[n+1]-Poly[n+4] 70(1)-70(4).
FIG. 4 illustrates the memory cells of the last stage of the LFSR (40(N,1)-40(N,4)) and the feedback logic 50 of FIG. 3. Each one of the four feedback signals F[0]-F[4] 80(1)-80(4) is generated by a separate branch of feedback logic 50.
The length of the longest branch of the feedback logic increases with the number of bits that are processed in parallel. The first branch is merely a line from the output port of memory cell 40(N,1). The second branch includes AND gate 51 and XOR gate 52 and has a latency of two gates. The third branch includes two AND gates 51 and two XOR gates 52 and has a latency of four gates. The fourth branch includes three AND gates 51 and three XOR gates 52 and has a latency of six gates.
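The growth of the feedback chain can be illustrated with a small behavioral model. The sketch below is not the circuit of FIG. 3 or FIG. 4. It is a Python model obtained by unrolling the single-shift recurrence, with hypothetical names (`lfsr_step`, `feedback_chain`, `lfsr_clock`), feedback signals indexed F[0] to F[m-1], din[0] assumed to be the bit that enters first, and a register assumed to be at least m cells long. The gate depth is modeled on the assumption, taken from the description above, that each additional branch adds one AND gate and one XOR gate to the critical path.

```python
def lfsr_step(state, taps, bit_in):
    """Single shift: cell i gets cell i-1 XOR (g_i AND last cell)."""
    fb = state[-1]
    prev = [bit_in] + state[:-1]
    return [p ^ (g & fb) for p, g in zip(prev, taps)]


def feedback_chain(state, taps, m):
    """Feedback values for m shifts, plus a modeled gate depth per branch."""
    v = len(state)
    F, depth = [], []
    for j in range(m):
        f = state[v - 1 - j]                 # cell that reaches the top after j shifts
        for s in range(j):
            f ^= taps[v - j + s] & F[s]      # earlier feedback, gated by a tap
        F.append(f)
        # Assumption: one AND plus one XOR is added per extra branch.
        depth.append(0 if j == 0 else depth[-1] + 2)
    return F, depth


def lfsr_clock(state, taps, din):
    """Advance len(din) shifts in one call using the feedback chain."""
    m, v = len(din), len(state)
    F, _ = feedback_chain(state, taps, m)
    nxt = []
    for i in range(v):
        # Cell i is loaded from the cell m positions back, or from the input.
        y = state[i - m] if i >= m else din[m - 1 - i]
        for s in range(m):
            t = i - (m - 1 - s)              # tap index paired with F[s]
            if t >= 0:
                y ^= taps[t] & F[s]
        nxt.append(y)
    return nxt


taps = [1, 0, 1, 1, 0, 1]                    # illustrative taps, 6-cell register
state = [1, 1, 0, 0, 1, 0]
din = [1, 0, 0, 1]

serial = state
for b in din:
    serial = lfsr_step(serial, taps, b)

assert lfsr_clock(state, taps, din) == serial    # same result as four single shifts
print(feedback_chain(state, taps, 4)[1])         # modeled depths: [0, 2, 4, 6]
```

In this model the last feedback value cannot be computed until all earlier ones are available, which is why the modeled depth grows by two gates per additional shift.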
It is apparent that the more shifts are handled at once, the more logic is needed, and therefore the lower the limit on the clock frequency.
FIG. 4 is not the only method for performing multiple LFSR shifts in one clock, but the basic problem remains: the higher the number of shifts, the lower the limit on the clock frequency.
The following describes a method that allows multiple LFSR shifts per clock without limiting the clock frequency.