Many communications systems perform forward error correction (FEC) to improve data transmission accuracy and to ensure data integrity. FEC helps reduce bit error rates (BER) in applications such as data storage, digital video broadcasts, and wireless communications. Reed-Solomon (RS) error-correcting codes are commonly used for FEC.
Referring now to FIG. 1, a first device 10-1 communicates with a second device 10-2 over a communications channel 12. The communications channel can be hardwired or wireless. For example, the communications channel 12 can be an Ethernet network, a wireless local area network, a bus for a hard drive, etc. The first device 10-1 includes components 14-1 that output signals to a RS encoder 16-1 and that receive signals from a RS decoder 18-1. Likewise, the device 10-2 includes components 14-2 that output signals to a RS encoder 16-2 and that receive signals from a RS decoder 18-2. The components 14-1 of the first device 10-1 may be similar to or different than the components 14-2 of the second device 10-2. The encoders 16 encode the data before the data is output onto the communications channel 12. The encoders 16 insert redundant bits into the data stream. The decoders 18 use the redundant bits to detect and, when possible, to correct errors in the received data.
Referring now to FIG. 2, steps that are performed by a RS decoder are shown generally at 20. In step 22, the RS decoder computes syndrome values. In step 24, the RS decoder computes an error locator polynomial. The error locator polynomial can be calculated using a Berlekamp-Massey algorithm (BMA), inversionless BMA (iBMA), Euclidean algorithm, or other suitable algorithms. In step 26, the Reed-Solomon decoder calculates an error evaluator polynomial, which is typically based on the syndrome values and the error locator polynomial.
In step 28, the RS decoder finds error locations. For example, Chien's search algorithm, which will be described below, can be used. In step 30, error values are found. For example, Forney's algorithm, which will be described below, is often used to find the error values. Steps 28 and 30 may be performed in parallel in hardware implementations.
Referring now to FIG. 3, a RS decoder 32 typically includes a syndrome calculator 34 and an error locator polynomial generator 36. The Reed-Solomon decoder 32 also includes an error evaluator polynomial generator 38, an error location finder 40 and an error value finder 42. Control devices 44 and storage devices 46 may also be used to control decoding and to store data values for use by the RS decoder 32. The RS decoder 32 can be implemented using register-based VLSI, software and a processor, an application specific integrated circuit (ASIC), or in any other suitable manner.
In “High-speed Decoding of BCH Codes Using a New Error-Evaluation Algorithm,” T. Horiguchi, Electronics and Comm. in Japan, vol. 72, no. 12 (1989), an error evaluator for RS codes is computed during the BMA iterations. In “On the Determination of Error Values For Codes From a Class of Maximal Curves,” R. Koetter, Proc. 35th Annual Allerton Conference on Communications, Control and Computing, Univ. of Illinois at Urbana-Champaign (1997), an error evaluator for algebraic geometry codes is disclosed. The error evaluator in Koetter reduces to the error evaluator in Horiguchi for RS codes.
The error evaluators in Horiguchi and Koetter require the BMA to be formulated in a manner that cannot be implemented in hardware easily. Berlekamp's formulation of the BMA has a more regular structure and can be readily implemented in hardware. A variation of the error evaluator for Berlekamp's formulation of the BMA is also disclosed in “On Decoding Reed-Solomon Codes up to and beyond the Packing Radii,” W. Feng, Ph. D. dissertation, Univ. of Illinois at Urbana-Champaign (1999).
A typical BMA employs a Galois field inverter, which computes a multiplicative inverse of Galois field elements. An inversionless (or division-free) BMA (iBMA) eliminates the Galois field inverter. One advantage of iBMA is the reduced delay of a critical path of a VLSI implementation.
RS codes operate on finite fields ($GF(2^m)$). $GF(q)$ is a Galois field with $q$ elements. $c=(c_0, c_1, \ldots, c_{n-1})$ is a vector of length $n$ over $GF(q)$, where $n=q-1$. The Fourier transform of the vector $c$ is $C=(C_0, C_1, \ldots, C_{n-1})$, where

$$C_j = \sum_{i=0}^{n-1} c_i \alpha^{ij}, \quad j = 0, 1, \ldots, n-1,$$

and $\alpha$ is a primitive element of $GF(q)$. A $t$-error-correcting RS code is the collection of all vectors $c$ with a Fourier transform satisfying $C_{m_0}=C_{m_0+1}=\cdots=C_{m_0+2t-1}=0$ for some integer $m_0$. The code has minimum Hamming distance $d_{min}=2t+1$. Note that a RS code can be shortened to a length $n<q-1$ if needed.
A vector $c=(c_0, c_1, \ldots, c_{n-1})$ can also be represented as a polynomial

$$c(x) = \sum_{i=0}^{n-1} c_i x^i.$$

The Fourier transform is a polynomial evaluation at $x=\alpha^0, \alpha^1, \ldots, \alpha^{n-1}$. In other words, $C_j=c(\alpha^j)$. A $t$-error-correcting RS code is the collection of all vectors $c$ such that $c(\alpha^{m_0})=c(\alpha^{m_0+1})=\cdots=c(\alpha^{m_0+2t-1})=0$. Therefore, every codeword is a multiple of

$$g(x) = (x-\alpha^{m_0})(x-\alpha^{m_0+1})\cdots(x-\alpha^{m_0+2t-1}).$$

The polynomial $g(x)$ is a generator polynomial of the Reed-Solomon code.
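The relationship between the generator polynomial and the zero spectral components can be illustrated with a minimal Python sketch. It assumes a small field $GF(2^4)$ built from the primitive polynomial $x^4+x+1$, with $t=2$ and $m_0=1$; these parameters, the non-systematic encoding $c(x)=m(x)g(x)$, and the helper names (`gf_mul`, `poly_mul`, `poly_eval`, `generator_poly`) are illustrative choices and are not taken from the text above.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]  # duplicate the table so exponent sums need no reduction


def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def poly_mul(p, q):
    """Multiply polynomials stored as coefficient lists (index i holds x^i)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out


def poly_eval(p, x):
    """Evaluate a polynomial at x."""
    acc = 0
    for coef in reversed(p):
        acc = gf_mul(acc, x) ^ coef
    return acc


def generator_poly(m0, t):
    """g(x) = product of (x - alpha^(m0+j)), j = 0..2t-1; minus equals plus in GF(2^m)."""
    g = [1]
    for j in range(2 * t):
        g = poly_mul(g, [EXP[(m0 + j) % 15], 1])
    return g


t, m0 = 2, 1                                     # assumed example parameters
g = generator_poly(m0, t)
message = [3, 7, 0, 9, 1, 14, 5, 2, 11, 6, 4]    # k = n - 2t = 11 arbitrary symbols
codeword = poly_mul(message, g)                  # a multiple of g(x), length n = 15
spectrum = [poly_eval(codeword, EXP[j]) for j in range(15)]  # C_j = c(alpha^j)
```

In this sketch, the entries `spectrum[m0]` through `spectrum[m0 + 2*t - 1]` are zero because each $\alpha^{m_0+j}$ is a root of $g(x)$.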
$v=c+e$ is the received vector, where $e=(e_0, e_1, \ldots, e_{n-1})$ is the error vector and $e_i\neq 0$ when there is an error at the $i$th position. The codeword $c$ and the error vector $e$ are not known. The received vector $v$ is known, and the Fourier transform of $c$ satisfies $C_{m_0}=\cdots=C_{m_0+2t-1}=0$. $V$ and $E$ are the Fourier transforms of $v$ and $e$, respectively. Because the Fourier transform is a linear transformation, $V_i=C_i+E_i$ for $i=0, 1, \ldots, n-1$. Because $C_i=0$ for $i=m_0, m_0+1, \ldots, m_0+2t-1$, it follows that $E_i=V_i$ for $i=m_0, m_0+1, \ldots, m_0+2t-1$. Therefore, the decoder can compute $E_i$ from the received vector for $i=m_0, \ldots, m_0+2t-1$. These values are denoted as $S_i=E_{i+m_0}$ for $i=0, \ldots, 2t-1$ and are called syndromes.
The RS decoder computes e from the 2t syndromes. The task can be divided into two parts. First, the error locations are found. In other words, the decoder finds all i such that ei≠0. Second, the error values ei are found for all of the error locations. Then, c can be recovered by subtracting e from v.
The RS decoder finds the error locator polynomial, which is defined as a polynomial $\Lambda(x)$ satisfying: $\Lambda(0)=1$; and $\Lambda(\alpha^{-i})=0$ if and only if $e_i\neq 0$. The error locations are obtained by finding the zeros of $\Lambda(x)$. To determine the error values, a decoder usually computes the error evaluator polynomial, which is given by $\Gamma(x)=\Lambda(x)S(x) \bmod x^{2t}$, where

$$S(x) = \sum_{i=0}^{2t-1} S_i x^i$$

is the syndrome polynomial.
If there are at most $t$ errors, the error values can be determined (as set forth in G. D. Forney, Jr., “On Decoding BCH Codes,” IEEE Trans. Inform. Theory, vol. 11, pp. 547–557 (1965), which is hereby incorporated by reference):

$$e_i = \begin{cases} 0 & \text{if } \Lambda(\alpha^{-i}) \neq 0, \\[4pt] \left.\dfrac{x^{m_0}\,\Gamma(x)}{x\,\Lambda'(x)}\right|_{x=\alpha^{-i}} & \text{if } \Lambda(\alpha^{-i}) = 0, \end{cases}$$

where $\Lambda'(x)$ is the formal derivative of $\Lambda(x)$. This formula is known as the Forney algorithm.
In a first decoding step, assuming the received vector is fed into the decoder symbol by symbol, Horner's rule can be used to reduce complexity. In other words, $E_j$ is evaluated as

$$E_j = (\cdots((v_{n-1}\alpha^j + v_{n-2})\alpha^j + v_{n-3})\alpha^j + \cdots + v_1)\alpha^j + v_0.$$

Note that $\alpha^n=1$ and that only $S_j=E_{m_0+j}$ for $j=0, 1, \ldots, 2t-1$ are needed. In a second decoding step, the error locator polynomial is calculated from the syndromes using any suitable algorithm, such as the Peterson-Gorenstein-Zierler algorithm, the BMA, or the Euclidean algorithm.
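The syndrome computation of the first decoding step can be sketched in Python as follows. The sketch assumes $GF(2^4)$ with primitive polynomial $x^4+x+1$, $t=2$, and $m_0=1$; the function name `syndromes` and the example received vector are hypothetical. The all-zero vector is a codeword, so the received vector used here is simply an error vector with a single nonzero symbol.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]


def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def syndromes(received, m0, t):
    """Return S_j = E_(m0+j) for j = 0..2t-1, each evaluated with Horner's rule."""
    result = []
    for j in range(2 * t):
        point = EXP[(m0 + j) % 15]           # alpha^(m0+j)
        acc = 0
        for symbol in reversed(received):    # v_(n-1) is processed first
            acc = gf_mul(acc, point) ^ symbol
        result.append(acc)
    return result


received = [0] * 15      # all-zero codeword ...
received[4] = 7          # ... with one error of value 7 at position 4
S = syndromes(received, m0=1, t=2)
```

For this single-error example, each syndrome equals $7\alpha^{4(m_0+j)}$, which is the error value times a power of the error position. Each Horner step costs one multiplication and one addition per received symbol.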
In a third decoding step, two approaches are generally used for computing the error evaluator polynomial if the BMA is used. A first approach computes the error evaluator polynomial simultaneously with the error locator polynomial using an iterative algorithm similar to the BMA. For example, see E. R. Berlekamp, Algebraic Coding Theory, New York, McGraw-Hill (1968). The cost of implementing this approach is high. A second approach computes the error evaluator polynomial after computing the error locator polynomial. The second approach is less complex than the first approach. However, the decoding latency of the second approach is higher than that of the first approach.
In a fourth decoding step, Λ(x) and Γ(x) are known. The error values are determined using Forney's algorithm. Chien's search, which is used to find the zeros of the error locator polynomial Λ(x), is typically performed in parallel with the error evaluation in hardware implementations.
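The fourth decoding step can be sketched in Python as a Chien search combined with Forney's formula. The sketch assumes $GF(2^4)$ with primitive polynomial $x^4+x+1$, $t=2$, and $m_0=1$. For illustration only, $\Lambda(x)$ and $\Gamma(x)$ are built directly from a known error pattern rather than by a decoder; the helper names and the error pattern are hypothetical.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[15 - LOG[a]]          # a must be nonzero


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out


def poly_eval(p, x):
    acc = 0
    for coef in reversed(p):
        acc = gf_mul(acc, x) ^ coef
    return acc


def chien_forney(lam, gamma, m0, n=15):
    """Test every alpha^-i as a root of Lambda and apply Forney's formula at each root."""
    # Formal derivative in characteristic 2: only odd-degree terms survive.
    lam_prime = [c if i % 2 == 1 else 0 for i, c in enumerate(lam)][1:]
    found = {}
    for i in range(n):
        x = EXP[(-i) % 15]                                # x = alpha^-i
        if poly_eval(lam, x) == 0:
            # x^m0 / x = x^(m0-1), evaluated at x = alpha^-i
            num = gf_mul(EXP[(-i * (m0 - 1)) % 15], poly_eval(gamma, x))
            found[i] = gf_mul(num, gf_inv(poly_eval(lam_prime, x)))
    return found


t, m0 = 2, 1
errors = {2: 5, 9: 12}                        # assumed pattern: position -> value
S, lam = [0] * (2 * t), [1]
for pos, val in errors.items():
    for j in range(2 * t):
        S[j] ^= gf_mul(val, EXP[(pos * (m0 + j)) % 15])
    lam = poly_mul(lam, [1, EXP[pos]])        # factor (1 - X x) with X = alpha^pos
gamma = poly_mul(lam, S)[: 2 * t]             # Lambda(x) S(x) mod x^(2t)
recovered = chien_forney(lam, gamma, m0)
```

In this sketch, the root test and the value computation share the same evaluation point, which mirrors how the search and the evaluation can proceed side by side.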
When using the BMA, a linear-feedback shift register $(\Lambda, L)$ with length $L$ and coefficients $\Lambda_0=1, \Lambda_1, \ldots, \Lambda_L$ produces the sequence $S_0, S_1, S_2, \ldots$ if:

$$S_j = -\sum_{i=1}^{L} \Lambda_i S_{j-i} \quad \text{for } j = L, L+1, \ldots.$$

If $(\Lambda, L)$ produces $S_0, \ldots, S_{r-1}$ but does not produce $S_0, \ldots, S_{r-1}, S_r$, then $(\Lambda, L)$ has a discrepancy $\Delta_r$ at $S_r$, where

$$\Delta_r = \sum_{i=0}^{L} \Lambda_i S_{r-i}.$$

The polynomial

$$\Lambda(x) = \sum_{i=0}^{L} \Lambda_i x^i$$

is the feedback polynomial. Note that the degree of $\Lambda(x)$ is less than or equal to $L$ because $\Lambda_L$ might be zero.
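The discrepancy can be illustrated with a short Python sketch. It assumes $GF(2^4)$ with primitive polynomial $x^4+x+1$ and a syndrome sequence from a single error of value $Y=7$ at position 4 with $m_0=1$, so that $S_j = YX^{1+j}$ with $X=\alpha^4$; the function name `discrepancy` and the example values are hypothetical.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def discrepancy(lam, L, S, r):
    """Delta_r = sum over i = 0..L of Lambda_i * S_(r-i); absent terms count as zero."""
    delta = 0
    for i in range(min(L, r) + 1):
        if i < len(lam):
            delta ^= gf_mul(lam[i], S[r - i])
    return delta


X, Y = EXP[4], 7
S = [gf_mul(Y, EXP[(4 * (1 + j)) % 15]) for j in range(4)]   # S_j = Y X^(1+j)
lam = [1, X]       # Lambda(x) = 1 - X x; minus equals plus in characteristic 2

zero_length = discrepancy([1], 0, S, 0)                      # register (1, 0) at r = 0
length_one = [discrepancy(lam, 1, S, r) for r in (1, 2, 3)]  # register (Lambda, 1)
```

Here the length-0 register has a nonzero discrepancy at $S_0$, while the length-1 register with $\Lambda(x)=1-Xx$ has zero discrepancy at $S_1$, $S_2$, and $S_3$ and therefore produces the whole sequence.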
The BMA finds the shortest linear feedback shift register that produces a given sequence. If the number of errors is less than or equal to t, the error locator polynomial is the polynomial of lowest degree that produces the syndrome sequence S0, S1, . . . , S2t−1. If the number of errors is more than t, the BMA still finds the polynomial of lowest degree that generates the syndrome sequence. However, this polynomial is usually not a locator polynomial and the decoding algorithm fails.
Referring now to FIG. 4, one BMA implementation that is disclosed in Horiguchi and in J. L. Massey, “Shift-Register Synthesis and BCH Decoding,” IEEE Trans. Inform. Theory, vol. 15, no. 1 (1969), which is hereby incorporated by reference, is shown. In step 50, variables are initialized ($\Lambda(x)\leftarrow 1$, $B(x)\leftarrow 1$, $p\leftarrow 1$, $a\leftarrow 0$, $r\leftarrow 0$, $L\leftarrow 0$, $\Delta_B\leftarrow 1$). In step 54, control determines whether $r=2t$. If true, control ends in step 58. Otherwise, the discrepancy

$$\Delta \leftarrow \sum_{i=0}^{L} \Lambda_i S_{r-i}$$

is computed in step 60.
If $\Delta=0$ in step 62, then $p\leftarrow p+1$ in step 64 and control continues with step 68. Otherwise, if $\Delta\neq 0$ and $2L>r$ as determined in step 70, then $\Lambda(x)\leftarrow\Lambda(x)-\Delta\Delta_B^{-1}x^pB(x)$ in step 72 and control continues with step 64. If $\Delta\neq 0$ and $2L\le r$ as determined in step 76, then $T(x)\leftarrow\Lambda(x)$, $\Lambda(x)\leftarrow\Lambda(x)-\Delta\Delta_B^{-1}x^pB(x)$, $B(x)\leftarrow T(x)$, $a\leftarrow r-2L$, $\Delta_B\leftarrow\Delta$, $L\leftarrow r+1-L$, and $p\leftarrow 1$ in step 80. Control continues from steps 64, 76 and 80 with step 68 where $r\leftarrow r+1$. Control continues from step 68 to step 54 until control ends in step 58. The parameter $a$ in this algorithm is not disclosed in Massey. Parameter $a$ was introduced by Horiguchi and is used for error evaluation.
$u\le t$ is the number of errors. $X_l=\alpha^{i_l}$ and $Y_l$ are the error positions and the error values, respectively (that is, $\Lambda(\alpha^{-i_l})=0$, $l=1, \ldots, u$). At the end of this algorithm:

$$\hat{B}(x) = \Delta_B^{-1} x^a B(x).$$

The error values are given by Horiguchi as follows:

$$Y_l = \frac{X_l^{-2u-m_0+3}}{\hat{B}(X_l^{-1})\,\Lambda'(X_l^{-1})}.$$

The structure of this formulation is not easily implemented in register-based VLSI in which the coefficients of $\Lambda(x)$ and $B(x)$ are stored in registers. When updating the polynomial $\Lambda(x)$ in steps 72 and 80, the relationship between the coefficients of $\Lambda(x)$ and $B(x)$ depends on the variable $p$. In other words, $\Lambda_i=\Lambda_i-\Delta\Delta_B^{-1}B_{i-p}$, where $p$ could be 1, 2, 3, and so on. This means that the circuit connections are not fixed. While multiplexers can be used to avoid this problem, this approach is not sufficiently cost effective.
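The iteration of FIG. 4 and the Horiguchi error-value expression can be sketched in software as follows. This is a minimal Python sketch under stated assumptions: $GF(2^4)$ with primitive polynomial $x^4+x+1$, $t=2$, $m_0=1$, and a hypothetical two-error pattern. The function names are illustrative and the step numbers of FIG. 4 are not modeled.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[15 - LOG[a]]          # a must be nonzero


def poly_eval(p, x):
    acc = 0
    for coef in reversed(p):
        acc = gf_mul(acc, x) ^ coef
    return acc


def bma_fig4(S, t):
    """BMA with the shift counter p and the Horiguchi parameter a."""
    lam, B = [1], [1]
    p, a, L, delta_B = 1, 0, 0, 1
    for r in range(2 * t):
        delta = 0
        for i in range(min(L, r, len(lam) - 1) + 1):
            delta ^= gf_mul(lam[i], S[r - i])
        if delta == 0:
            p += 1
            continue
        scale = gf_mul(delta, gf_inv(delta_B))
        new = lam + [0] * max(0, p + len(B) - len(lam))
        for i, b in enumerate(B):
            new[i + p] ^= gf_mul(scale, b)   # Lambda - Delta Delta_B^-1 x^p B
        if 2 * L > r:
            lam, p = new, p + 1
        else:                                # length change: keep the old Lambda in B
            B, a, delta_B, L, p = lam, r - 2 * L, delta, r + 1 - L, 1
            lam = new
    return lam, B, a, delta_B, L


def horiguchi_values(S, t, m0, n=15):
    """Locate roots of Lambda and apply the Horiguchi expression at each root."""
    lam, B, a, delta_B, u = bma_fig4(S, t)
    lam_prime = [c if i % 2 == 1 else 0 for i, c in enumerate(lam)][1:]
    found = {}
    for i in range(n):
        x = EXP[(-i) % 15]                   # X_l^-1 = alpha^-i
        if poly_eval(lam, x) == 0:
            b_hat = gf_mul(gf_mul(gf_inv(delta_B), EXP[(-i * a) % 15]), poly_eval(B, x))
            num = EXP[(i * (-2 * u - m0 + 3)) % 15]
            found[i] = gf_mul(num, gf_inv(gf_mul(b_hat, poly_eval(lam_prime, x))))
    return found


t, m0 = 2, 1
errors = {2: 5, 9: 12}                       # assumed pattern: position -> value
S = [0] * (2 * t)
for pos, val in errors.items():
    for j in range(2 * t):
        S[j] ^= gf_mul(val, EXP[(pos * (m0 + j)) % 15])
recovered = horiguchi_values(S, t, m0)
```

The index `i + p` in the update loop is the software counterpart of the variable offset between the coefficients of $\Lambda(x)$ and $B(x)$ discussed above.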
Referring now to FIG. 5, Berlekamp's formulation with a minor modification has a more regular structure and is desirable for register-based VLSI implementations. The algorithm is disclosed in R. E. Blahut, Theory and Practice of Error Control Codes, Reading, Mass., Addison-Wesley Publishing Company (1983), which is hereby incorporated by reference. The algorithm is similar to Berlekamp's formulation.
The syndrome sequence $S_0, S_1, \ldots, S_{2t-1}$ and $t$ are used as inputs. In step 100, initialization of variables is performed ($\Lambda(x)\leftarrow 1$, $B(x)\leftarrow 1$, $r\leftarrow 0$, $L\leftarrow 0$). In step 104, if $r=2t$, control ends in step 108. Otherwise, the discrepancy

$$\Delta \leftarrow \sum_{i=0}^{L} \Lambda_i S_{r-i}$$

is computed in step 110. If $\Delta\neq 0$ and $2L\le r$ in step 112, then $\delta\leftarrow 1$, $L\leftarrow r+1-L$ in step 114. Otherwise, $\delta\leftarrow 0$ in step 116. In step 120, the polynomials are updated as follows:

$$\begin{pmatrix} \Lambda(x) \\ B(x) \end{pmatrix} \leftarrow \begin{pmatrix} 1 & -\Delta x \\ \Delta^{-1}\delta & (1-\delta)x \end{pmatrix} \begin{pmatrix} \Lambda(x) \\ B(x) \end{pmatrix}$$

In step 124, $r\leftarrow r+1$ and control continues with step 104.
At each clock cycle, the register bank for $B(x)$ either shifts right, $B_i\leftarrow B_{i-1}$, corresponding to $B(x)\leftarrow xB(x)$, or parallel loads, $B_i\leftarrow\Delta^{-1}\Lambda_i$, corresponding to $B(x)\leftarrow\Delta^{-1}\Lambda(x)$. The register bank for $\Lambda(x)$ is updated by $\Lambda_i\leftarrow\Lambda_i-\Delta B_{i-1}$. Note that this algorithm does not provide an error evaluator. In W. Feng, “On Decoding Reed-Solomon Codes Up to and Beyond the Packing Radii,” Ph.D. dissertation, Univ. of Illinois at Urbana-Champaign (1999), which is hereby incorporated by reference, an error evaluator for this BMA formulation was derived as follows:

$$e_i = \begin{cases} 0 & \text{if } \Lambda(\alpha^{-i}) \neq 0, \\[4pt] \dfrac{\alpha^{-i(m_0+2t-2)}}{B^{(2t)}(\alpha^{-i})\,\Lambda'(\alpha^{-i})} & \text{if } \Lambda(\alpha^{-i}) = 0, \end{cases}$$

or equivalently:

$$Y_l = \frac{X_l^{-(m_0+2t-2)}}{B^{(2t)}(X_l^{-1})\,\Lambda'(X_l^{-1})},$$

where $B^{(2t)}(x)$ is the scratch polynomial at the end of $2t$ iterations of the algorithm shown in FIG. 5. This evaluator is functionally equivalent to the Horiguchi-Koetter evaluator.
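The register-style updates of FIG. 5 and the evaluator expressed through $B^{(2t)}(x)$ can be sketched in Python as follows. The sketch assumes $GF(2^4)$ with primitive polynomial $x^4+x+1$, $t=2$, $m_0=1$, and fixed register banks of $2t+1$ coefficients; these choices, the function names, and the error pattern are hypothetical illustrations rather than the disclosed hardware.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    return EXP[15 - LOG[a]]          # a must be nonzero


def poly_eval(p, x):
    acc = 0
    for coef in reversed(p):
        acc = gf_mul(acc, x) ^ coef
    return acc


def berlekamp_fig5(S, t):
    """Fixed-structure BMA: B either shifts right or parallel loads Delta^-1 Lambda."""
    lam = [1] + [0] * (2 * t)
    B = [1] + [0] * (2 * t)
    L = 0
    for r in range(2 * t):
        delta = 0
        for i in range(min(L, r) + 1):
            delta ^= gf_mul(lam[i], S[r - i])
        shifted = [0] + B[:-1]                                   # x B(x)
        new_lam = [c ^ gf_mul(delta, b) for c, b in zip(lam, shifted)]
        if delta != 0 and 2 * L <= r:
            inv = gf_inv(delta)
            B = [gf_mul(inv, c) for c in lam]                    # parallel load
            L = r + 1 - L
        else:
            B = shifted                                          # shift right
        lam = new_lam
    return lam, B, L


def decode_errors(S, t, m0, n=15):
    """Find roots of Lambda and evaluate the error values from B after 2t iterations."""
    lam, B, _ = berlekamp_fig5(S, t)
    lam_prime = [c if i % 2 == 1 else 0 for i, c in enumerate(lam)][1:]
    found = {}
    for i in range(n):
        x = EXP[(-i) % 15]                                       # alpha^-i
        if poly_eval(lam, x) == 0:
            num = EXP[(-i * (m0 + 2 * t - 2)) % 15]
            den = gf_mul(poly_eval(B, x), poly_eval(lam_prime, x))
            found[i] = gf_mul(num, gf_inv(den))
    return found


def syndromes_of(errors, m0, t):
    S = [0] * (2 * t)
    for pos, val in errors.items():
        for j in range(2 * t):
            S[j] ^= gf_mul(val, EXP[(pos * (m0 + j)) % 15])
    return S


t, m0 = 2, 1
errors = {2: 5, 9: 12}                       # assumed pattern: position -> value
recovered = decode_errors(syndromes_of(errors, m0, t), t, m0)
```

Because every coefficient update pairs index $i$ of $\Lambda(x)$ with index $i-1$ of $B(x)$, the loop body has the same shape in every iteration, which corresponds to fixed connections between register banks.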
Referring now to FIG. 6, the iBMA is a modification of the algorithm in FIG. 5. The syndrome sequence $S_0, S_1, \ldots, S_{2t-1}$ and $t$ are inputs. In step 130, $\Lambda(x)\leftarrow 1$, $B(x)\leftarrow 1$, $r\leftarrow 0$, $L\leftarrow 0$, $\Delta_B\leftarrow 1$. In step 134, if $r=2t$, control ends in step 136. Otherwise in step 138, the discrepancy

$$\Delta \leftarrow \sum_{i=0}^{L} \Lambda_i S_{r-i}$$

is computed. If $\Delta\neq 0$ as determined in step 142, $\Lambda(x)\leftarrow\Delta_B\Lambda(x)+\Delta xB(x)$ in step 144. Otherwise, $\Lambda(x)\leftarrow\Lambda(x)$ in step 146. If $\Delta\neq 0$ and $2L\le r$ as determined in step 146, $B(x)\leftarrow\Lambda(x)$, $L\leftarrow r+1-L$ and $\Delta_B\leftarrow\Delta$ in step 148. Otherwise, $B(x)\leftarrow xB(x)$ in step 150. In step 154, $r\leftarrow r+1$. Control continues from step 154 to step 134.
Note that the Λ(x) produced by the algorithm in FIG. 6 is a constant multiple of the Λ(x) produced by the algorithm of FIG. 5 and no longer satisfies the condition of Λ(0)=1. Λ(x) is still referred to as the error-locator polynomial because it still “locates” the errors. In other words, the roots of this polynomial point to the error locations if the number of errors in the received vector is less than or equal to the correction power t.
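The inversionless iteration can be sketched in Python to show that the resulting polynomial is a scaled locator. The sketch assumes $GF(2^4)$ with primitive polynomial $x^4+x+1$, $t=2$, $m_0=1$, and a hypothetical two-error pattern. It also assumes that $B(x)$ is loaded with the value of $\Lambda(x)$ from before the update in the same iteration; that ordering is an interpretation made for this sketch. The field inversion appears only in the final check, not in the iteration.

```python
# GF(2^4) antilog/log tables; the primitive polynomial x^4 + x + 1 is an assumed choice.
EXP, LOG = [0] * 30, [0] * 16
value = 1
for power in range(15):
    EXP[power], LOG[value] = value, power
    value <<= 1
    if value & 0x10:
        value ^= 0x13
EXP[15:] = EXP[:15]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def poly_eval(p, x):
    acc = 0
    for coef in reversed(p):
        acc = gf_mul(acc, x) ^ coef
    return acc


def ibma_fig6(S, t):
    """Inversionless BMA: Lambda <- Delta_B Lambda + Delta x B, with no field inverse."""
    lam = [1] + [0] * (2 * t)
    B = [1] + [0] * (2 * t)
    L, delta_B = 0, 1
    for r in range(2 * t):
        delta = 0
        for i in range(min(L, r) + 1):
            delta ^= gf_mul(lam[i], S[r - i])
        shifted = [0] + B[:-1]                                   # x B(x)
        if delta != 0:
            new_lam = [gf_mul(delta_B, c) ^ gf_mul(delta, b)
                       for c, b in zip(lam, shifted)]
        else:
            new_lam = lam
        if delta != 0 and 2 * L <= r:
            B, L, delta_B = lam, r + 1 - L, delta                # B gets the old Lambda
        else:
            B = shifted
        lam = new_lam
    return lam, L


t, m0 = 2, 1
errors = {2: 5, 9: 12}                       # assumed pattern: position -> value
S = [0] * (2 * t)
for pos, val in errors.items():
    for j in range(2 * t):
        S[j] ^= gf_mul(val, EXP[(pos * (m0 + j)) % 15])

lam, L = ibma_fig6(S, t)
roots = [i for i in range(15) if poly_eval(lam, EXP[(-i) % 15]) == 0]

# Check only: divide by the constant term and compare with (1 - X1 x)(1 - X2 x).
scale_inv = EXP[15 - LOG[lam[0]]]
normalized = [gf_mul(c, scale_inv) for c in lam]
X1, X2 = EXP[2], EXP[9]
expected = [1, X1 ^ X2, gf_mul(X1, X2), 0, 0]
```

In this sketch, `roots` lists the error positions directly from the scaled polynomial, and `normalized` coincides with the locator that satisfies $\Lambda(0)=1$.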
As can be appreciated from the foregoing, the steps performed by RS decoders can be complex and can involve a large number of calculations. Reducing the number of calculations and increasing the speed of RS decoding would be desirable.