It will be recalled that channel “block encoding” consists of introducing a certain level of redundancy into the data when the “codewords” sent to a receiver or recorded on a data carrier are formed. More particularly, each codeword transmits the information initially contained in a predetermined number k of symbols taken from an “alphabet” of finite size q; on the basis of these k information symbols, a number n>k of symbols belonging to that alphabet is calculated, and these constitute the components of the codeword: v≡(v_0, v_1, . . . , v_{n−1}) (the symbol “≡” means “by definition”). The set of codewords obtained when each information symbol takes some value in the alphabet constitutes a sort of dictionary referred to as a “code” of “dimension” k and “length” n.
When the size q of the “alphabet” is a power of a prime number, the alphabet can be given the structure of what is known as a “Galois field” denoted Fq, of which the non-zero elements may conveniently be identified as each being equal to γ^i for a corresponding value of i, where i=1, . . . , q−1, and where γ is a primitive (q−1)th root of unity in Fq.
In particular, certain codes, termed “linear codes”, are such that any linear combination of codewords (with the coefficients taken from the alphabet) is still a codeword. These codes may conveniently be associated with a matrix H of dimension (n−k)×n, termed “parity check matrix”: a word v of given length n is a codeword if, and only if, it satisfies the relationship: H·v^T=0 (where the exponent T indicates the transposition); the code is then said to be “orthogonal” to the matrix H.
At the receiver, the associated decoding method then judiciously uses this redundancy to detect any transmission errors and if possible to correct them. There is a transmission error if the difference e between a received word r and the corresponding codeword v sent by the transmitter is non-zero.
More particularly, the decoding is carried out in two main steps.
The first step consists of associating an “associated codeword” with the received word. To do this, the decoder first of all calculates the vector of “error syndromes” s≡H·r^T=H·e^T. If the syndromes are all zero, it is assumed that no transmission error has occurred, and the “associated codeword” will then simply be taken to be equal to the received word. If that is not the case, it is thereby deduced that the received word is erroneous, and a correction algorithm is then implemented which is adapted to estimate the value of the error e; the algorithm will thus provide an estimated value ê such that (r−ê) is a codeword, which will then constitute the associated codeword. Usually, this first step is divided into two substeps: first, the components of the received word whose value is erroneous are identified, and then the corrected value of those components is calculated.
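By way of illustration, the syndrome calculation s≡H·r^T=H·e^T may be sketched as follows. This is only a minimal sketch, assuming that q is a prime number so that the field arithmetic reduces to integer arithmetic modulo q; the matrix H, the codeword v and the error e are hypothetical toy values chosen for the example.

```python
def syndrome(H, r, q):
    """Return s = H . r^T over GF(q), assuming q is prime (toy assumption)."""
    return [sum(h * x for h, x in zip(row, r)) % q for row in H]

# Hypothetical toy code over GF(5): length n = 4, dimension k = 2.
q = 5
H = [[1, 1, 1, 1],
     [1, 2, 3, 4]]

v = [1, 3, 1, 0]                          # a codeword: H . v^T = 0
e = [0, 0, 2, 0]                          # a single transmission error
r = [(a + b) % q for a, b in zip(v, e)]   # received word r = v + e

s_v = syndrome(H, v, q)   # all zero: no error is assumed
s_r = syndrome(H, r, q)   # non-zero: the received word is erroneous
s_e = syndrome(H, e, q)   # equal to s_r, since H . v^T = 0
```

The syndrome of the received word thus depends only on the error e, which is what allows a correction algorithm to estimate e without knowing v.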
The second step simply consists in reversing the encoding method. In the ideal situation in which all the transmission errors have been corrected, the initial information symbols are thereby recovered.
It will be noted that in the context of the present invention, reference will often be made to “decoding” for brevity, to designate solely the first of those steps, it being understood that the person skilled in the art is capable without difficulty of implementing the second step.
The purpose of an error correction algorithm is to associate with the received word the codeword situated at the shortest Hamming distance from that received word, the “Hamming distance” being, by definition, the number of places where two words of the same length have a different symbol. The shortest Hamming distance between two different codewords of a code is termed the “minimum distance” d of that code. This is an important parameter of the code. More particularly, it is in principle possible to find the position of the possible errors in a received word, and to provide the correct replacement symbol (i.e. the one identical to that sent by the transmitter) for each of those positions, each time the number of erroneous positions is at most equal to INT[(d−1)/2] (where “INT” designates the integer part) for a code of minimum distance d (for certain error configurations, it is sometimes even possible to achieve better). However, in all cases, this is only a possibility in principle, since it is often difficult to develop a decoding algorithm achieving such performance. It should also be noted that, when the chosen algorithm manages to propose a correction for the received word, that correction is all the more reliable (at least, for most transmission channels) the smaller the number of positions it concerns.
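The notions of Hamming distance, minimum distance and the bound INT[(d−1)/2] may be illustrated by the following sketch. It uses an exhaustive search over a hypothetical toy code (a binary repetition code of length 5), which is of course only practicable for very small codes and is not one of the decoding algorithms discussed here.

```python
def hamming_distance(a, b):
    """Number of places where two words of the same length differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    """Shortest Hamming distance between two different codewords."""
    return min(hamming_distance(a, b)
               for i, a in enumerate(code) for b in code[i + 1:])

def nearest_codeword(code, r):
    """Brute-force association of a received word with the closest codeword."""
    return min(code, key=lambda c: hamming_distance(c, r))

# Hypothetical toy code: binary repetition code of length 5.
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
d = minimum_distance(code)
t_max = (d - 1) // 2            # INT[(d-1)/2] correctable errors

r = (1, 0, 1, 0, 0)             # all-zero codeword with 2 errors
associated = nearest_codeword(code, r)
```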
The capability of a correction algorithm to propose a correction of a received word is faithfully represented by the formula:
        2t≦Δ,
where t is the number of erroneous symbols in the received word, and Δ is a strictly positive integer which we will call the “solving capability” of the algorithm. If the value of (2t) is less than or equal to the solving capability, the correction algorithm will be capable of correcting the received word. If the value of (2t) is greater than the solving capability, the algorithm can:
        either simply fail in its correction attempt,
        or be capable of proposing a correction of the received word; in this case, if that correction is accepted, the risk is taken of it being erroneous, i.e. that the codeword proposed is not in fact the word sent; clearly, the greater (2t) is with respect to Δ, the higher the risk.
Taking into account the above considerations concerning the minimum distance d of the code, the algorithm considered will be said to be “maximum” if Δ=d−1, and “sub-maximum” if Δ<d−1.
Among known codes, “Reed-Solomon” codes may be cited, which are reputed for their efficiency. They are linear codes, of which the minimum distance d is equal to (n−k+1). The parity check matrix H of the Reed-Solomon code of dimension k and length n (where n is necessarily equal to (q−1) or a divisor of (q−1)) is a matrix with (n−k) lines and n columns, which has the structure of a Vandermonde matrix. This parity check matrix H may for example be defined by taking H_{ij}=α^{(i+1)j} (0≦i≦n−k−1, 0≦j≦n−1), where α is a primitive nth root of unity in Fq. For more details on Reed-Solomon codes, reference may for example be made to the work by R. E. Blahut entitled “Theory and practice of error-control codes”, Addison-Wesley, Reading, Mass., 1983. If the parity check matrix H of dimension (n−k)×n of a Reed-Solomon code is replaced by a matrix H′ obtained by deleting certain columns of H, the code orthogonal to H′ is said to be a “shortened” Reed-Solomon code.
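The Vandermonde structure of this parity check matrix may be illustrated by the following sketch. It assumes a prime alphabet size (q=7, with n=6, k=2 and α=3, all chosen purely for the example) and builds codewords as multiples of the polynomial whose roots are α, α^2, . . . , α^(n−k); that construction is a standard one but is an assumption of the example, not something stated above.

```python
def rs_parity_check(n, k, alpha, q):
    """H[i][j] = alpha^((i+1)*j) mod q, for 0 <= i < n-k and 0 <= j < n."""
    return [[pow(alpha, (i + 1) * j, q) for j in range(n)]
            for i in range(n - k)]

def poly_mul(a, b, q):
    """Product of two polynomials (lowest degree first) over GF(q), q prime."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % q
    return out

def rs_encode(message, n, k, alpha, q):
    """Assumed encoder: v(x) = m(x) * prod_{i=1..n-k} (x - alpha^i)."""
    g = [1]
    for i in range(1, n - k + 1):
        g = poly_mul(g, [(-pow(alpha, i, q)) % q, 1], q)
    return poly_mul(message, g, q)

def syndrome(H, r, q):
    return [sum(h * x for h, x in zip(row, r)) % q for row in H]

# Toy parameters (assumptions): GF(7), alpha = 3 has order 6 modulo 7.
q, n, k, alpha = 7, 6, 2, 3
H = rs_parity_check(n, k, alpha, q)
v = rs_encode([1, 2], n, k, alpha, q)

# Smallest number of non-zero components over all non-zero codewords.
min_weight = min(
    sum(c != 0 for c in rs_encode([a, b], n, k, alpha, q))
    for a in range(q) for b in range(q) if (a, b) != (0, 0)
)
```

For a linear code the minimum distance equals the smallest weight of a non-zero codeword, so `min_weight` can be compared with (n−k+1).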
For modern information carriers, for example on computer hard disks, CDs (“compact discs”) and DVDs (“digital video discs”), it is sought to increase the density of information. When such a carrier is affected by a physical defect such as a scratch, a high number of information symbols may be rendered unreadable. This problem may nevertheless be remedied by using a very long code. However, as indicated above, the length n of the words in Reed-Solomon codes is less than the size q of the alphabet of the symbols. Consequently, if a Reed-Solomon code is desired having codewords of great length, high values of q must be envisaged, which leads to costly implementations in terms of calculation and storage in memory. Moreover, high values of q are sometimes ill-adapted to the technical application envisaged. For this reason, it has been sought to build codes which naturally provide words of greater length than Reed-Solomon codes.
In particular, so-called “algebraic geometric codes” or “Goppa geometric codes” have recently been proposed (see for example “Algebraic Geometric Codes” by J. H. van Lint, in “Coding Theory and Design Theory” 1st part, IMA Volumes Math. App., volume 20, Springer-Verlag, Berlin, 1990). These codes are constructed from a set of n pairs (x, y) of symbols belonging to a chosen Galois field Fq; this set of pairs is termed a “locating set”. In general terms, there is an algebraic equation with two unknowns X and Y such that the pairs (x, y) of that locating set are all solutions of that algebraic equation. The values of x and y of these pairs may be considered as coordinates of points Pj (where j=1, . . . , n) forming an “algebraic curve”. Furthermore, in the context of the present invention, the set of the pairs (x, y) having the same value of x will be said to constitute an “aggregate”.
An important parameter of such a curve is its “genus” g. In the particular case where the curve is a simple straight line (the genus g is then zero), the algebraic geometric code reduces to a Reed-Solomon code. In certain cases, algebraic geometric codes make it possible to achieve a length equal to (q+2g√q), which may be very high; for example, with an alphabet size of 256 and a genus equal to 120, codewords are obtained of length 4096.
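The numerical example above can be checked with the following short sketch. For simplicity it assumes that q is a perfect square, so that √q is an integer; this restriction belongs to the sketch only.

```python
from math import isqrt

def max_length(q, g):
    """Length q + 2*g*sqrt(q), assuming q is a perfect square (sketch only)."""
    root = isqrt(q)
    if root * root != q:
        raise ValueError("this sketch assumes q is a perfect square")
    return q + 2 * g * root

length = max_length(256, 120)   # 256 + 2 * 120 * 16
```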
For a “one point” algebraic geometric code a parity check matrix is conventionally defined as follows. With every monomial h≡X^s Y^t, where s and t are positive integers or zero, a “weight” is associated (see below for details). If, for an integer ρ≧0, there is at least one monomial of which the weight is ρ, it is said that ρ is an “achievable” weight. Let ρ_1<ρ_2< . . . <ρ_{n−k} be the (n−k) smallest achievable weights, and let h_i (where i=1, . . . , n−k) be a monomial of weight ρ_i. The element in line i and column j of the parity check matrix is equal to the monomial h_i evaluated at the point Pj (where, it may be recalled, j=1, . . . , n) of the algebraic curve. Each point Pj then serves to identify the jth component of any codeword.
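The construction just described may be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the curve Y^2=X^3+X+1 over the prime field of size 5, the weight function 2s+3t (the usual choice for a curve of that form, whereas the weights are defined in detail elsewhere), and the helper names themselves.

```python
def ag_parity_check(points, n_minus_k, weight, p, max_deg=10):
    """Build H from one monomial X^s Y^t per achievable weight (sketch)."""
    by_weight = {}
    for s in range(max_deg + 1):
        for t in range(max_deg + 1):
            # keep the first monomial found for each achievable weight
            by_weight.setdefault(weight(s, t), (s, t))
    chosen = [by_weight[w] for w in sorted(by_weight)[:n_minus_k]]
    H = [[pow(x, s, p) * pow(y, t, p) % p for (x, y) in points]
         for (s, t) in chosen]
    return chosen, H

# Assumed toy curve over GF(5): Y^2 = X^3 + X + 1.
p = 5
points = [(x, y) for x in range(p) for y in range(p)
          if (y * y - (x ** 3 + x + 1)) % p == 0]

# Assumed weight function for this curve: weight(X^s Y^t) = 2s + 3t.
monomials, H = ag_parity_check(points, 3, lambda s, t: 2 * s + 3 * t, p)
```

With these assumptions the three smallest achievable weights are 0, 2 and 3, so the lines of H are the monomials 1, X and Y evaluated at the n=8 points of the locating set.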
Algebraic geometric codes are advantageous as to their minimum distance, and, as has been said, as to the length of the codewords, but they have the drawback of requiring decoding algorithms that are rather complex, and thus rather expensive in terms of equipment (software and/or hardware) and processing time. This complexity is in fact greater or lesser according to the algorithm considered, a greater complexity being in principle the price to pay for increasing the error correction capability of the decoder (see for example the article by Tom Høholdt and Ruud Pellikaan entitled “On the Decoding of Algebraic-Geometric Codes”, IEEE Trans. Inform. Theory, vol. 41 no. 6, pages 1589 to 1614, November 1995).
It should be noted that for these algorithms, most often only a lower bound of their solving capability Δ is available, except in the “trivial” case of a maximum algorithm for correction of Reed-Solomon codes, called the “Berlekamp-Massey algorithm”, for which the solving capability is precisely known and is equal to Δ=n−k.
An algorithm for decoding algebraic geometric codes defined on a curve of non-zero genus, termed “basic” algorithm, has been proposed by A. N. Skorobogatov and S. G. Vlăduţ in the article entitled “On the Decoding of Algebraic-Geometric Codes” (IEEE Trans. Inform. Theory, vol. 36 no. 5, pages 1051 to 1060, November 1990). That algorithm comprises:
        a) constructing a “syndromes matrix” S of dimension (n−k)×(n−k), of which each coefficient S_{ij}, where j is less than or equal to a “boundary” value w(i), is equal to a judiciously chosen linear combination of the elements s_v (v=1,2, . . . , n−k) of the syndrome s, the coefficients S_{ij} beyond the boundary being undetermined; it is conveniently arranged for the order of the lines of that syndromes matrix S to be such that the function w(i) is decreasing, that is to say that w(i)≧w(i+1) for all i=1,2, . . . , n−k−1;
        b) considering the system of linear equations
$$\sum_{i=1}^{\beta} l_i S_{ij} = 0, \quad \text{for } j = 1, 2, \ldots, w(\beta), \tag{1}$$
where the unknowns l_i belong to the same alphabet of symbols as the elements of the codewords, and where β is an integer between 1 and (n−k) such that the system permits a non-trivial solution (that is to say a solution in which the coefficients l_i are not all zero), and determining the values of the coefficients l_i corresponding to the smallest possible value of β, which will be denoted λ;
        c) calculating the roots of the “error-locating polynomial”
$$\Lambda(x, y) \equiv \sum_{i=1}^{\lambda} l_i h_i(x, y), \tag{2}$$
these roots comprising all the pairs (x,y) corresponding to positions of the received word for which the component in that position has suffered a transmission error; and
        d) correcting the erroneous symbols of the received word of which the position is now known.
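Step b) of the algorithm above, namely the search for the smallest β for which the system (1) admits a non-trivial solution, may be sketched as follows. The sketch assumes a prime alphabet size p (so that inverses are obtained with Python's three-argument `pow`, available from Python 3.8), represents undetermined coefficients by `None`, and uses a hypothetical 3×3 syndromes matrix; it is not the implementation of the cited article.

```python
def nontrivial_solution(A, ncols, p):
    """Return a non-zero x with A . x = 0 over GF(p), or None if none exists."""
    rows = [row[:] for row in A]
    pivots = {}                       # pivot column -> row index
    r = 0
    for c in range(ncols):
        pr = next((i for i in range(r, len(rows)) if rows[i][c] % p), None)
        if pr is None:
            continue                  # no pivot: column c is free
        rows[r], rows[pr] = rows[pr], rows[r]
        inv = pow(rows[r][c], -1, p)
        rows[r] = [(x * inv) % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] % p:
                f = rows[i][c]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[r])]
        pivots[c] = r
        r += 1
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    x = [0] * ncols
    x[free[0]] = 1
    for c, i in pivots.items():
        x[c] = (-rows[i][free[0]]) % p
    return x

def smallest_beta(S, w, p):
    """Smallest beta (1-based) such that system (1) has a non-trivial solution."""
    for beta in range(1, len(S) + 1):
        # one equation per column j <= w(beta), one unknown l_i per line i <= beta
        A = [[S[i][j] for i in range(beta)] for j in range(w[beta - 1])]
        l = nontrivial_solution(A, beta, p)
        if l is not None:
            return beta, l
    return None

# Hypothetical syndromes matrix over GF(5); None marks an undetermined entry.
S = [[1, 2, 3],
     [2, 4, 1],
     [3, 1, None]]
w = [3, 3, 2]                         # boundary values w(1) >= w(2) >= w(3)
lam, coeffs = smallest_beta(S, w, 5)
```

Because w(i) is decreasing, every coefficient read for a given β lies within the boundary, so the undetermined entries are never accessed. The returned coefficients are those of the error-locating polynomial of equation (2).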
The “basic” algorithm guarantees a solving capability at least equal to Δ=n−k−2g. However, the minimum distance d for an algebraic geometric code is at least equal to (n−k+1−g). It is thus clear that the basic algorithm is “sub-maximum”, and this is all the more so the greater the genus g of the algebraic curve.
With the aim of improving the solving capability, Skorobogatov and Vlăduţ proposed, in the same article cited above, a “modified” version of the “basic” algorithm. This “modified” algorithm has a solving capability at least equal to Δ=n−k−g−s, where s is a parameter dependent on the algebraic curve chosen, which may furthermore sometimes be zero (this is the case for example for so-called “hyperelliptic” algebraic curves).
Algorithms are also known (which may be maximum or sub-maximum according to the manner in which they are implemented) which operate according to an iterative principle: each new iteration of such an algorithm uses an additional component of the syndromes vector s≡H·r^T.
An example of such an iterative decoding algorithm is disclosed in the article by M. Sakata et al. entitled “Generalized Berlekamp-Massey Decoding of Algebraic-Geometric Codes up to Half the Feng-Rao Bound” (IEEE Trans. Inform. Theory, vol. 41, pages 1762 to 1768, November 1995). This algorithm can be viewed as a generalization of the Berlekamp-Massey algorithm to algebraic geometric codes defined on a curve of non-zero genus.
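The guaranteed solving capabilities of the “basic” and “modified” algorithms may be compared with the lower bound on the minimum distance by means of the following sketch. The numerical parameters (n=64, k=40, g=6, s=0) are hypothetical and serve only to illustrate the formulas; the number of errors guaranteed to be corrected follows from the condition 2t≦Δ.

```python
def basic_capability(n, k, g):
    """Guaranteed solving capability of the basic algorithm: n - k - 2g."""
    return n - k - 2 * g

def modified_capability(n, k, g, s):
    """Guaranteed solving capability of the modified algorithm: n - k - g - s."""
    return n - k - g - s

def min_distance_lower_bound(n, k, g):
    """Lower bound on the minimum distance d: n - k + 1 - g."""
    return n - k + 1 - g

def guaranteed_errors(delta):
    """Largest t satisfying 2t <= delta."""
    return max(delta, 0) // 2

# Hypothetical parameters, for illustration only.
n, k, g, s = 64, 40, 6, 0
delta_basic = basic_capability(n, k, g)
delta_modified = modified_capability(n, k, g, s)
d_low = min_distance_lower_bound(n, k, g)
gap = (d_low - 1) - delta_basic          # equals the genus g
```

With these values the basic algorithm guarantees the correction of 6 errors, whereas the bound on the minimum distance would in principle allow 9; the gap between (d−1) and the basic solving capability is at least the genus g.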
Another example of an iterative decoding algorithm has been disclosed by M. O'Sullivan in the article “A Generalization of the Berlekamp-Massey-Sakata Algorithm” (preprint 2001).
For any received word r, the set of error locating polynomials defined above associated with the transmission errors affecting that word is termed a “Gröbner ideal”. It is possible to generate this Gröbner ideal by means of a finite set of polynomials which constitutes what is known as a “Gröbner basis” of the ideal. The O'Sullivan algorithm which has just been cited produces such a Gröbner basis from a matrix S* obtained by “extending” the matrix S, that is to say by calculating the value of certain elements S*_{ij}, for j greater than w(i). This extension is possible each time the number of errors in the received word is less than or equal to (n−k+1−g)/2.
When the number of errors in the received word is less than or equal to (n−k+1−g)/2, it is in general necessary to know further elements of the syndromes matrix than those obtained from the components of the error syndromes vector s, to be able to correct those errors. It is fortunately possible to calculate these elements of “unknown” value by a method comprising a certain number of “majority decisions”, for example by using the “Feng-Rao algorithm”. This algorithm, the essential object of which is to extend the matrix S by providing the value of at most g “unknown” elements per line of S, is disclosed in the article by G.-L. Feng and T. R. N. Rao entitled “Decoding Algebraic-Geometric Codes up to the Designed Minimum Distance” (IEEE Trans. Inform. Theory, vol. 39, No. 1, January 1993); more details about this algorithm are given below.
It will be noted that the calculation of the elements of unknown value may either be performed prior to the decoding algorithm (this is normally the case for the Sakata algorithm mentioned above, which also uses the “extended” matrix S*), or be integrated with the steps of the decoding algorithm (this is the case for the O'Sullivan algorithm).
Thus let S* be the syndromes matrix so “extended”, and consider the system of linear equations
$$\sum_{i=1}^{\mu} l_i S^{*}_{ij} = 0, \quad \text{for } j = 1, 2, \ldots, w^{*}(\mu), \tag{1*}$$
in which the unknown values l_i are to be found in the same alphabet as the symbols of the codewords, and in which μ is such that for j=1, . . . , w*(μ), S*_{μj} is known, either directly from the components of the syndromes vector, or indirectly by using an algorithm for calculating the unknown matrix elements (the similarity will of course have been noted between that equation (1*) and the equation (1) used in the “basic” algorithm; in general, it is found that w*(λ)>w(λ), in other words, the “boundary” is pushed further back when S and S* are compared).
For any non-trivial solution of the system (1*), that is to say a solution where the coefficients l_i are not all zero, the polynomial
$$\Lambda(x, y) \equiv \sum_{i=1}^{\mu} l_i h_i(x, y) \tag{2*}$$
is an error locating polynomial.
Now consider the case in which the number of errors is greater than (n−k+1−g)/2. The “majority decision” algorithms, such as the Feng-Rao algorithm, are not then always capable of providing an adequate value for the unknown elements of S, and consequently, a decoding algorithm using the values of those matrix elements cannot be used.
In general terms, the principle of iterative decoding (such as in the algorithms cited by way of example above) is the following. Each iteration, except for the last, of an iterative decoding algorithm uses a sub-matrix of the matrix S*. Such an iteration then provides a certain number of polynomials which, like the true error locating polynomials of equation (2*), are constituted by a linear combination of monomials hi, but for which it is not certain that the common roots include all the erroneous positions of the received word r. In other words, the status of error locating polynomials is guaranteed for all the polynomials obtained by the algorithm only provided that the number of errors is less than or equal to (n−k+1−g)/2 and that the iterations are continued right to the end, which requires the use of the elements of matrix S*ij where j>w(i).
It is for this reason that, according to the state of the art, an error whose correction requires those elements of the matrix S* is considered impossible to correct when those elements cannot be calculated.