This invention is in the field of data communication, and is more specifically directed to error correction methods in the receipt of such communications.
Recent advances in the electronics field have now made high-speed digital data communications prevalent in many types of applications and uses. Digital communication techniques are now used for communication of audio signals for telephony, with video telephony now becoming available in some locations. Digital communication among computers is also prevalent, particularly with the advent of the Internet; of course, computer-to-computer networking by way of dedicated connections (e.g., local-area networks) and also by way of dial-up connections has also become prevalent in recent years.
Of course, the quality of communications carried out in these ways depends upon the accuracy with which the received signals match the transmitted signals. Some types of communications, such as audio communications, can withstand bit loss to a relatively large degree. However, the communication of digital data, especially of executable programs, requires exact fidelity in order to be at all useful. Accordingly, various techniques for the detection and correction of errors in communicated digital bit streams have been developed. Indeed, error correction techniques have effectively enabled digital communications to be carried out over available communication facilities, such as existing telephone lines, despite the error rates inherent in high-frequency communication over these facilities.
Error correction may also be used in applications other than the communication of data and other signals over networks. For example, the retrieval of stored data by a computer from its own magnetic storage devices also typically utilizes error correction techniques to ensure exact fidelity of the retrieved data; such fidelity is, of course, essential in the reliable operation of the computer system from executable program code stored in its mass storage devices. Digital entertainment equipment, such as compact disc players, digital audio tape recorders and players, and the like also now typically utilize error correction techniques to provide high fidelity output.
An important class of error detection and error correction techniques is referred to as Reed-Solomon coding, and was originally described in Reed and Solomon, "Polynomial Codes over Certain Finite Fields", J. Soc. for Industrial and Applied Mathematics, Vol. 8 (SIAM, 1960), pp. 300-304. Reed-Solomon coding uses finite-field arithmetic, such as Galois field arithmetic, to map blocks of a communication into larger blocks. In effect, each coded block corresponds to an over-specified polynomial based upon the input block. Considering a message as made up of k m-bit elements, a polynomial of degree n-1 may be determined as having n coefficients; with n greater than k (i.e., the polynomial is overspecified), not all of the n coefficients need be valid in order to fully and accurately recover the message. According to Reed-Solomon coding, the number t of errors that may be corrected is determined by the relationship between n and k, according to EQU t=(n-k)/2
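As a small worked illustration of the relationship between n, k, and t, the following sketch assumes the usual Reed-Solomon relation in which each pair of check symbols permits one symbol error to be corrected (consistent with the sixteen check symbols and eight correctable errors of the FIG. 1 example). The function name `correctable_errors` is hypothetical, and the value k=188 for the n=204 frame is an assumption derived from t=8 rather than a figure stated in this description.

```python
def correctable_errors(n: int, k: int) -> int:
    """Number t of symbol errors correctable by an (n, k) Reed-Solomon code.

    Assumes the usual relation t = floor((n - k) / 2), where n is the
    codeword length and k the message length, both counted in symbols.
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    return (n - k) // 2


# Maximum-length codeword for eight-bit symbols with sixteen check symbols.
t_full = correctable_errors(255, 239)

# Shortened frame of 204 bytes; k = 188 is assumed here so that t = 8.
t_short = correctable_errors(204, 188)
```

Both calls yield the same correction capability because the number of check symbols, n-k, is sixteen in each case.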
Reed-Solomon encoding is used to generate the encoded message in such a manner that, upon decoding of the received encoded message, the number and location of any errors in the received message may be determined. Conventional Reed-Solomon encoder and decoder functions are generally implemented, in microprocessor-based architectures, as dedicated hardware units that are not in the datapath of the central processing unit (CPU) of the system, as CPU functionality has not heretofore been extended to include these functions.
In this regard, FIG. 1 illustrates one example of an architecture for a conventional Reed-Solomon encoder, for the example where each symbol is eight bits, or one byte, in size (i.e., m=8), where Galois field arithmetic is used such that the size of the Galois field is 2.sup.8, and where the maximum codeword length is 2.sup.8 -1, or 255 symbols. Of course, other architectures may be used to derive the encoded codeword for the same message and checksum parameters, or of course for other symbol sizes, checksum lengths, or maximum codeword lengths. In the example of FIG. 1, sixteen check symbols are generated for each codeword, and as such eight errors per codeword may be corrected. According to conventional Reed-Solomon encoding, the k message bytes in the codeword (M.sub.k-1, M.sub.k-2, . . . , M.sub.0) are used to generate the check symbols (C.sub.15, C.sub.14, . . . , C.sub.0). The check symbols C are the coefficients of a polynomial C(x) EQU C(x)=C.sub.15 x.sup.15 +C.sub.14 x.sup.14 + . . . +C.sub.0
which is the remainder of the division of a message polynomial M(x), having the message bytes as coefficients: EQU M(x)=M.sub.k-1 x.sup.k-1 +M.sub.k-2 x.sup.k-2 + . . . +M.sub.0
where the message polynomial M(x) is multiplied by the term x.sup.2t, and divided by a divisor referred to as generator polynomial G(x): EQU G(x)=(x-a.sup.0)(x-a.sup.1)(x-a.sup.2) . . . (x-a.sup.15)=x.sup.16 +G.sub.15 x.sup.15 +G.sub.14 x.sup.14 + . . . +G.sub.0
where a is a root of the binary primitive polynomial x.sup.8 +x.sup.4 +x.sup.3 +x.sup.2 +1.
The exemplary architecture of FIG. 1 includes sixteen eight-bit shift register latches 6.sub.15 through 6.sub.0, which will contain the remainder values from the polynomial division, and thus will present the checksum coefficients C.sub.15 through C.sub.0, respectively. An eight-bit exclusive-OR function 8.sub.15 through 8.sub.1 is provided between each pair of shift register latches 6 to effect Galois field addition, with XOR function 8.sub.15 located between latches 6.sub.15 and 6.sub.14, and so on. The feedback path produced by exclusive-OR function 2, which receives both the input symbol and the output of the last latch 6.sub.15, presents the quotient for each division step. This quotient is broadcast to sixteen constant Galois field multipliers 4.sub.15 through 4.sub.0, which multiply the quotient by respective ones of the coefficients G.sub.15 through G.sub.0.
In operation, the first k symbols contain the message itself, and are output directly as the leading portion of the codeword. Each of these message symbols enters the encoder architecture of FIG. 1 on lines IN, and is applied to the division operation carried out by this encoder. Upon completion of the operations of the architecture of FIG. 1 upon these message bytes, the remainder values retained in shift register latches 6.sub.15 through 6.sub.0 correspond to the checksum symbols C.sub.15 through C.sub.0, and are appended to the encoded codeword after the k message symbols.
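The shift-register division described above can be sketched in software as follows. This is a minimal illustration only, not the FIG. 1 hardware: it assumes eight-bit symbols, the primitive polynomial x.sup.8 +x.sup.4 +x.sup.3 +x.sup.2 +1 (written as 0x11D), the element a represented as the byte value 2, and sixteen check symbols. The names `EXP`, `LOG`, `gf_mul`, `generator_poly`, and `encode` are hypothetical.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

# Power and logarithm tables for GF(256), with a = 2 (assumption).
EXP = [0] * 255
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY


def gf_mul(a: int, b: int) -> int:
    """Galois field product of two bytes via the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def generator_poly(nroots: int = 16) -> list:
    """Coefficients of G(x) = (x - a^0)...(x - a^(nroots-1)), highest degree first."""
    g = [1]
    for i in range(nroots):
        root = EXP[i]
        nxt = g + [0]                      # g(x) * x
        for k, coeff in enumerate(g):      # plus g(x) * a^i (subtraction is XOR)
            nxt[k + 1] ^= gf_mul(coeff, root)
        g = nxt
    return g


def encode(message, nroots: int = 16) -> list:
    """Append nroots check symbols: the remainder of M(x) * x^nroots divided by G(x)."""
    g = generator_poly(nroots)
    reg = [0] * nroots                     # reg[0] plays the role of latch 6_15
    for sym in message:
        feedback = sym ^ reg[0]            # role of exclusive-OR function 2
        reg = reg[1:] + [0]                # shift toward the high-order latch
        for k in range(nroots):
            reg[k] ^= gf_mul(feedback, g[k + 1])  # constant multipliers and XORs
    return list(message) + reg


codeword = encode([1, 2, 3, 4, 5])
```

Because the check symbols are the remainder of the division, the resulting codeword polynomial is divisible by G(x), so evaluating it at any of the sixteen roots a.sup.0 through a.sup.15 gives zero.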
The encoded codewords are then communicated in a digital bitstream, after the appropriate formatting. For communications over telephone facilities, of course, the codewords may be communicated either digitally or converted to analog signals; digital network or intracomputer communications will, of course, maintain the codewords in their digital format. Regardless of the communications medium, errors may occur in the communicated signals, and will be reflected in the received bitstream as opposite binary states from those in the input bitstream, prior to the encoding process of FIG. 1. These errors are sought to be corrected in the decoding process, as will now be described in a general manner relative to FIG. 2.
An example of the decoding of Reed-Solomon encoded codewords, generated for example by the architecture of FIG. 1, is conventionally carried out in the manner now to be described relative to decoder 10 illustrated in FIG. 2. Decoder 10 receives an input bitstream of codeword symbols, which is considered, for a single codeword, as received polynomial r(x) in FIG. 2. Received polynomial r(x) is applied to syndrome accumulator 12, which generates a syndrome polynomial s(x) of the form:
s(x)=s.sub.i-1 x.sup.i-1 +s.sub.i-2 x.sup.i-2 + . . . +s.sub.1 x+s.sub.0
Syndrome polynomial s(x) is indicative of whether errors were introduced into the communicated signals over the communication facility. If s(x)=0, no errors were present, but if s(x) is non-zero, one or more errors are present in the codeword under analysis. Syndrome polynomial s(x), in the form of a sequence of coefficients, is then forwarded to Euclidean array function 15.
Euclidean array function 15 generates two polynomials .LAMBDA.(x) and .OMEGA.(x) based upon the syndrome polynomial s(x) received from syndrome accumulator 12. The degree .nu. of polynomial .LAMBDA.(x) indicates the number of errors in the codeword, and is forwarded to Chien search function 16 for additional analysis. Polynomial .OMEGA.(x) is forwarded to Forney function 18, which uses polynomial .OMEGA.(x) to evaluate the error in the received bitstream r(x). The roots of error locator polynomial .LAMBDA.(x) are determined by Chien search function 16, and are expressed as zeroes polynomial X(x) from which Forney function 18 determines the error magnitude polynomial M(x). Chien search function 16 also forwards zeroes polynomial X(x) to error position circuit 17 which generates error position polynomial P(x) therefrom. Error magnitude polynomial M(x) and error position polynomial P(x) are forwarded to input ring buffer 19 as an indication of the magnitude and position, respectively, of the errored symbols in the bitstream r(x), which is also forwarded to input ring buffer 19. Input ring buffer 19 then generates the output bitstream i'(x) by effectively subtracting the designated error magnitude from bitstream r(x) at the identified positions of the error, so that output bitstream i'(x) faithfully represents the bitstream as originally transmitted.
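The final correction step performed by the ring buffer can be illustrated with a short sketch. It assumes that error positions and magnitudes are supplied as plain lists of symbol indices and byte values (a simplification of the polynomials P(x) and M(x) of FIG. 2), and relies on the fact that subtraction in a Galois field of characteristic two is a bit-wise exclusive-OR. The function name `apply_corrections` is hypothetical.

```python
def apply_corrections(received, positions, magnitudes):
    """Return a corrected copy of the received symbols.

    positions[i] is the index of an errored symbol and magnitudes[i] the
    corresponding error value; subtracting in GF(2^m) is an XOR.
    """
    corrected = list(received)
    for pos, mag in zip(positions, magnitudes):
        corrected[pos] ^= mag
    return corrected


sent = [0x12, 0x34, 0x56, 0x78]
received = [0x12, 0x34, 0x56 ^ 0xFF, 0x78]   # one symbol corrupted by 0xFF
restored = apply_corrections(received, [2], [0xFF])
```

Here `restored` equals `sent`, since applying the same error value a second time cancels it.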
The use of programmable devices such as microprocessors and digital signal processors (DSPs), such as the TMS320c6x family of DSPs manufactured and sold by Texas Instruments Incorporated, is generally favored in modern data processing and communications applications, making it desirable to execute such operations as syndrome accumulation and Chien search in such a programmable DSP or microprocessor. However, it is cumbersome for conventional programmable logic devices to execute finite field arithmetic operations, such as the Galois field multiplications, logarithms, and other operations described hereinabove.
Referring now to FIGS. 3a and 3b, an example of a conventional syndrome accumulation software program, executable by a DSP or other programmable microprocessor, will now be described. This conventional syndrome accumulation method corresponds to syndrome accumulator 12 in the architecture of decoder 10 of FIG. 2, even though implemented by way of software rather than dedicated hardware. It has been observed, in connection with the present invention, that this conventional syndrome accumulation process typically occupies as much as half of the overall computational time involved in Reed-Solomon decoding.
The conventional syndrome accumulation process shown in FIG. 3a begins with process 20, in which the DSP initializes index i to 0 and index j to 1. Index i is an outer loop index, while index j is the index for an inner loop, as will become apparent from the following description. Following initialization process 20, the DSP executes process 22 to retrieve, from memory, a finite field character .alpha..sup.i of the particular alphabet used in the Reed-Solomon decoding process. In this specific example, Galois field operations are used in the decoding operation, as is conventional for Reed-Solomon decoding. In this first pass (index i=0), the DSP sets the value of variable .beta. to the first Galois field character .alpha..sup.0 (i.e., .beta.=1). Process 24 is then next performed, in which the first input byte R[0] in the received sequence is received, and a sum variable s.sub.i associated with index i is then initialized to the value of input byte R[0].
Control then passes to process 26, in which the DSP performs a Galois field multiplication of the current value of sum s.sub.i with the value of variable .beta.. This Galois field multiplication and other Galois field arithmetic operations are defined over a finite field of characters (i.e., the Galois field "alphabet"), the size and members of which depend upon the symbol size used in the coding. The Galois field multiplication of process 26 requires a significant amount of computing resources, as will now be described relative to FIG. 3b which illustrates, in more detail, a typical conventional software implementation of process 26.
For purposes of computational efficiency, typical software approaches to Galois field multiplication involve the use of look-up tables in memory, particularly in cases where the memory requirements for such tables are relatively modest and where performance of the algorithm is a significant factor. In the conventional example of FIG. 3b, Galois field multiplication process 26 is performed by adding the logarithms of the multiplicands, as this approach, particularly in connection with finite field arithmetic, is much more efficiently implemented than would be an explicit multiplication. The base of the logarithm (and thus of the eventual exponentiation) can be any primitive element of the Galois field alphabet, for example .alpha.=2. In the conventional example of FIG. 3b, process 26 begins with process 34, in which the DSP accesses a logarithmic look-up table to determine the Galois field logarithm of the value of sum s.sub.i at this time; similarly, process 36 accesses a look-up table (generally the same look-up table as used in process 34) to determine the Galois field logarithm of the value of variable .beta.. Process 38 then performs a Galois field modulo (P-1) addition of the results of processes 34, 36, where P corresponds to the number of characters in the Galois field alphabet; for the example of eight-bit symbol sizes, 256 characters will be present in the corresponding Galois field alphabet (i.e., P=256). The result of addition process 38 is value LOGSUM. Following Galois field addition process 38, the current values of sum s.sub.i and character .beta. are tested against zero in decision 39; if either is zero, process 40 sets the result MPY of the multiplication of process 26 to zero. If both of sum s.sub.i and character .beta. are non-zero (decision 39 is YES), the result MPY of process 26 is established in process 42 by applying the value LOGSUM to a Galois field exponential look-up table to return a Galois field exponential therefrom (inverting the logarithms determined in processes 34, 36), producing result MPY as the product of sum s.sub.i and character .beta..
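The table-based multiplication of FIG. 3b can be sketched as follows. This is an illustration under stated assumptions rather than the code of any particular DSP: it assumes P=256, the primitive polynomial x.sup.8 +x.sup.4 +x.sup.3 +x.sup.2 +1 (0x11D), and .alpha.=2 as the logarithm base. The names `LOG`, `EXP`, and `gf_mul` are hypothetical, and the logarithm table entry for zero is a placeholder, since zero has no logarithm.

```python
P = 256            # characters in the Galois field alphabet
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

# Build the exponential and logarithm look-up tables (base alpha = 2).
EXP = [0] * (P - 1)
LOG = [0] * P      # LOG[0] is a placeholder and is never used for a result
_x = 1
for _i in range(P - 1):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY


def gf_mul(s: int, beta: int) -> int:
    """Galois field multiplication following the order of FIG. 3b."""
    log_s = LOG[s]                       # process 34: first table look-up
    log_beta = LOG[beta]                 # process 36: second table look-up
    logsum = (log_s + log_beta) % (P - 1)  # process 38: modulo (P-1) addition
    if s == 0 or beta == 0:              # decision 39: zero operand?
        return 0                         # process 40
    return EXP[logsum]                   # process 42: third table look-up


product = gf_mul(2, 128)  # x * x^7 = x^8, reduced by the primitive polynomial
```

The three indexed reads per call are the three table look-up operations that make this multiplication expensive relative to the exclusive-OR used for Galois field addition.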
Control then returns to process 28, shown in FIG. 3a, in which a Galois field addition of the result MPY of process 26 with the current input byte R[j] corresponding to the current value of index j is performed. The result of the addition of process 28 is stored as the current value of sum s.sub.i. Following process 28, the values of the indices j, i are tested against their corresponding limits n (the number of bytes in a message frame) and 2t (twice the number t of errors that may be corrected by the Reed-Solomon process). If index j is not yet equal to limit n (decision 29 is NO), index j is incremented in process 30 and control is passed to process 26 to repeat the Galois field multiplication and addition with the next input byte (process 28). Upon limit n being reached by index j (decision 29 is YES), the value of index i is tested against its limit 2t in decision 31. If index i does not yet equal limit 2t (decision 31 is NO), process 32 is performed to increment index i and to reset index j to 1, and control passes back to process 22 for retrieval of the next character .alpha..sup.i, from which the process is repeated. Upon index i reaching its limit 2t (decision 31 is YES), the syndrome accumulation process is complete.
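The nested loops of FIG. 3a amount to evaluating the received polynomial at each of the 2t characters .alpha..sup.i by Horner's rule. A minimal sketch follows, assuming eight-bit symbols, the primitive polynomial 0x11D, and .alpha.=2; the names `EXP`, `LOG`, `gf_mul`, and `syndromes` are hypothetical, and the loop bounds are written in Python's zero-based, exclusive-end style rather than with the explicit decisions of the flow diagram.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

# Log/exp tables for GF(256) with alpha = 2 (assumption).
EXP = [0] * 255
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY


def gf_mul(a: int, b: int) -> int:
    """Galois field product via three table look-ups (process 26)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def syndromes(received, two_t: int) -> list:
    """Accumulate the 2t syndromes s_0 .. s_(2t-1) of one received frame."""
    result = []
    for i in range(two_t):                   # outer loop over index i
        beta = EXP[i % 255]                  # process 22: beta = alpha^i
        s_i = received[0]                    # process 24: initialize with R[0]
        for j in range(1, len(received)):    # inner loop over index j
            s_i = gf_mul(s_i, beta) ^ received[j]  # processes 26 and 28
        result.append(s_i)
    return result


frame = [0] * 204
frame[-2] = 0x55                             # single corrupted symbol
s = syndromes(frame, 16)
```

For an error-free frame every syndrome is zero; in the usage above a single non-zero symbol yields non-zero syndromes, and s.sub.0 is simply the exclusive-OR of all bytes in the frame because .alpha..sup.0 =1.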
According to this conventional software approach to syndrome accumulation, three table look-up operations (i.e., processes 34, 36, 42) are required in each Galois field multiplication process 26. One can readily determine that, according to this conventional approach, 2nt instances of Galois field multiplication processes 26 and additions 28 are carried out, considering that index j varies from 1 to n-1 and that index i varies from 0 to 2t-1 (and the zeroth value of .alpha..sup.i renders a trivial result). For the Reed-Solomon case where .alpha.=2, n=204 (bytes/frame), t=8 (eight errors correctable), and P=256 (256-character Galois field), 3,264 such multiplications and additions are performed for each syndrome accumulation calculation. While a Galois field addition may be implemented as a simple bit-wise exclusive OR, the Galois field multiplication dominates the computational time. For example, each of these Galois field multiplication operations, when implemented by way of look-up tables in a TMS320c6x DSP architecture, requires on the order of twelve machine cycles for execution. Considering the number of passes through the inner loop, these three table look-up operations per multiplication thus result in 6nt table look-up operations in total. The memory required to implement each of the logarithm and exponentiation tables is 256x8 bits, with three tables required (to permit the two processes 34, 36 to be done in parallel).
Alternatively, one could implement one large table to return a multiplication result directly from the two eight-bit inputs, reducing the number of machine cycles required for the multiplication from twelve to one, in the TMS320c6x DSP architecture. However, the size of this table would necessarily be P.sup.2 × ⌈log.sub.2 P⌉ bits (⌈ ⌉ representing the ceiling function, where ⌈x⌉ returns the smallest integer y such that y ≥ x) or, in this example using Galois field 256 arithmetic, 65536 bytes. A look-up table of this size is prohibitively large for most implementations.
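The operation counts and table sizes quoted above can be checked with a few lines of arithmetic. The sketch below simply restates the formulas of the preceding paragraphs (2nt multiplications, three look-ups per multiplication, three tables of P entries, and one direct table of P.sup.2 entries); the function name `lookup_costs` and the dictionary keys are hypothetical.

```python
def lookup_costs(n: int, t: int, P: int) -> dict:
    """Multiplication count and look-up table sizes for syndrome accumulation."""
    bits_per_symbol = (P - 1).bit_length()   # ceil(log2(P)) for P >= 2
    multiplications = 2 * n * t              # 2nt multiply/add pairs
    return {
        "multiplications": multiplications,
        "table_lookups": 3 * multiplications,                 # 6nt look-ups
        "log_exp_tables_bytes": 3 * P * bits_per_symbol // 8,  # three small tables
        "full_table_bytes": P * P * bits_per_symbol // 8,      # one direct table
    }


costs = lookup_costs(n=204, t=8, P=256)
```

For n=204, t=8, and P=256 this reproduces the 3,264 multiplications and the 65536-byte direct table stated in the text, against 768 bytes for the three logarithm and exponentiation tables.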
Similar computational time complexity is also present in connection with the Chien search process of Reed-Solomon decoding. As known in the art, Chien search involves an exhaustive search, over the entire Galois field (having P members) for zeros of a polynomial of degree D (the maximum value of which is the number t of correctable errors). A conventional approach to this search is carried out using Horner's algorithm for root determination, iterating sequentially over the possible zeros.
FIG. 4 is a flow diagram illustrating a conventional software approach to the Chien search process, as may be implemented in a program executed by a conventional DSP and using the iterated Horner's algorithm method. The operation of FIG. 4 corresponds to Chien search function 16 in decoder 10 of the architecture of FIG. 2. This Chien search software approach occupies approximately 30% of the computational time of the entire Reed-Solomon decoding process, in conventional applications.
This conventional Chien search software algorithm begins with process 44, in which outer loop index i (corresponding to the members of the Galois field alphabet) is initialized to 1, inner loop index j is initialized to the value D-1, where D corresponds to the degree of polynomial .LAMBDA.(x) from the Euclidean array process (as shown in FIG. 2), and index k corresponding to the number of roots found is initialized to zero. Process 46 is then performed, beginning the outer loop of the algorithm, to set the sum .upsilon. to the highest degree term in the input polynomial .LAMBDA.(x) from the Euclidean array, namely .LAMBDA..sub.D.
The inner loop of this conventional Chien search operation is dominated by process 48, in which a Galois field multiplication of the current value of sum .upsilon. with index i (i.e., the corresponding Galois field alphabet member) is performed. This Galois field multiplication is performed by way of the addition of finite field logarithms of the multiplicands, followed by the exponentiation of the sum; as described above relative to FIG. 3b, such a Galois field multiplication involves three table look-up operations plus a Galois field addition. In this case, as described above, it is not feasible from a memory requirement standpoint to carry out the Galois field multiplication using a single table look-up operation.
As a result of process 48, a new value of sum .upsilon. is produced; process 50 then adds (in the Galois field) the next term .LAMBDA..sub.j of the input polynomial .LAMBDA.(x) to the current value of sum .upsilon. to produce a new value of sum .upsilon.. Decision 51 determines if the inner loop is complete; if not, index j is decremented in process 52 and control passes back to process 48. Upon decision 51 determining that the inner loop is complete (decision 51 is YES), the value .upsilon. represents a complete evaluation of the input polynomial .LAMBDA.(x), and is tested in decision 53 to determine if a root of the input polynomial .LAMBDA.(x) has been found in association with the current Galois field character indicated by index i. If so (decision 53 is YES), a memory array location zero(k) is set to the current value of index i in process 54, and result index k is incremented in process 55. Following process 55, or if no root was found (decision 53 is NO), outer loop index i is tested in decision 57; if remaining passes are required (decision 57 is NO), index i is incremented in process 56, index j is re-initialized to D-1 in process 58, and control passes back to process 46. Upon decision 57 returning a YES result in response to index i=P (P being the number of characters in the Galois field alphabet), the Chien search process is complete.
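The exhaustive search of FIG. 4 can be sketched as a Horner evaluation of .LAMBDA.(x) at each candidate field element. The sketch assumes eight-bit symbols, the primitive polynomial 0x11D, and .alpha.=2, takes the coefficients highest degree first (.LAMBDA..sub.D down to .LAMBDA..sub.0), and iterates over the non-zero elements 1 through P-1; that range is an assumption about how the loop bounds of the flow diagram map onto field elements. The names `EXP`, `LOG`, `gf_mul`, and `chien_search` are hypothetical.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1

# Log/exp tables for GF(256) with alpha = 2 (assumption).
EXP = [0] * 255
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY


def gf_mul(a: int, b: int) -> int:
    """Galois field product via log/exp tables (process 48)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def chien_search(lam) -> list:
    """Return every non-zero field element at which Lambda(x) evaluates to zero."""
    zeros = []
    for i in range(1, 256):                  # outer loop over field elements
        v = lam[0]                           # process 46: start with Lambda_D
        for coeff in lam[1:]:                # inner loop over remaining terms
            v = gf_mul(v, i) ^ coeff         # processes 48 and 50
        if v == 0:                           # decision 53: root found?
            zeros.append(i)                  # processes 54 and 55
    return zeros


# Lambda(x) = (x + 3)(x + 7) = x^2 + (3 XOR 7) x + 3*7, built for illustration.
lam = [1, 3 ^ 7, gf_mul(3, 7)]
roots = chien_search(lam)
```

Each candidate costs D multiplications, so the search performs on the order of P times D table-based multiplications, which is why process 48 dominates the running time.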
As noted above, in this conventional Chien search process, the Galois field multiplication operation of process 48 dominates the computational time of the method, even when performed by way of table look-up operations. According to typical architectures, on the order of twelve machine cycles are required for such an operation. Of course, while this operation could be reduced to a single machine cycle by generation of a large Galois field multiplication look-up table, the memory cost of such a table is prohibitive.
By way of further background, because of this complexity and operational cost, custom hardware devices are often used in the realization of Reed-Solomon decoders, particularly for the syndrome accumulation and Chien search operations described above. Such custom hardware solutions are, of course, limited in their flexibility to operate upon encoded communications according to varied standards and techniques, and of course require the design effort and manufacturing lead time necessary for their production.