This invention pertains to systems for detecting and correcting errors in digital data. More specifically, the invention concerns a system and circuit that implement Reed-Solomon codes for detecting and correcting multiple errors in digital data when transmitted between devices of a data processing system.
Devices of data processing systems and of many communication systems intercommunicate by transmitting and/or receiving digital data signals. These signals generally are electrical or electromagnetic representations of "ones" and "zeroes" and are transferred through a data channel between the devices in a serial or parallel fashion. In data processing systems, the electrical subsystems between the origin of the data and the end user typically constitute the data channel. During transfer of the data, errors sometimes occur. These errors mainly result from inherent or extraneous noise in the data channel. With respect to a magnetic storage device used in a data processing system, such errors can result from defects oftentimes present in the magnetic storage medium, such as, for example, defects in the metal oxide coating. Such defects present error patterns which may be "bursty" in nature; that is, several contiguous errors may occur when retrieving data from a particular location on the storage medium. To render the storage device useful, these error patterns must be corrected, if at all possible. If uncorrectable, the data is permanently lost.
Correction of erroneous data, upon retrieval, is attained by mathematically reconstituting correct code words. To construct a code word, a typical encoder produces parity information, or a checksum word, by sequentially encoding (i.e. mathematically transforming) a fixed number of m-bit symbols, or characters, derived from the original data. Usually, a few bits define a symbol. The checksum word, also comprising a fixed number of m-bit symbols, is then appended to the data symbols thereby to construct the code word. The checksum word, in essence, mathematically characterizes the bit patterns of the original data symbols. Upon receipt or retrieval of that code word, as the case may be, a decoder, using the information contained in the checksum word, examines and manipulates the data bits thereof in a fashion to detect, locate, and/or correct errors occurring therein.
When discussing coding theory, it is common practice to characterize the data words, code words, and checksum words as being a plurality of m-bit binary symbols, each being representative of coefficients of a polynomial in a variable or indeterminate, say "x". The encoder transforms the symbols of each data word to produce symbols of the checksum word, the symbols thereof being elements in the Galois Field (2.sup.m). To facilitate this discussion, we designate the original data word as d(x); the original code word as w(x); the original checksum word as E(x); and the received word as y(x). We define the original code word as being an "(n-1).sup.th" order polynomial w(x)=x.sup.n-k d(x)-E(x), where x.sup.n-k d(x) is a higher order polynomial representing the first "k" high order coefficients of w(x), and E(x) is a lower order polynomial representing the lowest "n-k" order coefficients of w(x). The checksum word E(x) is generated by dividing (i.e. encoding) the data word d(x) by an (n-k).sup.th order generator polynomial g(x). Specifically, E(x) is the "remainder" of x.sup.n-k d(x)/g(x). Thus, the code word w(x) becomes evenly divisible by the generator polynomial g(x) with a remainder of "zero".
A correctly received word y(x) also is evenly divisible by g(x) because w(x), by the aforementioned definition, is evenly divisible by g(x). So, one well known procedure for detecting errors upon receipt or retrieval of a word y(x) is to divide it by g(x). If the remainder of y(x)/g(x) is "zero", then w(x) is presumed to have been correctly received. If the remainder of y(x)/g(x) is "non-zero", then an error has occurred and an error correction routine is invoked.
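The encoding and remainder test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field GF(2.sup.4), the primitive polynomial x.sup.4 +x+1, the four roots .alpha..sup.0 through .alpha..sup.3, and every function name are chosen here for brevity and are not taken from any of the systems discussed.

```python
# Sketch only: GF(2^4), the primitive polynomial x^4 + x + 1 and the roots
# alpha^0..alpha^3 are illustrative assumptions, not values from the text.
PRIM = 0b10011
EXP = [0] * 30          # EXP[i] = alpha^i, stored twice so products need no modulo
LOG = [0] * 16          # LOG[alpha^i] = i
value = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = value
    LOG[value] = i
    value <<= 1
    if value & 0b10000:
        value ^= PRIM

def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def poly_mul(p, q):
    """Multiply two polynomials (coefficients listed highest order first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out

def poly_rem(num, den):
    """Remainder of num(x)/den(x); den must be monic and of degree >= 1."""
    rem = list(num)
    for i in range(len(num) - len(den) + 1):
        lead = rem[i]
        if lead:
            for j, d in enumerate(den):
                rem[i + j] ^= gf_mul(lead, d)
    return rem[-(len(den) - 1):]

def generator(n_check):
    """g(x) = (x - alpha^0)(x - alpha^1)...; minus equals plus in GF(2^m)."""
    g = [1]
    for i in range(n_check):
        g = poly_mul(g, [1, EXP[i]])
    return g

def encode(data, n_check):
    """Append E(x), the remainder of x^(n-k) d(x) / g(x), to the data symbols."""
    checksum = poly_rem(data + [0] * n_check, generator(n_check))
    return data + checksum

def has_error(word, n_check):
    """A non-zero remainder of y(x)/g(x) signals an error."""
    return any(poly_rem(word, generator(n_check)))

N_CHECK = 4
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   # k = 11 data symbols
codeword = encode(data, N_CHECK)              # n = 15 symbols
received = list(codeword)
received[3] ^= 0b0110                         # corrupt one symbol
print(has_error(codeword, N_CHECK), has_error(received, N_CHECK))
```

Because subtraction and addition coincide in GF(2.sup.m), appending the remainder to the shifted data word is the same operation as forming x.sup.n-k d(x)-E(x).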
The nature of the generator polynomial g(x) determines, among other things, the extent and complexity of the error correction routine. The generator polynomial g(x) is characterized by having roots .alpha..sup.i which are selected elements of the Galois Field (2.sup.m). For most m, there is a very large number of generator polynomials g(x). Selecting a proper generator polynomial g(x) for a given application can sometimes be a difficult task.
Error locations and values are respectively calculated from an error location polynomial and an error evaluator polynomial. These polynomials are computed from error syndromes S.sub.i which are conventionally obtained by dividing the entire received word y(x) by factors of the generator polynomial g(x). Once the error syndromes S.sub.i have been determined, several procedures can be used for computing the error locations and values, the more notable techniques being the Berlekamp-Massey or the Berlekamp decode algorithm for finding the error location polynomial and the Chien search algorithm for finding error locations. These techniques involve error-location-polynomial determination, root finding for determining the positions of the errors, and error value determination for determining the bit-pattern of the errors. Syndrome computation and root finding are potentially the most time consuming as they involve "n" iterative applications of a basic routine, "n" being the number of symbols in the code word w(x). For long code words capable of enabling correction of many errors, some of these routines presently cannot economically be implemented with electrical components so they usually are performed using a general or special purpose logic processor, such as that described in U.S. Pat. No. 4,162,480 issued to Berlekamp. For additional background information on decoding, reference can be made to "Error Correcting Codes" by Peterson and Weldon, MIT Press, second edition (1972); and "Algebraic Coding Theory", Berlekamp, 1968.
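As an illustration of conventional syndrome computation, the remainder of y(x) divided by a first-order factor (x-.alpha..sup.i) equals y(.alpha..sup.i), which Horner's rule evaluates in "n" steps. The sketch below assumes GF(2.sup.4) with primitive polynomial x.sup.4 +x+1 and factors with roots .alpha..sup.0 through .alpha..sup.3; these choices and the function names are hypothetical and serve only to keep the example short.

```python
# Sketch only: the field GF(2^4), its primitive polynomial x^4 + x + 1 and
# the roots alpha^0..alpha^3 are assumptions made for illustration.
PRIM = 0b10011
ALPHA = 0b0010   # the element "x", a primitive element of this field

def gf_mul(a, b):
    """Shift-and-add multiplication with reduction modulo PRIM."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0b10000:
            a ^= PRIM
        b >>= 1
    return result

def gf_pow(a, e):
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

def syndromes(received, n_check):
    """S_i = y(alpha^i), i.e. the remainder of y(x) / (x - alpha^i).

    `received` lists the symbols of y(x), highest order coefficient first.
    Each syndrome needs one multiply-and-add per received symbol.
    """
    out = []
    for i in range(n_check):
        root = gf_pow(ALPHA, i)
        s = 0
        for coeff in received:
            s = gf_mul(s, root) ^ coeff
        out.append(s)
    return out

# The all-zero word is a valid code word of any linear code, so it makes a
# convenient test vector: every syndrome is zero until a symbol is corrupted.
clean = [0] * 15
corrupted = list(clean)
corrupted[15 - 1 - 6] = 9          # error value 9 at the x^6 coefficient
print(syndromes(clean, 4))
print(syndromes(corrupted, 4))
```

With a single error of value V at the x.sup.p coefficient, each syndrome reduces to S.sub.i =V.alpha..sup.ip, which is the property that the abbreviated single-error procedures rely on.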
Therefore, in designing and constructing error correcting systems, it is desirable to determine which generator polynomial g(x) provides adequate error correcting power, coding efficiency, and decoding efficiency, and yet be economically and practically implemented with a combination of electrical circuit components and a general purpose logic processor. These factors are particularly critical in their application to high speed data transfers among devices of a data processing system, such as between a direct access disk storage device and a host processor. Coding efficiency, or code rate, is the ratio of the number "k" of data symbols to the number "n" of code word symbols and it varies greatly among the possible generator polynomials g(x). A high rate code is desired, but high rates usually are attained only at the expense of error correcting capability. Further, certain generator polynomials g(x) produce code words w(x) that require substantial decoding time thereby making them unsuitable for real time applications, such as during and between the successive transfers of stored code words from successive data sectors of a magnetic disk storage device. There are yet other generator polynomials g(x) that only can be implemented with inordinately complex and expensive circuits. Accordingly, a code providing high efficiency, sufficient correcting capability, rapid decode properties, and a simple and economical circuit implementation is desired.
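The trade-off between code rate and correcting capability can be made concrete with a little arithmetic. The sketch below relies on the standard property that a Reed-Solomon code with n-k checksum symbols can correct up to (n-k)/2 symbol errors, rounded down; the 63-symbol code word length and the function names are illustrative assumptions.

```python
def code_rate(n, k):
    """Coding efficiency: data symbols divided by code word symbols."""
    return k / n

def correctable_symbols(n, k):
    """Symbol errors a Reed-Solomon code with n - k check symbols can correct."""
    return (n - k) // 2

# Assumed code word length of 63 symbols, with three choices of k.
for k in (59, 52, 43):
    print(f"n=63 k={k} rate={code_rate(63, k):.3f} "
          f"correctable={correctable_symbols(63, k)}")
```

Raising k increases the rate but leaves fewer checksum symbols, so fewer symbol errors can be corrected.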
In some error correction systems, such as described in U.S. Pat. No. 4,117,458, issued to Burghard, an encoder divides the data word d(x) to produce a code word w(x). The Burghard system uses a nine-stage shift register circuit comprising a feedback path connected to the stages thereof for dividing, by its generator polynomial g(x), a data word symbol as the bits thereof are shifted therethrough. It provides a system that can correct a maximum of two-out-of-eight bits in a symbol that are in error with a coding efficiency of 47% (8 data bits/17 code bits). In the Burghard system, an encoder circuit produces a code word w(x) which embodies a checksum word E1(x). Thereafter, a decoder produces a second checksum word E2(x) by re-encoding (i.e. decoding) the data portion of the received word y(x), and then compares the previously transmitted checksum word E1(x) with the newly generated checksum word E2(x). Different results of comparisons of the checksum words E1(x) and E2(x) uniquely identify a limited number of addresses in a read-only-memory that contain a correspondingly limited number of error patterns in the seventeen-bit code word y(x). To correct errors, these comparisons are correlated with table entries for identifying the locations of the error, and then the bits in the locations in the code word are complemented. In the encoding circuitry, the feedback shift register circuit is constructed so that the result of the comparison of E1(x) and E2(x) resides in the nine-stage shift register circuitry at the end of the re-encoding of the code word. Due to its short code word (seventeen bits) and low code rate, it is apparent that the Burghard system would be suboptimal for application in magnetic storage devices.
U.S. Pat. No. 3,668,632 issued to Oldham implements a (63,52) Reed-Solomon error correcting code over GF(2.sup.6) for use in a digital computer storage system. Its code words w(x) are generated by encoding data words d(x) by a generator polynomial g(x) whose roots comprise 6-bit binary symbols that are elements of Galois Field GF(2.sup.6) generated by the primitive polynomial x.sup.6 +x+1. With a coding efficiency of approximately 82.5%, it enables correction of five 6-bit symbols per code block of sixty-three 6-bit symbols. To determine whether errors exist in the retrieved code words, Oldham uses checksum calculating circuits to divide the entire word y(x) by each factor of the generator polynomial g(x). Other less time consuming measures, such as provided by this invention, could be used for testing the error status of the retrieved code word y(x). Further, Oldham's invention is based on the premise that most errors in computer storage systems occur singly and that single errors can rapidly be corrected by conventional single error methods (i.e. the single error value V equals the error syndrome S.sub.0, and the single error location L equals Log [S.sub.1 /S.sub.0 ]). Oldham achieves single error testing by executing a single error correcting routine and re-reading to validate an assumption of a single error. This procedure is relatively time-consuming. Oldham also attempts re-reading and recalculating syndromes many times in hopes of obtaining data having only a single rapidly correctable error. If this procedure is unsuccessful, the Oldham system proceeds to the next time consuming conventional correction routine of simultaneously solving a system of linear equations for identifying error locations and values. Thus, it is evident that the Oldham system would induce substantial delays in the retrieval of information from a storage device in a data processing system.
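The conventional single error method mentioned above (V equals S.sub.0 and L equals Log [S.sub.1 /S.sub.0 ]) can be sketched as follows. The field GF(2.sup.6) and the primitive polynomial x.sup.6 +x+1 follow the description of the Oldham code; the syndrome roots .alpha..sup.0 and .alpha..sup.1, the symbol ordering, and the function names are assumptions made for illustration, and the sketch is not Oldham's circuit. It presumes that exactly one symbol is in error and does not verify that presumption.

```python
# GF(2^6) from the primitive polynomial x^6 + x + 1.
PRIM = 0b1000011
N = 63
EXP = [0] * (2 * N)     # EXP[i] = alpha^i, stored twice to avoid a modulo
LOG = [0] * (N + 1)     # LOG[alpha^i] = i
value = 1
for i in range(N):
    EXP[i] = EXP[i + N] = value
    LOG[value] = i
    value <<= 1
    if value & 0b1000000:
        value ^= PRIM

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def syndrome(received, i):
    """S_i = y(alpha^i), symbols listed highest order first (assumed order)."""
    root = EXP[i % N]
    s = 0
    for coeff in received:
        s = gf_mul(s, root) ^ coeff
    return s

def correct_single_error(received):
    """Apply V = S_0 and L = log(S_1 / S_0), assuming at most one bad symbol."""
    s0 = syndrome(received, 0)
    s1 = syndrome(received, 1)
    if s0 == 0 and s1 == 0:
        return list(received)                 # nothing to correct
    if s0 == 0 or s1 == 0:
        raise ValueError("syndromes are inconsistent with a single error")
    location = (LOG[s1] - LOG[s0]) % N        # power of x holding the error
    fixed = list(received)
    fixed[len(received) - 1 - location] ^= s0 # add the error value back in
    return fixed

# The all-zero word is a valid code word; corrupt one of its 63 symbols.
sent = [0] * N
received = list(sent)
received[10] ^= 0b101101
print(correct_single_error(received) == sent)
```

Only two syndromes are consulted here, so a multiple-error pattern can be miscorrected; a scheme of this kind therefore needs some separate validation of the single-error assumption, such as the re-reading described above.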
U.S. Pat. No. 4,142,174 issued to Chen et al. describes a high speed decoding method for Reed-Solomon code words having 8-bit symbols. During decoding, in order to test the code word for the presence of a single error, Chen, in a three-symbol correction system, tests whether S.sub.2 +K S.sub.1 =0 and S.sub.3 +K S.sub.2 =0, where K=S.sub.1 /S.sub.0. If this condition is true, then a single error is presumed to exist and the conventional single error location and value determination routine is performed.
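The single error test attributed to Chen can be illustrated with a short sketch. The field GF(2.sup.8) matches the 8-bit symbols mentioned above, but the primitive polynomial x.sup.8 +x.sup.4 +x.sup.3 +x.sup.2 +1, the syndrome roots .alpha..sup.0 through .alpha..sup.3, and all function names are assumptions chosen for illustration rather than details of the Chen patent.

```python
# Sketch only: the primitive polynomial 0x11D and roots alpha^0..alpha^3
# are assumed; the cited patent may use different parameters.
PRIM = 0x11D
ALPHA = 0x02

def gf_mul(a, b):
    """Shift-and-add multiplication in GF(2^8) with reduction modulo PRIM."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM
        b >>= 1
    return result

def gf_inv(a):
    """Inverse of a non-zero element: a^254, since a^255 = 1."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def syndromes(received, count):
    """S_i = y(alpha^i) for i = 0..count-1, highest order symbol first."""
    out = []
    root = 1
    for _ in range(count):
        s = 0
        for coeff in received:
            s = gf_mul(s, root) ^ coeff
        out.append(s)
        root = gf_mul(root, ALPHA)
    return out

def is_single_error(s0, s1, s2, s3):
    """Test S2 + K*S1 == 0 and S3 + K*S2 == 0 with K = S1 / S0."""
    if s0 == 0 or s1 == 0:
        return False            # K undefined, or not a single-error pattern
    k = gf_mul(s1, gf_inv(s0))
    return (s2 ^ gf_mul(k, s1)) == 0 and (s3 ^ gf_mul(k, s2)) == 0

# Corrupt the all-zero code word in one place, then in a second place.
word = [0] * 255
word[200] = 0x5A
one_error = is_single_error(*syndromes(word, 4))
word[17] = 0x33
two_errors = is_single_error(*syndromes(word, 4))
print(one_error, two_errors)
```

For a single error the syndromes form a geometric sequence with ratio K, so both sums vanish; with two errors at distinct locations the first condition cannot hold.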
For additional background information relative to the state of the art, reference may be had to "The Technology of Error-Correcting Codes", Elwyn R. Berlekamp, Proceedings of the IEEE, Volume 68, No. 5, May 1980.
In view of the foregoing, a broad objective of this invention is to provide an economical system and circuit for efficiently detecting and correcting digital data that has become corrupted during transmission or storage.
Another objective of this invention is to provide a time-wise efficient decoding method for detecting and correcting multiple errors occurring in binary data transferred between devices of a data processing system.
Another objective of this invention is to provide an error detecting and correcting system for rapidly decoding relatively long code words using a general purpose or special purpose logic processor for computing error syndromes, as opposed to a table look-up technique which is impractical for such code words.
Another objective of this invention is to provide a system which rapidly determines the error status of a retrieved code word, so that the correctness of a retrieved data word is indicated immediately and the error correction routine is bypassed when the retrieved data word is either correct or uncorrectable.
Another objective of this invention is to provide a method and apparatus for rapidly determining whether a single symbol error exists in a received word so that an abbreviated decode algorithm can be used.
Another objective of this invention is to provide an economical circuit having a minimum number of electrical circuit components for rapidly encoding and decoding relatively long code words.
A further objective of this invention is to provide a simple electrical circuit that implements a particular encoding scheme possessing high coding efficiency and high correcting power.
A more specific objective of this invention is to provide a system arrangement including an error detecting and correcting circuit and a controller therefor in the interconnection between a data processing system and a magnetic storage device thereby to enable real time encoding and decoding of data during its transfer between the data processing system and successive data sectors of a magnetic disk storage device.
Other objectives either are stated in the following description or will become evident in view of the description of the succeeding illustrative embodiment.