The present invention relates to an error correction and detection system for a memory or storage device employed in an electronic computer or the like. More particularly, the invention relates to an apparatus for detecting miscorrection in such an error correcting and detecting system.
In an attempt to increase the reliability of memory devices, error detecting and correcting systems have been put into practical use in which so-called Hamming codes are adopted for allowing a single bit error to be detected and corrected and a double bit error to be detected. Such a code is herein referred to as an SEC-DED code, an abbreviation of Single bit Error Correction - Double bit Error Detection, although the same code is sometimes referred to simply as an ECC code, an abbreviation of Error Check and Correction. The principle of SEC-DED codes is fully discussed by R. W. Hamming in his article entitled "Error Detecting and Error Correcting Codes" in "The Bell System Technical Journal", Vol. XXIX, No. 2, pp. 147-160, April 1950, and is also well known from U.S. Reissued Pat. No. 23601 to R. W. Hamming et al under the title "ERROR-DETECTING AND CORRECTING SYSTEM".
The principle of an SEC-DED code will be briefly reviewed. When the SEC-DED code is composed of n inherent information (data) bits in combination with k' redundant bits for error correction, the following condition must be satisfied in order to identify the position at which the correction is required (one of n+k' positions) and to detect the presence or absence of error:

$$2^{k'} - (n + k' + 1) \geq 0 \qquad (1)$$
If one additional redundant bit is used for detecting the double bit error, the total number k of the redundant bits is equal to k'+1. Accordingly, the expression (1) can be rewritten as follows:

$$2^{k-1} - (n + k) \geq 0 \qquad (2)$$
Hence, it is apparent that the total number k of the redundant bits will amount to 8 bits for data of 64 bits (n=64).
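The figure of 8 redundant bits for 64 data bits can be checked numerically. The following is a minimal sketch that searches for the smallest k satisfying the expression (2), taken here as 2^(k-1) - (n+k) >= 0; the function name `min_redundant_bits` is hypothetical and chosen only for illustration.

```python
def min_redundant_bits(n: int) -> int:
    """Smallest k with 2**(k-1) - (n + k) >= 0, i.e. expression (2)."""
    k = 1
    while 2 ** (k - 1) - (n + k) < 0:
        k += 1
    return k


# Tabulate the minimum number of redundant bits for common data widths.
for n in (8, 16, 32, 64):
    print(f"n={n:2d} data bits -> k={min_redundant_bits(n)} redundant bits")
```

For n=64 the search stops at k=8, since 2^7 = 128 is not smaller than 64+8 = 72, whereas k=7 gives 2^6 = 64, which is smaller than 71.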
In a practical system operating on the basis of the above principle, write-in data of n bits is supplied to an SEC-DED code generator circuit, in which k redundant bits are added to the write-in data, whereby a write-in SEC-DED code is produced. The number of bits of this code is thus equal to n+k. The coded information containing the write-in data may then be written in a memory device. In the read-out operation, the information read out from the memory is a read-out SEC-DED code containing the data. The read-out code is fed to an SEC-DED circuit, in which a single bit error is corrected and a double bit error is detected. If a single bit error has been produced within the memory, the SEC-DED circuit detects it, switches on a single bit error detection line to signal an alarm to an operator and at the same time corrects the erroneous bit. When an error of two or more bits has been produced, the error is detected in a similar manner and an associated multi-bit error detection line is turned on to signal an alarm. In this way, the n-bit output from the SEC-DED circuit can be utilized as correct data when no error has been produced within the memory or when the error is a single bit error.
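The write and read paths described above can be illustrated in software. The sketch below is not the circuit of any cited reference. It assumes a textbook extended Hamming layout in which index 0 holds an overall parity bit, the power-of-two indices hold the check bits and the remaining indices hold the data bits. The names `encode` and `decode` and the status strings are hypothetical.

```python
from functools import reduce
from operator import xor


def encode(data):
    """SEC-DED code generator (assumed extended Hamming layout)."""
    word, bits = [0, 0, 0], list(data)      # indices 0, 1, 2 are redundant
    while bits:
        p = len(word)
        # Non-power-of-two indices carry data, power-of-two indices are check bits.
        word.append(bits.pop(0) if p & (p - 1) else 0)
    syndrome = reduce(xor, (i for i, b in enumerate(word) if b), 0)
    i = 1
    while i < len(word):
        word[i] = 1 if syndrome & i else 0   # force the syndrome to zero
        i <<= 1
    word[0] = sum(word) & 1                  # overall (even) parity bit
    return word


def decode(word):
    """SEC-DED circuit: returns (status, data), status in ok/single/multi."""
    word = list(word)
    syndrome = reduce(xor, (i for i, b in enumerate(word) if b), 0)
    if sum(word) & 1:                        # odd parity: assume one bad bit
        if syndrome >= len(word):
            status = "multi"                 # points outside the word
        else:
            word[syndrome] ^= 1              # correct the indicated bit
            status = "single"
    else:
        status = "multi" if syndrome else "ok"
    data = [word[p] for p in range(3, len(word)) if p & (p - 1)]
    return status, data


data = [1, 0, 1, 1, 0, 0, 1, 0]
stored = encode(data)
corrupted = list(stored)
corrupted[6] ^= 1                            # one bit flipped in the memory
print(decode(stored)[0], decode(corrupted))
```

With 64 data bits this layout yields a 72-bit word, in agreement with k=8 obtained from the expression (2).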
On the other hand, when an error of two or more bits has been produced, the read-out data output from the SEC-DED circuit represents false information. For an error of 2 bits, an alarm can be produced with a probability of 100%. For an error of 3 or more bits, an alarm can be produced only with a certain degree of reliability. In other words, although a double bit error can be detected without fail, perfect detection cannot be expected for a multi-bit error of 3 or more bits. This situation is also described by M. Y. Hsiao in his article "A Class of Optimal Minimum Odd-weight-column SEC-DED Codes" in "IBM J. RES. DEVELOP", July 1970, pp. 395-401 (refer in particular to page 398, right column, lines 35 to 42). In such a case, a triple bit error may be treated as if it were a single bit error, whereby a miscorrection is performed. The probability of mistaking a triple bit error for a single bit error is generally considered to be on the order of 50 to 75%. In other words, more than half of the triple bit errors are processed as if they were correctable single bit errors. This is of course intolerable in a computer subject to high reliability and accuracy requirements.
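The tendency of a triple bit error to be mistaken for a single bit error can be examined by exhaustive counting. The sketch below assumes a plain position-indexed extended Hamming code, in which bit i contributes the syndrome value i and every bit feeds one overall parity check. This is not Hsiao's odd-weight-column code, so the resulting rate is only indicative and depends on the code actually used. The function name is hypothetical.

```python
from itertools import combinations


def triple_error_miscorrection_rate(total_bits: int) -> float:
    """Fraction of triple bit errors treated as a single bit error.

    Assumed layout: bit i (0 <= i < total_bits) has syndrome value i,
    and all bits are covered by one overall parity check.
    """
    miscorrected = total = 0
    for a, b, c in combinations(range(total_bits), 3):
        total += 1
        # Three flipped bits give odd overall parity, so the decoder
        # assumes a single error at index a ^ b ^ c if that index exists.
        if (a ^ b ^ c) < total_bits:
            miscorrected += 1
    return miscorrected / total


# 72 bits corresponds to 64 data bits plus 8 redundant bits.
print(f"{triple_error_miscorrection_rate(72):.1%}")
```

Because the three flipped bits always produce odd parity, the decoder can only reject the pattern when the computed index falls outside the code word, which is why unused syndrome values are the sole protection against such miscorrection.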
To deal with such an inconvenience, M. Y. Hsiao has introduced in the article referred to above an encoding method according to which the probability of miscorrection of the triple bit error can be reduced to a more reasonable degree.
U.S. Pat. No. 3,436,734 issued to James H. Pomerene et al titled "ERROR CORRECTING AND REPAIRABLE DATA PROCESSING STORAGE SYSTEM" discloses a method of packing the individual bits forming words in different packages in an attempt to reduce the probability of occurrence of a triple bit error.
Further, U.S. Pat. No. 3,582,878 to Douglas C. Bossen et al titled "MULTIPLE RANDOM ERROR CORRECTING SYSTEM", U.S. Pat. No. 3,656,107 to Mu-Yue Hsiao et al entitled "AUTOMATIC DOUBLE ERROR DETECTION AND CORRECTION APPARATUS", as well as U.S. Pat. No. 3,893,071 to Douglas C. Bossen et al entitled "MULTI-LEVEL ERROR CORRECTION SYSTEM FOR HIGH DENSITY MEMORY" disclose systems in which the number k of redundant bits employed usually in SEC-DED circuits is increased (e.g. 9 or more redundant bits for 64 data bits), to thereby allow the double or more bit error not only to be detected but also to be corrected.
Although these known methods are of great significance for enhancing reliability, it is still impossible to eliminate miscorrections completely.
As an approach to solving the problem described above, Y. Watanabe, one of the inventors of the present application, has proposed in U.S. patent application Ser. No. 836,089 filed Dec. 22, 1977 under the title "ERROR CORRECTION AND DETECTION SYSTEMS" and now U.S. Pat. No. 4,175,692, a system which is capable of detecting with a high certainty that a triple bit error has been miscorrected as a single bit error. As another approach, the same inventor has also proposed in Japanese Laid-Open Patent Application No. 81035/1978 a system which is capable of detecting the fact that a triple bit error has been miscorrected as a single bit error. Briefly reviewing the last mentioned proposal, when a single bit error is detected and corrected in an SEC-DED circuit, the polarity of all data bits after the correction is inverted. Subsequently, redundant bits are produced from the corrected data bits of the inverted state and written again in a memory device. Then, the data bits together with the redundant bits are read out and supplied to the SEC-DED circuit, whereby a repeated detection of an error is interpreted as the presence of a triple bit error. The error correcting and detecting system disclosed in the recited Japanese Laid-Open Patent Application is completely free of the problem that a triple bit error would be processed as if it were a single bit error, and thus causes no confusion in the succeeding arithmetic processing.
However, the system outlined just above is disadvantageous in that a single bit error produced in the redundant bits may be detected erroneously as a triple bit error. This can be explained by the fact that the redundant bits added to the data bits of the inverted polarity upon the re-write operation are newly prepared by an SEC-DED code generator. Consequently, when the single bit error has occurred in the redundant bits, the new redundant bits may be prepared and stored in the memory with such a polarity that the redundant bits read out from the memory again contain an error, which is detected as a single bit error by the SEC-DED circuit. In this case, the error is processed as a triple bit error.
In the case where k redundant bits are added to n data bits, the probability that a single bit error lies in the redundant bits is given by k/(n+k). Further, the probability that the single bit error has occurred in the redundant bits and that the redundant bit concerned is prepared in the memory device with the erroneous polarity upon the rewriting operation is given by $\frac{1}{2} \cdot \frac{k}{n+k}$. For example, in the case of n=64 bits and k=8 bits, this probability is equal to $\frac{1}{2} \cdot \frac{8}{64+8} = \frac{1}{18}$. In other words, notwithstanding the fact that the single bit error has been detected and corrected normally, a triple bit error is signalled with a probability of 1/18, which means that the single bit error correcting function will be rendered meaningless. This problem becomes more serious in the prior art systems provided with the means for enhancing reliability recited above, since an erroneous error detection further degrades the single bit error correcting capability, which is a great disadvantage.
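The arithmetic of this example can be reproduced with exact fractions. The sketch below assumes, as the example does, that the single bit error is equally likely to fall on any of the n+k bits and that the regenerated redundant bit has the erroneous polarity half of the time. The function name is hypothetical.

```python
from fractions import Fraction


def false_triple_probability(n: int, k: int) -> Fraction:
    """Probability that a correctly handled single bit error is reported
    as a triple bit error by the invert-and-rewrite retrial."""
    in_redundant_bits = Fraction(k, n + k)   # the error lies in a redundant bit
    wrong_polarity = Fraction(1, 2)          # the rewritten bit disagrees again
    return wrong_polarity * in_redundant_bits


print(false_triple_probability(64, 8))
```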
Accordingly, an object of the present invention is to provide an error correction and detection system in which a Hamming code including data bits and redundant bits is employed and which is capable of detecting without fail the fact that an error detected in the Hamming code and subjected to correction has been erroneously corrected.
Another object of the invention is to provide an error correction and detection system which is immune to the problem that a single bit error occurring in the redundant bits of the Hamming code might undesirably be processed as if a triple bit error had been produced.
In carrying out the invention, there is provided SEC-DED code generating means which is adapted to produce an SEC-DED code including n data bits to which k redundant bits are added. Generation of the SEC-DED code is realized in accordance with Hamming's principle by adding the minimum number k of redundant bits that satisfies the expression (2) recited hereinbefore. The SEC-DED code constituted by the n data bits and the k redundant bits is written in a memory. The SEC-DED code read out from the memory is checked by error correcting and detecting means. This check is also carried out in accordance with Hamming's concept. That is, when an error is detected as a single bit error, the bit concerned is corrected. In the case where a double bit or multi-bit error is detected, a message is produced to the effect that the error concerned is a multi-bit error. In this connection, it should however be noted that a miscorrection may occur when a triple bit error which has actually occurred is erroneously detected as a single bit error and correction is made on the associated bit, as described hereinbefore. With a view to avoiding such an inconvenience, the invention teaches that a retrial process is executed in response to the detection of the single bit error in the aforementioned error correcting and detecting means. To this end, inverting means is provided for inverting the corrected data bits output from the error correcting and detecting means. The inverting means may generally be realized in hardware as an inverter. However, the inversion may alternatively be effected by a program. In any case, the inverted data bits are supplied to the aforementioned SEC-DED code generating means, where k redundant bits are again added to constitute an SEC-DED code, which is then written in the memory. The SEC-DED code is subsequently read out from the memory and supplied to the aforementioned error correcting and detecting means.
In this manner, the error correcting and detecting means is operative to detect and correct a single bit error contained in the SEC-DED code constituted by the inverted data bits and the added redundant bits, and additionally to detect a double bit or multi-bit error. So long as no error is detected at this stage, it is assumed that the preceding detection and correction of the single bit error has been correctly carried out. However, because the redundant bits to be added to the data bits are newly produced, there is a possibility that an error may be present in the redundant bits of the SEC-DED code, for example.
According to a feature of the invention, comparing means is provided for comparing the corrected data bits first output from the aforesaid error correcting and detecting means with the corrected data bits output therefrom for the second time (i.e. upon retrial). In this way, the data bits corrected by the error correcting and detecting means before and after the retrial are compared with each other. The possible types of comparison include a mutual comparison of the data bits which are inverted relative to the original data bits, a mutual comparison of the non-inverted data bits, a comparison of the inverted data bits with the non-inverted ones, and so forth. A simple Exclusive OR gate may be used as the comparing means. In the first two types of comparison, in which data bits of the same state are compared with each other, coincidence between the two sets of data bits means that the data bits have been properly corrected, regardless of any single bit error that may be detected in the redundant bits. On the other hand, when correction is made on the basis of the false decision that a single bit error is present in the data bits although in reality three other bits are erroneous, only the state of that single bit is inverted. Accordingly, when the states of all data bits are subsequently inverted, the three actually erroneous data bits are again written in the memory in a state which is likely to produce errors. Thus, no coincidence will result from the comparison between the data bits obtained from the error correcting and detecting means upon retrial and the data bits of the inverted state available before being written in the memory. Such a discrepancy allows the preceding correction to be recognized as an erroneous correction, i.e. a miscorrection.
In the case of the last mentioned comparison where the data bits of opposite states are mutually compared, the procedure to deal with the results of comparison is effected in the reversed manner.
In any case, the miscorrection is detected on the basis of the results obtained from the comparison through the comparing means.
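The retrial with inversion and comparison can be sketched in software as follows. This is an illustration under stated assumptions, not the claimed apparatus: the memory faults are modelled as stuck-at cells, the code is a small extended Hamming code with the overall parity bit at index 0 and check bits at the power-of-two indices, and the names `FaultyMemory`, `read_with_retry` and `demo` are hypothetical. The comparison used is the mutual comparison of data bits in the inverted state.

```python
from functools import reduce
from operator import xor


def encode(data):
    """Assumed SEC-DED generator: index 0 parity, powers of two check bits."""
    word, bits = [0, 0, 0], list(data)
    while bits:
        p = len(word)
        word.append(bits.pop(0) if p & (p - 1) else 0)
    s = reduce(xor, (i for i, b in enumerate(word) if b), 0)
    i = 1
    while i < len(word):
        word[i] = 1 if s & i else 0
        i <<= 1
    word[0] = sum(word) & 1
    return word


def decode(word):
    """Assumed SEC-DED circuit: returns (status, corrected data bits)."""
    word = list(word)
    s = reduce(xor, (i for i, b in enumerate(word) if b), 0)
    if sum(word) & 1:
        if s >= len(word):
            status = "multi"
        else:
            word[s] ^= 1
            status = "single"
    else:
        status = "multi" if s else "ok"
    return status, [word[p] for p in range(3, len(word)) if p & (p - 1)]


class FaultyMemory:
    """One memory word whose listed cells are stuck at fixed values."""

    def __init__(self, stuck):
        self.stuck, self.cells = dict(stuck), []

    def write(self, word):
        self.cells = list(word)

    def read(self):
        return [self.stuck.get(i, b) for i, b in enumerate(self.cells)]


def read_with_retry(mem):
    status, data = decode(mem.read())
    if status != "single":
        return status, data
    inverted = [b ^ 1 for b in data]         # inverting means
    mem.write(encode(inverted))              # redundant bits are regenerated
    _, reread = decode(mem.read())           # retrial through the same circuit
    # Comparing means: only the data bits are compared, so a further single
    # bit error in the regenerated redundant bits does not raise an alarm.
    # Restoring the original polarity in memory is left out of this sketch.
    return ("single" if reread == inverted else "miscorrection"), data


def demo(data, faulty_cells):
    word = encode(data)
    mem = FaultyMemory({p: word[p] ^ 1 for p in faulty_cells})
    mem.write(word)
    return read_with_retry(mem)


data = [1, 0, 1, 1, 0, 0, 1, 0]
print(demo(data, [5])[0])        # one faulty data cell
print(demo(data, [8])[0])        # one faulty redundant cell
print(demo(data, [3, 5, 6])[0])  # three faulty data cells
```

In this model a stuck cell that caused an error holds the complement of the correct bit, so after inversion a truly single faulty cell agrees with the rewritten value, whereas the three cells of a miscorrected triple error disagree again and make the compared data bits differ.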
In this way, according to the invention, not only can a miscorrection in which a triple bit error actually occurring in the data bits has been processed as if it were a single bit error be detected, but the problem that a single bit error occurring in the redundant bits themselves is erroneously detected as a triple bit error can also be solved in a satisfactory manner.
The invention is not restricted to a system in which an SEC-DED code is used, that is, one in which a single bit error is detected and corrected while a double or more bit error is only detected. More generally, the invention can be applied to systems in which a Hamming code is adopted, errors in m or fewer bits are detected and corrected, and errors in (m+1) or more bits are detected. In this case, the aforementioned SEC-DED code generating means is replaced by Hamming code generating means. To this end, a Hamming code is constituted by n original information (or data) bits to which k' redundant bits for error correction are added. In order to attain the error detecting and correcting function for m or fewer bits, the $2^{k'}$ code combinations made possible by adding the k' bits must be no fewer than

$$\sum_{i=0}^{m} {}_{k'+n}C_{i} = {}_{k'+n}C_{0} + {}_{k'+n}C_{1} + \cdots + {}_{k'+n}C_{m}$$

where C is an abbreviation of "combination" and ${}_{k'+n}C_{m}$ represents the number of combinations obtained when m items are selected from (k'+n). This is explained in detail in the prior art literature mentioned above. Expressed mathematically, the following condition must be satisfied:

$$2^{k'} - \sum_{i=0}^{m} {}_{k'+n}C_{i} \geq 0 \qquad (3)$$
In order to attain the capability of detecting errors in (m+1) or more bits in addition to the error detecting and correcting function described just above, a single additional redundant bit is required. When the total number of redundant bits is represented by k, then k=k'+1. Thus, the above expression (3) may be rewritten as follows:

$$2^{k-1} - \sum_{i=0}^{m} {}_{k+n-1}C_{i} \geq 0 \qquad (4)$$
Incidentally, it will be noted that the expressions (1) and (2) recited hereinbefore can be derived by substituting 1 for m in the expressions (3) and (4), respectively, because the SEC-DED code is a Hamming code capable of correcting a single bit error and detecting a double bit error, as described hereinbefore.
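The reduction to the SEC-DED case can be checked numerically. The sketch below assumes that the expression (4) has the form 2^(k-1) minus the sum, for i from 0 to m, of the number of combinations of i items out of (n+k-1), required to be non-negative. This form is an assumption reconstructed from the surrounding definitions, and it is a counting condition only, so a practical code may need more redundant bits. The function name is hypothetical.

```python
from math import comb


def min_redundant_bits(n: int, m: int) -> int:
    """Smallest k satisfying the assumed form of expression (4):
    2**(k-1) - sum(C(n+k-1, i) for i = 0..m) >= 0."""
    k = 1
    while 2 ** (k - 1) - sum(comb(n + k - 1, i) for i in range(m + 1)) < 0:
        k += 1
    return k


# m = 1 is the SEC-DED case and must agree with expression (2).
print(min_redundant_bits(64, 1))
# m = 2 corresponds to correcting up to two bits and detecting more.
print(min_redundant_bits(64, 2))
```

For m=1 the sum equals 1 + (n+k-1) = n+k, so the condition coincides with the expression (2) and gives k=8 for 64 data bits.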
An important aspect of the invention can be seen in a system using Hamming codes and capable of detecting and correcting errors in m or fewer bits while detecting errors in (m+1) or more bits, wherein a miscorrection caused by erroneously detecting an error of (m+1) or more bits as an error of m or fewer bits can be detected with an improved reliability.