This application relates in general to error correction code, and more specifically to error correction code that detects wire failures, particularly stuck-at-fault wire failures.
In the prior art, data that is transmitted over wires frequently incurs errors, i.e. a binary 1 is distorted to appear as a binary 0 or vice versa. The errors may be single bit errors, where one bit in the data stream is corrupted, or double bit errors, where two bits in the data stream are corrupted. Note that the data is typically transmitted over a set of wires rather than a single wire; however, errors can occur in both single wire and multiple wire transmission systems. Furthermore, the data is typically much longer than the number of wires, and thus is sent over multiple cycles, e.g. 16 wires are cycled 4 times to send 64 bits. Therefore, errors may occur over multiple cycles. In some systems, the data is packetized, which means that the data is delivered in specifically sized data packets. Thus, errors could occur in different data packets.
One of the typical causes of errors is wire failure. Shorts or breaks in a wire can cause faulty signals to be sent down the wire. These failures are classified as one of two types. The first type is a stuck-at fault, where the wire is either stuck-at-one or stuck-at-zero. Thus, whatever the input, the wire relays only a one (for stuck-at-one) or a zero (for stuck-at-zero), and does not switch the signal. The second type is a malicious failure, where the output of the wire switches regardless of the input. For example, where the input is a zero, the wire output could be either a one or a zero, and where the input is a one, the wire output could likewise be either a one or a zero. In other words, the behavior of the wire is unpredictable. Furthermore, the error may be masked, because the failed wire may happen to deliver the correct value.
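The stuck-at failure described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `transmit` and the interleaving rule (bit i travels on wire i mod the number of wires) are hypothetical choices made for illustration and are not taken from the text.

```python
def transmit(bits, num_wires, stuck_wire=None, stuck_value=0):
    """Model sending bits over num_wires wires, one bit per wire per cycle.

    Assumption for illustration: bit i travels on wire (i % num_wires).
    If stuck_wire is given, every bit carried by that wire is received
    as stuck_value, whatever was sent.
    """
    received = list(bits)
    if stuck_wire is not None:
        for i in range(stuck_wire, len(received), num_wires):
            received[i] = stuck_value
    return received


# 64 bits sent as 4 cycles over 16 wires, as in the example above.
data = [1, 0] * 32

# Wire 2 stuck at zero: it carries bits 2, 18, 34 and 50, all of which are 1.
received = transmit(data, num_wires=16, stuck_wire=2, stuck_value=0)
errors = [i for i, (a, b) in enumerate(zip(data, received)) if a != b]

# Wire 2 stuck at one: the failure is masked because the sent bits were 1.
masked = transmit(data, num_wires=16, stuck_wire=2, stuck_value=1)
```

With this data pattern a single stuck wire corrupts one bit in each of the four cycles, while the same wire stuck at the opposite value produces no visible error at all.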
To detect such errors, ECC code is transmitted along with the data. Cyclic codes are an important class of ECC code that possess the capability to detect wire failures. The generator/parity matrix for these codes is formed by the cyclic shift of a row. There are efficient cyclic codes for detection/correction of multiple random errors, byte errors and burst errors. Cyclic codes are discussed further in "Error Control Coding for Computer Systems" by T. R. N. Rao and E. Fujiwara, Prentice Hall, Englewood Cliffs, N.J. 07632, ISBN 0-13-28395-9, which is hereby incorporated by reference. Cyclic codes are directed at detecting malicious failures, and thus assume that all failures are malicious. Because cyclic codes target malicious failures, they require more bits than checking only for stuck-at faults, and the required number of bits may be more than a designer has to spare. For example, assume a data message comprises 32 bytes, which is 256 bits. Single error correction and double error detection (SEC-DED) requires 10 extra bits. Single error correction requires 9 bits, since 2^9 = 512 is the smallest power of 2 that is greater than (256+9). The 10th bit is used for detecting double bit errors, for a total of 266 bits. If these 266 bits are transported across 10 wires, then 6 wires would carry 27 bits each, and 4 wires would carry 26 bits each. Thus, a wire failure could affect up to 27 bits. To detect malicious wire failures, 27 additional bits are required; see Theorem 3.7 from the book by Rao and Fujiwara, wherein a cyclic code generated by g(x) (of degree γ) can detect any burst of length γ or less. This will detect a wire failure plus any burst of length 27 or less extending over two consecutive wires, and brings the total to 293 bits. Thus, a total of 37 bits are required for error detection.
This is a large amount of overhead which will consume a great proportion of system resources for transmission.
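The bit counts in the example above can be reproduced with a few lines of arithmetic. The following is a sketch only; the helper name `sec_check_bits` and the variable names are hypothetical and chosen for illustration.

```python
def sec_check_bits(data_bits):
    """Return the smallest r such that 2**r > data_bits + r.

    This is the number of check bits needed for single error correction.
    """
    r = 1
    while 2 ** r <= data_bits + r:
        r += 1
    return r


data_bits = 256                              # 32 bytes
wires = 10

sec = sec_check_bits(data_bits)              # single error correction bits
sec_ded = sec + 1                            # one more bit for double error detection
codeword = data_bits + sec_ded               # bits sent with SEC-DED alone

max_bits_per_wire = -(-codeword // wires)    # ceiling division
wires_carrying_max = codeword % wires        # wires that carry one extra bit

burst_bits = max_bits_per_wire               # degree needed to cover one failed wire
total_bits = codeword + burst_bits
overhead_bits = total_bits - data_bits

print(sec, sec_ded, codeword, max_bits_per_wire, total_bits, overhead_bits)
```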
While 37 bits represents only 13% of 293 bits, a higher percentage of overhead may result from the extra bits needed for error detection, particularly when the data is transmitted in blocks or packets. For example, suppose a block of data comprises 7 cycles of data across 10 wires, for a total of 70 bits per block. Then 256 bits would require four blocks (256/70 is about 3.7, rounded up), while 293 bits would require five blocks (293/70 is about 4.2, rounded up). Thus, error detection would require an extra block, which is one of the five blocks transmitted, or 20%.
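The block-level overhead can be checked the same way. This is a short sketch of the arithmetic in the example above; the variable names are hypothetical.

```python
import math

bits_per_block = 7 * 10                               # 7 cycles across 10 wires

blocks_for_data = math.ceil(256 / bits_per_block)     # data alone
blocks_for_total = math.ceil(293 / bits_per_block)    # data plus 37 check bits

extra_blocks = blocks_for_total - blocks_for_data
share_of_transmission = extra_blocks / blocks_for_total   # fraction of blocks sent
bit_overhead = 37 / 293                                    # fraction of bits sent

print(blocks_for_data, blocks_for_total, share_of_transmission)
```

The check bits account for roughly 13% of the bits sent, but because whole blocks must be transmitted, they account for one block in five.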
In addition to adding overhead for data transmission, cyclic codes are more complex to implement. Decoding on the receiving end is complicated as many different mechanisms exist for implementing cyclic codes.
Therefore, there is a need in the art for an error detection mechanism that detects wire stuck-at faults, does not require significant overhead, and is easy to implement.
These and other objects, features and technical advantages are achieved by a system and method that uses an error detection mechanism to detect wire stuck-at faults. This mechanism can be used to augment an existing ECC code with wire stuck-at fault detection capability. For example, existing ECC code may detect random single errors or double errors (SEC-DED) in data transmission, whereas the inventive mechanism detects a wire failure which causes errors in the data transmitted on the failed wire. The inventive mechanism determines the number of 1's (or 0's) in a message, including the existing ECC code for the message, and appends this information to the message. This count is itself protected by the same ECC code that is used for the message. When the message is decoded at the receiving location, any stuck-at-fault wire failures would be detected by comparing the appended information with the contents of the message.
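The counting step described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the width of the count field, and the bit ordering are assumptions, and the ECC protection of the count is not modeled (the count field is simply assumed to arrive intact).

```python
def append_ones_count(message_bits):
    """Append the number of 1's in message_bits as a fixed-width binary field.

    Assumption: the field is wide enough to hold any count from 0 to
    len(message_bits), most significant bit first.
    """
    count = sum(message_bits)
    width = len(message_bits).bit_length()
    count_bits = [(count >> k) & 1 for k in reversed(range(width))]
    return message_bits + count_bits


def check_ones_count(received_bits, message_len):
    """Return True if the received count field matches the received message."""
    message = received_bits[:message_len]
    count_bits = received_bits[message_len:]
    stated = int("".join(str(b) for b in count_bits), 2)
    return sum(message) == stated


# Hypothetical 16-bit message (data plus its ECC bits) containing nine 1's.
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
frame = append_ones_count(message)

# Model a stuck-at-zero wire that carries message bits 0, 4, 8 and 12.
corrupted = list(frame)
for i in (0, 4, 8, 12):
    corrupted[i] = 0

print(check_ones_count(frame, len(message)))      # intact frame
print(check_ones_count(corrupted, len(message)))  # frame with stuck wire
```

In this example the stuck wire clears two bits that were sent as 1, so the recomputed count no longer matches the appended count and the failure is flagged.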
In addition to detecting wire stuck-at faults, the mechanism may also detect any number of multiple errors if the number of 0 to 1 transitions does not equal the number of 1 to 0 transitions in the data portion after decode. The advantages of the inventive mechanism over the prior art cyclic codes are the lower number of required check bits, a relatively simpler implementation, and the capability to trade off wire failure detection against the number of additional check bits required. The inventive mechanism is particularly useful in the detection of multiple errors occurring when the code word is transmitted over multiple cycles with a wire failure.
The inventive mechanism will detect stuck-at-fault failures, and most malicious wire failures. The inventive mechanism will not detect all malicious wire failures, particularly those where the number of 0 to 1 transitions equals the number of 1 to 0 transitions after ECC decode. Thus, the invention is primarily intended to detect predictable failures, e.g. stuck-at-faults, where a wire is stuck at 0 or 1, which causes a change in the number of 1's or 0's in the data transmission. The inventive mechanism can be scaled according to the number of wires used in data transmission.
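The distinction between detectable and undetectable failures can be shown with a short example. This is a sketch with hypothetical names and bit patterns; it compares only the number of 1's sent and received and does not model the ECC decode step.

```python
def count_mismatch(sent, received):
    """Return True if the number of 1's differs between sent and received."""
    return sum(sent) != sum(received)


sent = [1, 0, 1, 0, 1, 0, 1, 0]

# Stuck-at-zero wire carrying bits 0 and 4: every change is a 1 to 0 flip,
# so the number of 1's drops.
stuck_at_zero = [0, 0, 1, 0, 0, 0, 1, 0]

# Malicious failure with one 1 to 0 flip (bit 0) and one 0 to 1 flip (bit 1):
# the number of 1's is unchanged.
balanced_flips = [0, 1, 1, 0, 1, 0, 1, 0]

print(count_mismatch(sent, stuck_at_zero))   # count changes
print(count_mismatch(sent, balanced_flips))  # count does not change
```

Because a stuck wire can only push bits in one direction, any bit it actually corrupts changes the count, whereas a switching wire can produce offsetting flips that leave the count intact.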
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.