Modern computer systems use various interconnection mechanisms to allow communication among the components of the computer system. In a multi-computer system, central processing units or the interconnect chipsets may communicate with one another through defined transactions such as a fetch request, a data return, and a snoop request. Transactions are sent over each interconnect using a protocol format defined by the specification for that interconnect. A transaction may include one or more packets, and different transactions may need different numbers of packets. For example, the number of packets required to send a fetch request may be less than the number of packets required to send a cache line data return. A packet is the basic unit of data transmission and includes a number of cycles of data transfer in the interconnect structure.
Most interconnect structures provide a form of error detection and/or correction. An error correcting code (ECC) and its associated circuitry give the computer system the ability to tolerate various anticipated errors and to provide a high degree of reliability during data transmission. One approach to implementing an ECC is to provide the ECC at the packet level, such that each packet is independently protected by the underlying ECC against anticipated failures.
Error correction codes have been developed that both detect and correct certain errors. One well-known class of ECC algorithm is the “Hamming codes,” which are widely used for error detection and correction in digital communications and data storage systems. The SEC-DED (single error correcting, double error detecting) Hamming code is capable of detecting double bit errors and correcting single bit errors. A detailed description of the Hamming codes is found in Shu Lin et al., “Error Control Coding, Fundamentals and Applications,” Chapter 3 (1982). Another well-known ECC algorithm is the “Reed-Solomon code,” widely used for error correction in the compact disk industry. A detailed description of this ECC algorithm is found in Hove et al., “Error Correction and Concealment in the Compact Disk System,” Philips Technical Review, Vol. 40, No. 6, pp. 166–172 (1980). The Reed-Solomon code is able to correct multiple errors per word. Other conventional ECC algorithms include the b-adjacent error correction code described in D. C. Bossen, “B-Adjacent Error Correction,” IBM J. Res. Develop., pp. 402–408 (July 1970), and the odd weight column codes described in M. Y. Hsiao, “A Class of Optimal Minimal Odd Weight Column SEC-DED Codes,” IBM J. Res. Develop., pp. 395–400 (July 1970). The Hsiao codes, like the Hamming codes, are capable of detecting double bit errors and correcting single bit errors. The Hsiao codes use the same number of check bits as the Hamming codes (e.g., 8 check bits for 64 bits of data), but are superior in that hardware implementation is simplified and speed of error detection is improved.
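The SEC-DED behavior described above can be illustrated with a minimal sketch of an extended Hamming code. The function names `secded_encode` and `secded_decode`, the bit layout (overall parity at index 0, check bits at power-of-two positions), and the status strings are assumptions made for illustration; they are not taken from any of the cited references or from a particular interconnect specification.

```python
def secded_encode(data_bits):
    """Encode a list of 0/1 data bits into an extended Hamming codeword.

    Layout (an assumption of this sketch): index 0 holds the overall parity
    bit, indices 1..n use classic Hamming numbering, with check bits at the
    power-of-two positions and data bits everywhere else.
    """
    k = len(data_bits)
    r = 0
    while (1 << r) < k + r + 1:      # smallest r with 2**r >= k + r + 1
        r += 1
    n = k + r
    code = [0] * (n + 1)
    bits = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1):          # not a power of two: data position
            code[pos] = next(bits)
    for i in range(r):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity             # makes the parity of group p even
    code[0] = sum(code[1:]) % 2      # overall parity enables double detection
    return code


def secded_decode(code):
    """Return (status, data_bits) for a codeword from secded_encode."""
    n = len(code) - 1
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    overall = sum(code) % 2
    fixed = list(code)
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1 and syndrome <= n:
        fixed[syndrome] ^= 1         # syndrome 0 means the parity bit itself
        status = "corrected"
    elif overall == 1:
        status = "uncorrectable"     # odd number of errors, invalid position
    else:
        status = "double_error"      # even parity but nonzero syndrome
    data = [fixed[pos] for pos in range(1, n + 1) if pos & (pos - 1)]
    return status, data


if __name__ == "__main__":
    word = [1, 0, 1, 1, 0, 0, 1, 0]
    sent = secded_encode(word)
    received = list(sent)
    received[5] ^= 1                 # inject a single bit error
    print(secded_decode(received))
```

When the status is "double_error" or "uncorrectable", the returned data bits should be treated as unreliable. A Hsiao code would provide the same SEC-DED guarantees with a different parity-check matrix; this sketch does not attempt to reproduce that construction.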
Use of an ECC imposes an overhead on each transaction. The extra overhead required to implement the ECC reduces bandwidth available for data transmission and other functions.
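As a rough illustration of this overhead, the following sketch computes the number of check bits a SEC-DED Hamming-style code needs for a given data width and the resulting fraction of transmitted bits spent on the ECC. The helper names `secded_check_bits` and `ecc_overhead` are hypothetical, and the calculation assumes one SEC-DED codeword per data word with no other framing or protocol bits.

```python
def secded_check_bits(data_bits):
    """Check bits for a SEC-DED Hamming-style code over data_bits of data.

    Uses the Hamming bound for single error correction
    (smallest r with 2**r >= data_bits + r + 1) plus one overall
    parity bit for double error detection.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1


def ecc_overhead(data_bits):
    """Fraction of transmitted bits that are check bits rather than data."""
    c = secded_check_bits(data_bits)
    return c / (data_bits + c)


if __name__ == "__main__":
    for width in (32, 64, 128):
        c = secded_check_bits(width)
        print(width, c, f"{ecc_overhead(width):.1%}")
```

For 64 bits of data this yields 8 check bits, consistent with the figure given above, so 8 of every 72 transmitted bits (about 11 percent) carry the ECC rather than data.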