1. Field of the Invention
This invention relates to computer system reliability and availability and, more particularly, to mapping the network interconnect into field replaceable units.
2. Description of the Related Art
Computer systems are typically available in a range of configurations that may afford a user varying degrees of reliability, availability and serviceability (RAS). In some systems, reliability may be paramount, and such a system may include features designed to prevent failures. In other systems, availability may be more important, and so those systems may be designed to have significant fail-over capabilities in the event of a failure. Either of these types of systems may include built-in redundancies of critical components. In addition, systems may be designed with serviceability in mind. Such systems may allow fast recovery from system failures because their components are readily accessible. In critical systems, such as high-end servers and some multiple processor and distributed processing systems, a combination of the above features may produce the desired RAS level.
One way of achieving high reliability and availability in systems is through the use of error codes. Error codes are commonly used in electronic systems to detect and correct errors such as transmission errors or storage errors. For example, error codes may be used to detect and correct errors in information transmitted via a communication link within a computer system. Error codes may additionally be used to detect and correct errors associated with information stored in the memory or mass storage devices of computer systems. One common use of error codes is to detect and correct errors in information transmitted on a bus within a computer system. In such systems, error correction bits, or check bits, may be generated for data prior to its transfer or storage. The check bits may then be transmitted or stored with the data. When the data is received or retrieved, the check bits may be used to detect and/or correct errors within the data. The use of error codes within a computer system may increase the data integrity of that system by detecting errors as soon as they occur. Similarly, the use of error codes may improve system availability by allowing the system to continue to function despite one or more failures.
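By way of illustration only, the generation and use of check bits described above may be sketched with a Hamming(7,4) code, one well-known single-error-correcting code. This sketch is not drawn from the present disclosure; the choice of code and the function names encode and decode are assumptions made for the example. Three check bits are generated for four data bits prior to transfer, and on receipt the check bits are recomputed to locate and correct a single flipped bit.

```python
# Illustrative sketch only: a Hamming(7,4) code, chosen as an assumption
# for this example. Function names are hypothetical.

def encode(data_bits):
    """Generate check bits for 4 data bits and return the 7-bit codeword.

    Codeword layout by position 1..7: [p1, p2, d1, p3, d2, d3, d4].
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Check a received codeword and correct at most one flipped bit.

    Returns (data_bits, error_position), where error_position is the
    1-based position of the corrected bit, or 0 if no error was detected.
    """
    c = list(codeword)
    # Recompute each parity group; a nonzero result means that group failed.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # syndrome names the bad position
    if error_position:
        c[error_position - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], error_position


if __name__ == "__main__":
    sent = encode([1, 0, 1, 1])
    received = list(sent)
    received[4] ^= 1  # simulate a single-bit transmission error
    data, position = decode(received)
    print("recovered data:", data, "corrected position:", position)
```

A code of this kind corrects any single-bit error in a codeword; detecting or correcting multiple-bit errors would require a stronger code than the one assumed here.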