Computer systems include software and hardware. The hardware includes processors, memory, input and output components, and other physical devices. Occasionally, a hardware component may malfunction, and the malfunction manifests as an error. Specifically, an error is an unexpected condition, result, signal, or datum in a computer system or network.
Some computer systems provide fault management, a mechanism for detecting an error, determining the cause of the error, and correcting that cause. Specifically, when an error is detected, the cause of the error may be determined both to prevent future errors of the same type and to ensure that the error is not a symptom of a more serious problem in the computer system. The cause of the error is a fault. In particular, a fault is a problem in the hardware that may produce an error.
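The distinction between an error (the observed symptom) and a fault (the underlying hardware problem) can be illustrated with a minimal sketch of the detect, diagnose, and correct cycle described above. The class names, the component labels, and the count-based diagnosis rule are hypothetical and chosen only for illustration; an actual fault manager may use entirely different rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorReport:
    """An observed symptom: an unexpected condition, result, signal, or datum."""
    component: str  # hardware component that reported the error (hypothetical label)
    kind: str       # kind of error observed (hypothetical label)


@dataclass(frozen=True)
class Fault:
    """The hardware problem believed to have produced one or more errors."""
    component: str
    description: str


def diagnose(errors, threshold=3):
    """Toy diagnosis rule (an assumption for illustration only):
    suspect a fault in any component that reported at least `threshold` errors."""
    counts = {}
    for err in errors:
        counts[err.component] = counts.get(err.component, 0) + 1
    return [
        Fault(component, f"{count} errors reported")
        for component, count in sorted(counts.items())
        if count >= threshold
    ]


def correct(fault, retired):
    """Toy correction: record that the faulty component is taken out of service."""
    retired.add(fault.component)
```

In this sketch, three error reports from a component labeled "dimm0" would lead `diagnose` to return a single `Fault` for that component, while a lone report from another component would not. Passing that fault to `correct` adds "dimm0" to the set of retired components.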
Fault management may be performed in the operating system of the computer system. To provide fault management, the operating system typically requires information about the hardware structure of the computer system. Thus, before a new type of computer system is shipped, the operating system that is to execute on the hardware of the computer system is programmed with information about the hardware. The computer system is then shipped with the hardware and the operating system.