When a computer system fails or crashes, valuable computing time can be lost while the system awaits repair or replacement. Unfortunately, identifying the faulty component may take a long time. This is particularly true for a large computer system, especially one in which a computing partition spans components in different chassis, because a technician may need to analyze error data from many of those components.
Typically, errors are detected and logged by software running on the operating system of the affected computer. This additional load on the operating system is intrusive on performance-critical systems. It is undesirable to slow computer performance simply in order to recover more quickly from a rare hardware failure.