As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Information handling systems can include subsystems that monitor the physical health characteristics of system components, such as temperature, voltage, fans, power supplies, and chassis intrusion. Such monitoring subsystems can also monitor hardware-detected faults in the operation of system components. Conventional monitoring subsystems construct and maintain a listing of events. For example, in a server computer system, an event could be added when a particular voltage in the system rises above or falls below specified parameters. As another example, in a server computer system, an event could be added when a particular memory device has a failed parity check. The listing of events is often referred to as a System Event Log. A monitoring subsystem may also maintain a listing of the number and type of monitoring and control features offered by the information handling system. Such a listing is sometimes referred to as Sensor Data Records or SDRs. A software program can read those listings and provide a user with information regarding the type of monitoring that a particular information handling system conducts and the results of that monitoring.
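The voltage example above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular monitoring subsystem: the names `Event`, `SystemEventLog`, and `check_voltage`, and the threshold values in the usage note, are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    sensor: str     # hypothetical sensor name, e.g. "12V"
    kind: str       # "above_max" or "below_min"
    reading: float  # the out-of-range value that triggered the event


@dataclass
class SystemEventLog:
    events: List[Event] = field(default_factory=list)

    def check_voltage(self, sensor: str, reading: float,
                      low: float, high: float) -> Optional[Event]:
        """Add an event only if the reading falls outside [low, high]."""
        if reading > high:
            event = Event(sensor, "above_max", reading)
        elif reading < low:
            event = Event(sensor, "below_min", reading)
        else:
            return None  # in range: nothing is logged
        self.events.append(event)
        return event
```

Under these assumptions, `SystemEventLog().check_voltage("12V", 13.1, 11.4, 12.6)` would add one `above_max` event, while a reading of 12.0 would add nothing.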
An information handling system can also include indicators that are driven by the data maintained in the System Event Log. For example, the front face of a computer system can include a fault Light Emitting Diode (LED) that is turned on when the System Event Log includes an error. As another option, the front face of a computer system can include a Liquid Crystal Display (LCD) that provides more extensive information about particular errors recorded in the System Event Log. Some systems may contain both an LED and an LCD to allow both general and specific communication of fault status. An indicator can be inaccurate if it either does not indicate an error that is currently present in the system (a false negative) or does indicate an error that is not currently present in the system (a false positive).
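The two kinds of indicator inaccuracy can be made concrete with a small Python sketch. It assumes a simplified model in which the fault LED is on whenever the log holds at least one error; the function names are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def fault_led_on(logged_errors: Sequence[str]) -> bool:
    """Assumed rule: the LED is lit whenever the log records any error."""
    return len(logged_errors) > 0


def classify_indicator(indicator_on: bool, error_present: bool) -> str:
    """Compare what the indicator shows with the system's actual state."""
    if indicator_on and not error_present:
        return "false positive"
    if not indicator_on and error_present:
        return "false negative"
    return "accurate"
```

For instance, if the log still holds an error for a component that has since been swapped out, `fault_led_on` returns `True` although no error is currently present, and `classify_indicator(True, False)` reports a false positive.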
An information handling system can contain removable components. For example, a computer system might contain memory modules that are connected to sockets and can be removed and replaced with different memory modules. Such components are sometimes referred to as Field Replaceable Units or FRUs. Other examples of FRUs are processors and motherboards. An FRU may be removed or replaced for several reasons: to fix an error, to upgrade a capability, or to reduce power consumption.
Some FRUs are designed to allow replacement only when the information handling system is not functioning. As one example, the system is turned off, a current FRU is removed, and a new FRU is connected. As another example, the system is turned off and the current FRU is removed, but no new FRU is connected. When the information handling system is turned back on, any new FRU can communicate with the other components of the system. Removing such an FRU while the system is functioning can result in errors.
Some FRUs are designed to allow replacement, without generating errors, while the information handling system is functioning. Such FRUs are often referred to as hot-pluggable. Hot-pluggable FRUs are connected to the rest of the system such that the system as a whole recognizes the removal of the FRU and configures itself to operate without whatever functionality that FRU provided. Hot-pluggable FRUs are also connected to the rest of the system such that the system as a whole recognizes the addition of an FRU and configures itself to operate with whatever functionality that FRU now provides.
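The hot-plug behavior described above amounts to a system reacting to add and remove notifications. A minimal Python sketch follows, assuming a hypothetical `HotPlugSystem` class that tracks which functionality each occupied slot provides; real systems detect these events in hardware and firmware, which is not modeled here.

```python
from typing import Dict, List, Optional


class HotPlugSystem:
    def __init__(self) -> None:
        # slot name -> functionality provided by the FRU in that slot
        self._active: Dict[str, str] = {}

    def on_fru_added(self, slot: str, functionality: str) -> None:
        """Reconfigure to operate with the functionality the new FRU provides."""
        self._active[slot] = functionality

    def on_fru_removed(self, slot: str) -> Optional[str]:
        """Reconfigure to operate without the removed FRU; return what was lost."""
        return self._active.pop(slot, None)

    def capabilities(self) -> List[str]:
        """Functionality currently available, sorted and without duplicates."""
        return sorted(set(self._active.values()))
```

In this model, removing the FRU in a slot simply drops its functionality from `capabilities()` while the rest of the system keeps running.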
FRUs can include a unique identifier, such as a device serial number, that can be communicated to a system in which that FRU is resident. The identifier may be unique only with respect to a particular type of FRU. For example, a memory module can have a serial number that is not shared by any other memory module, but may be shared by a processor.
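Because a serial number may be unique only within one FRU type, software that tracks FRUs would need to key on the type as well as the serial number. The following Python sketch illustrates that point; `FruInventory` and the slot names in the usage note are hypothetical.

```python
from typing import Dict, Tuple

FruKey = Tuple[str, str]  # (fru_type, serial_number)


class FruInventory:
    def __init__(self) -> None:
        self._slot_by_key: Dict[FruKey, str] = {}

    def register(self, fru_type: str, serial: str, slot: str) -> None:
        """Record an FRU; a serial may repeat across types, not within one."""
        key = (fru_type, serial)
        if key in self._slot_by_key:
            raise ValueError(f"duplicate {fru_type} serial {serial!r}")
        self._slot_by_key[key] = slot

    def slot_of(self, fru_type: str, serial: str) -> str:
        """Look up the slot using both the type and the serial number."""
        return self._slot_by_key[(fru_type, serial)]
```

Here a memory module and a processor can both be registered with serial `"SN123"`, whereas registering a second memory module with that serial raises `ValueError`.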
The System Event Log can contain entries for an FRU that is removed either while the system is turned off or while the system is operating. If an FRU that had been the source of an error event is removed and replaced, it is important that the System Event Log accurately indicate the events associated with the current FRU rather than its predecessor.
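The stale-entry problem can be illustrated with a short Python sketch. It assumes, purely for illustration, that each log entry records both the slot and the serial number of the FRU that caused it; the names `LogEntry`, `entries_by_slot`, and `entries_for_current_fru` are hypothetical and are not taken from any actual log format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogEntry:
    slot: str        # where the FRU was installed, e.g. "DIMM_A1"
    fru_serial: str  # serial number of the FRU that produced the event
    message: str


def entries_by_slot(log: List[LogEntry], slot: str) -> List[LogEntry]:
    """Slot-only lookup: also returns events left behind by a predecessor."""
    return [e for e in log if e.slot == slot]


def entries_for_current_fru(log: List[LogEntry], slot: str,
                            current_serial: str) -> List[LogEntry]:
    """Keep only the events produced by the FRU currently in the slot."""
    return [e for e in log
            if e.slot == slot and e.fru_serial == current_serial]
```

If a module with serial `"OLD1"` logged a parity error in `DIMM_A1` and was then replaced by `"NEW2"`, the slot-only lookup still returns that error, while the lookup that also checks the serial number returns an empty list for the new module.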