1. Field of the Invention
The embodiments of the invention generally relate to computer systems, and, more particularly, to identifying defective components in a computer system.
2. Description of the Related Art
A computer system may comprise multiple similar or identical hardware units providing the same type of resources. For example, such hardware units may comprise memory cards, multi-chip modules, input/output cards with multiple ports, etc. For granularity and other reasons, such a unit may not make its entire physical capacity available; instead, firmware-supported control mechanisms may limit how much of that capacity is exploited. For example, only 3 of 12 physical processors may be enabled for execution.
The enablement definition data (i.e., how each processor is to function) is stored in a device that is part of the respective hardware unit. Typically, during system initialization, the totals of enabled hardware entities are calculated per type. The actual allocation of resources at the system level does not have to reflect the enablement definition data of each hardware unit; resources can be allocated on any of the available physical hardware units of the respective type, provided that the system totals are respected.
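The relationship between per-unit enablement definition data and system-level allocation can be illustrated with a minimal sketch. The names `HardwareUnit`, `system_totals`, and `allocate`, the unit labels, and the fill-in-order placement policy are all hypothetical and chosen only for illustration; they do not describe any particular firmware.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class HardwareUnit:
    # All field names are illustrative assumptions, not taken from any real system.
    name: str
    resource_type: str  # e.g. "processor"
    physical: int       # physical entities present on the unit
    enabled: int        # entities enabled by the unit's enablement definition data


def system_totals(units):
    """Sum the enabled entities per resource type across all units."""
    totals = defaultdict(int)
    for unit in units:
        totals[unit.resource_type] += unit.enabled
    return dict(totals)


def allocate(units, resource_type, total):
    """Place `total` entities on any units of the given type, filling each in turn."""
    allocation = {}
    remaining = total
    for unit in units:
        if unit.resource_type != resource_type or remaining == 0:
            continue
        take = min(unit.physical, remaining)
        allocation[unit.name] = take
        remaining -= take
    if remaining:
        raise ValueError("not enough physical capacity")
    return allocation


# Two hypothetical multi-chip modules, each with 3 of 12 processors enabled.
units = [
    HardwareUnit("mcm0", "processor", physical=12, enabled=3),
    HardwareUnit("mcm1", "processor", physical=12, enabled=3),
]
totals = system_totals(units)
placement = allocate(units, "processor", totals["processor"])
```

In this sketch the system total is 6 processors, and all 6 are placed on the first unit even though that unit's own enablement definition data covers only 3. Only the system total constrains the placement.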
If a single hardware unit of a system comprising multiple identical hardware units breaks, the enablement definition data of the broken hardware unit can still be assumed to be accessible. The enablement definition data of the broken unit can still be respected at the system level if enough physical resources of the respective type are available on other hardware units providing the same type of physical resources. For best system availability, it may be recommended to plug enough physical resources of each type into the system that a complete loss of a single hardware unit still leaves enough physical capacity to fulfill the system totals of the enablement definitions as defined across the multiple hardware units.
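The availability recommendation above amounts to a simple capacity check, sketched below under stated assumptions. The function names, the tuple layout `(name, physical, enabled)`, and the sample figures are hypothetical and serve only to make the arithmetic concrete.

```python
def survives_unit_loss(units, broken_name):
    """Check one failure case for units of a single resource type.

    `units` is a list of (name, physical, enabled) tuples (an assumed layout).
    The broken unit contributes no physical capacity, but its enablement
    definition data still counts toward the system total.
    """
    required = sum(enabled for _, _, enabled in units)
    healthy = sum(physical for name, physical, _ in units if name != broken_name)
    return healthy >= required


def tolerates_any_single_loss(units):
    """True if the system totals can be met no matter which single unit breaks."""
    return all(survives_unit_loss(units, name) for name, _, _ in units)


# Generously provisioned: 9 entities enabled in total, at least 16 remain healthy.
roomy = [("card0", 12, 3), ("card1", 12, 3), ("card2", 4, 3)]

# Tightly provisioned: 6 entities enabled in total, only 4 remain after a loss.
tight = [("card0", 4, 3), ("card1", 4, 3)]

roomy_ok = tolerates_any_single_loss(roomy)
tight_ok = tolerates_any_single_loss(tight)
```

Here `roomy_ok` is true and `tight_ok` is false: the second configuration cannot honor its system total of 6 once either card is lost.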
Even though the broken hardware unit may not have any healthy physical capacity, it still carries the enablement definition data. By moving the broken hardware unit to a different system, the enablement definition data is moved to the target system. If the target system has unused physical resources, the addition of the broken hardware unit would enable physical resources from the pool of unused physical hardware. For certain reasons, the hardware manufacturer or distributor may not want broken hardware to be used as a substandard substitute component that serves only to carry enablement definition data. Therefore, there is a need for a novel technique of identifying resources of a defective hardware unit in a computing system.