Servers running mission-critical systems require high availability and flexible resource operation. To meet these requirements, ASICs (Application Specific Integrated Circuits) and firmware (hereinafter, F/W) provide a physical partitioning function: hardware (hereinafter, H/W) resources such as processors and memory, whose allocation has conventionally been fixed, are divided into n partitions, and each partition is used with its own OS (Operating System). Such a function enables flexible resource operation without H/W restrictions.
Functions for precise analysis and notification of fault information are required whether or not the physical partitioning function is in use. For that purpose, a fault-managing function is implemented along with a fault detection function equivalent to the one used without the physical partitioning function. Possible methods for managing a fault when using the physical partitioning function are broadly classified into the following three.
(Management method 1) Implement all the functions of H/W resource allocation and information assignment on an ASIC.
(Management method 2) Implement the functions of H/W resource allocation and information assignment on an ASIC and via F/W in a cooperative manner depending on the respective characteristics.
(Management method 3) Implement all the functions of H/W resource allocation and information assignment via F/W (= virtualization).
In consideration of reliability, implementation, cost, and adaptability to other functions, the following description deals with management method 2, which has less impact on the partitions when an H/W fault occurs and allows more flexible functional enhancements. Here, the ASIC manages H/W, partitions H/W resources, and provides resource management information including fault information to the F/W. The F/W analyses the resource management information as needed, and provides the fault information on the partitions to upper layers such as an OS.
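The division of roles in management method 2 can be illustrated with a minimal sketch. It is not the actual ASIC or F/W implementation: the record layout, the round-robin assignment, and all names (`ResourceRecord`, `asic_partition`, `asic_report`, `fw_faults_for_partition`, and the resource identifiers) are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ResourceRecord:
    """One entry of resource management information (hypothetical layout)."""
    resource_id: str        # illustrative name such as "cpu0" or "mem1"
    partition: int          # partition the resource was assigned to
    fault: Optional[str]    # fault description, or None when healthy


def asic_partition(resources: List[str], n: int) -> Dict[str, int]:
    """ASIC role: divide H/W resources into n partitions.

    Round-robin assignment is an assumption made for this sketch.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of partitions")
    return {res: i % n for i, res in enumerate(resources)}


def asic_report(assignment: Dict[str, int],
                faults: Dict[str, str]) -> List[ResourceRecord]:
    """ASIC role: provide resource management information, including faults, to F/W."""
    return [ResourceRecord(res, part, faults.get(res))
            for res, part in assignment.items()]


def fw_faults_for_partition(records: List[ResourceRecord],
                            partition: int) -> List[str]:
    """F/W role: analyse the records and extract the faults of one partition."""
    return [f"{r.resource_id}: {r.fault}"
            for r in records
            if r.partition == partition and r.fault is not None]


# Four resources split into two partitions, with one faulty memory.
assignment = asic_partition(["cpu0", "cpu1", "mem0", "mem1"], 2)
records = asic_report(assignment, {"mem1": "uncorrectable error"})
```

In this sketch, `fw_faults_for_partition(records, 1)` yields only the fault on `mem1`, while the list for partition 0 is empty, which mirrors the point that an H/W fault is reported to the upper layer of the affected partition only.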
Incidentally, the related techniques include a fault processing system in which fault processing apparatuses, corresponding respectively to first and second groups of data processing apparatuses, switch between and take as input pieces of fault information from the first and second groups of data processing apparatuses (for example, see Patent Document 1).
[Patent Document 1] Japanese Laid-Open Patent Publication No. 01-050135
When a fault occurs during server operation, the fault information is in most cases stored so as to reduce the fault-handling time. In some cases, however, the fault information is not stored, for example when a multiple fault occurs, or when there is a fault in a fault reporting path or an ASIC or F/W fault that was unpredictable at design time.
The situations where fault information is not stored include an address fault occurring in a memory that is used by the F/W, and a fixed fault occurring in an area where processor information is saved.