1. Technical Field
The present invention relates generally to an improved data processing system, and in particular to a method and apparatus for handling errors. Still more particularly, the present invention provides a method and apparatus for reporting errors in a data processing system.
2. Description of Related Art
Data processing systems have become more complex. This complexity includes various types of resources in a data processing system. For example, a data processing system may include one or more architecturally distinct processors. In this type of system, multiple host bridges may be present for the numerous I/O adapter bus slots. These types of systems may be run in a partitioned or non-partitioned mode. In a partitioned mode, resources are allocated among different copies of an operating system or multiple heterogeneous operating systems, which are run simultaneously on the data processing system. Such a partitioned data processing system is also referred to as a logical partitioned data processing system or as an LPAR data processing system.
In this type of complex multi-processor, multi-host-bridge system, when an I/O error occurs, it is desirable to isolate that error from the rest of the logical partitioned data processing system to allow the system to function without corrupting data in the system. Currently, this isolation is accomplished by preventing memory mapped input/output (MMIO) accesses from propagating from a host processor to I/O adapters beneath the host bridge that is in the error state. This isolation is also accomplished by preventing direct memory access (DMA) accesses from propagating from an I/O adapter through the host bridge to system memory. DMA is an access in which an adapter attempts to send data to a resource, such as a memory. MMIO is a type of access in which a processor attempts to access an adapter. By isolating the error at the host bridge level, the rest of the system is able to continue to operate or at least enter an error state that can later be analyzed and recovered from.
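The isolation behavior described above may be illustrated by the following minimal sketch. It is a simplified software model only, not an implementation of any particular host bridge. The names `HostBridge`, `BridgeIsolatedError`, `mmio_read`, `dma_write`, and `demo`, as well as the bridge name, adapter name, offsets, and addresses, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


class BridgeIsolatedError(Exception):
    """Raised when an access is blocked by a host bridge in the error state."""


@dataclass
class HostBridge:
    # All fields are hypothetical stand-ins for hardware state.
    name: str
    in_error_state: bool = False
    adapter_registers: dict = field(default_factory=dict)
    system_memory: dict = field(default_factory=dict)

    def mmio_read(self, adapter, offset):
        """Processor-initiated access to an adapter beneath this bridge."""
        if self.in_error_state:
            raise BridgeIsolatedError(f"{self.name}: MMIO blocked")
        return self.adapter_registers.get((adapter, offset), 0)

    def dma_write(self, adapter, address, value):
        """Adapter-initiated access through this bridge to system memory."""
        if self.in_error_state:
            raise BridgeIsolatedError(f"{self.name}: DMA blocked")
        self.system_memory[address] = value


def demo():
    bridge = HostBridge("PHB0", adapter_registers={("adapter0", 0x10): 0xABCD})

    # While the bridge is healthy, both access types propagate.
    results = [bridge.mmio_read("adapter0", 0x10)]
    bridge.dma_write("adapter0", 0x1000, 42)

    # An I/O error places the bridge in the error state.
    bridge.in_error_state = True

    # Both directions are now blocked, so memory cannot be corrupted.
    attempts = (
        lambda: bridge.mmio_read("adapter0", 0x10),
        lambda: bridge.dma_write("adapter0", 0x2000, 7),
    )
    for attempt in attempts:
        try:
            attempt()
            results.append("allowed")
        except BridgeIsolatedError:
            results.append("blocked")
    return results, bridge.system_memory
```

In this model, once the bridge enters the error state, every resource beneath it becomes unreachable from the host processor, and no adapter beneath it can write to system memory.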
One problem with this current solution is that, in the process of reporting an error to users, a resource that the host processor needs to access to generate an error report, such as chips and/or memory, may be located below the host bridge that is in the error state. In other instances, a support processor for the system may be located below the host bridge that is in the error state. With the host bridge being isolated, certain support processor activities will be unable to complete.
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for reporting errors when resources located below a host bridge need to be accessed to gather error information or transfer error information.