The present invention relates generally to information processing systems and more particularly to a methodology and implementation for processing detected fault conditions in transactions from adapter devices.
In most computer systems, devices connected within the system are generally able to communicate and initiate data transfer transactions with other devices in the system as well as with the system memory, system processors and other central system components. These transactions take the form of one or more lines of information being passed from one device in the system to another. In a specific example, current PCI (peripheral component interconnect) computer systems are able to have many PCI bridge circuits connected between a main system bus and a plurality of PCI busses. Each PCI bus, in turn, may have several adapter devices connected thereto. For large systems, this tree-like configuration can become quite complex and extensive.
In transferring information between system components such as system memory to or from any of the adapter devices, or between any two adapter devices in the computer system, segments or lines of information are placed on system busses between the devices participating in the transaction in a predetermined sequence. The transfer of information from one device to another generally occurs in discrete steps with stops along the way. The information being transferred may, for example, move from one adapter device on one PCI bus to system memory. In an extensive computer system, that journey may pass through several bridge circuits along the way, and the information may be temporarily stored in transit buffers at each of the bridge circuits. Among other things, this step-by-step transaction process allows for a prioritization and/or ordering system in which certain transactions are able to bypass other transactions.
If, however, an error occurs on one of the busses involved in a transaction, it may result in a system error report that is effective to terminate all system operations. For example, in a PCI environment, if a transaction is clear on a primary bus of a bridge, and an error occurs on the secondary bus, then a PCI "SERR" signal is generated which causes a system shut-down rather than risk the propagation of erroneous data caused by the detected error condition.
Thus, all devices in the system, as well as the system itself, may be shut down entirely because of an easily correctable error condition in only one of the adapter devices in the system.
There is therefore a need for an improved methodology and implementing system which enables the identification and isolation of the specific adapter devices that are found to have caused detected error conditions in a computer system.
A method and implementing computer system are provided in which specific device identification information is acquired when a faulty condition is detected during an information transfer transaction, and the condition is reported for corrective action without initiating a system shut-down. In an exemplary PCI system, the PCI adapter sequence information, including tag number, requester bus number, requester device number and requester function number, is captured and used in reporting an error condition to the adapter's device driver in order to identify and isolate the adapter in a recovery operation.
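The capture-and-report flow summarized above may be illustrated by the following minimal C sketch. It is not the claimed implementation. The bit positions used to unpack the 32-bit attribute word, the registration table, and all names (`requester_id_t`, `capture_requester`, `register_driver`, `report_fault`, `log_fault`) are assumptions introduced solely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the sequence information captured on a fault. */
typedef struct {
    uint8_t tag;       /* tag number */
    uint8_t bus;       /* requester bus number */
    uint8_t device;    /* requester device number */
    uint8_t function;  /* requester function number */
} requester_id_t;

/* Hypothetical callback through which a device driver is notified. */
typedef void (*fault_handler_t)(const requester_id_t *id);

typedef struct {
    uint8_t bus, device, function;
    fault_handler_t handler;
} driver_entry_t;

#define MAX_DRIVERS 8
static driver_entry_t drivers[MAX_DRIVERS];
static size_t driver_count = 0;

/* Example handler: remembers the most recent fault for later recovery. */
static requester_id_t last_fault;
static unsigned fault_count = 0;

void log_fault(const requester_id_t *id)
{
    last_fault = *id;
    fault_count++;
}

/* Associates a handler with one adapter; returns 0 on success, -1 on failure. */
int register_driver(uint8_t bus, uint8_t device, uint8_t function,
                    fault_handler_t handler)
{
    if (driver_count >= MAX_DRIVERS || handler == NULL)
        return -1;
    drivers[driver_count++] = (driver_entry_t){bus, device, function, handler};
    return 0;
}

/* Unpacks the identification fields. ASSUMED layout, for illustration only:
 * tag in bits 28:24, bus in 23:16, device in 15:11, function in 10:8. */
requester_id_t capture_requester(uint32_t attr)
{
    requester_id_t id;
    id.tag      = (uint8_t)((attr >> 24) & 0x1Fu);
    id.bus      = (uint8_t)((attr >> 16) & 0xFFu);
    id.device   = (uint8_t)((attr >> 11) & 0x1Fu);
    id.function = (uint8_t)((attr >> 8) & 0x07u);
    return id;
}

/* Reports a fault to the driver of the adapter that issued the transaction.
 * Returns 1 if a driver was notified, 0 if no driver is registered, in which
 * case the caller must decide how to escalate. */
int report_fault(uint32_t attr)
{
    requester_id_t id = capture_requester(attr);
    for (size_t i = 0; i < driver_count; i++) {
        if (drivers[i].bus == id.bus && drivers[i].device == id.device &&
            drivers[i].function == id.function) {
            drivers[i].handler(&id);
            return 1;
        }
    }
    return 0;
}
```

In this sketch, a fault attributed to a registered adapter is delivered only to that adapter's handler, so the remaining devices are unaffected. A return value of 0 leaves the escalation policy to the caller rather than assuming one.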