1. Technical Field
The present invention relates in general to error reporting and fault isolation in a system or I/O bus and in particular to error reporting and fault isolation in a bidirectional or multi-drop bus. Still more particularly, the present invention relates to a method of reporting and handling parity errors or internal errors in a bidirectional bus in a manner which isolates the source of the error for a machine check mechanism.
2. Description of the Related Art
Errors on system and I/O buses within a data processing system are unavoidable. Such errors may be attributable to a variety of causes, including failure of a driving circuit to assert a particular bus conductor, internal errors such as a timeout within a device connected to the bus, or conflicting attempts by bus devices to drive or master the bus. These errors occur regardless of whether the bus is multiplexed or includes separate address and data lines.
Various mechanisms have been developed for detecting and/or isolating bus errors or fault conditions, ranging from simple parity checking to dedicated monitoring mechanisms. Detection and isolation of bus errors are essential for recovering from the fault condition or for generating status reports if user intervention is required to correct the fault condition. Error detection and isolation in modern data processing systems are becoming increasingly difficult due to the use of multi-level bus hierarchies, high speed local buses, bidirectional or multi-drop buses, multiplexed buses, and complex bus protocols.
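As an illustration of the simple parity checking mentioned above, the following is a minimal sketch only, assuming even parity over a single 8-bit word. The function names and the example bit patterns are hypothetical and are not taken from any particular bus specification. The sketch shows that a single flipped bit is detected while a double flip is not, and that the check by itself does not indicate which device drove the bus.

```python
def even_parity_bit(word: int) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return bin(word).count("1") % 2


def has_parity_error(word: int, parity_bit: int) -> bool:
    """Return True if the received word disagrees with its parity bit."""
    return even_parity_bit(word) != parity_bit


# Hypothetical example values, chosen only for illustration.
sent = 0b10110010                    # word driven onto the bus
parity = even_parity_bit(sent)       # parity bit driven with the word

single_flip = sent ^ 0b00000100      # one conductor corrupted in transit
double_flip = sent ^ 0b00000110      # two conductors corrupted in transit

single_detected = has_parity_error(single_flip, parity)
double_detected = has_parity_error(double_flip, parity)

print("single-bit error detected:", single_detected)
print("double-bit error detected:", double_detected)
```

In this sketch the receiver learns only that the word is bad; nothing in the parity bit identifies the driving device.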
Bidirectional or multi-drop buses, that is, buses which can be driven by multiple sources, are by nature subject to phenomena such as crosstalk and signal reflection, which may manifest as parity errors. Because such a bus can be driven by multiple sources, error isolation is difficult. Depending on the nature of the error and the manner in which it manifests itself, isolation of the source of a fault condition in a bidirectional bus may be impossible without some mechanism for monitoring the source of errors or the events surrounding the occurrence of errors. Complicated error reporting, fault isolation, or fault capture mechanisms increase the cost and complexity of a data processing system and may introduce undesirable latency or delays.
If bidirectional buses are multiplexed to use the same bus conductors for both address and data transmission, or are subject to stringent transmission protocols and signaling requirements, isolation of fault conditions may be further complicated. On some such bidirectional buses, such as SCSI and PCI buses, arbitration phases are implemented which require a device to win control of the bus before initiating a data transaction. Such protocols are intended to eliminate errors due to conflicts among multiple devices, but in practice do not prevent all such errors.
It would be desirable, therefore, to provide bidirectional buses, where any of multiple devices connected to the bus may be the source of an error, with a mechanism for error reporting and fault isolation. Such a mechanism should be simple and should introduce only nominal latency into the machine check mechanism of the data processing system.
It would further be desirable if errors could be isolated by source to a specific device connected to the bus, and if some indication of the reason for the error could be provided. A method for capturing errors in such a manner should be easily adaptable for use in either a system or I/O bus, as well as in multiplexed buses or buses with separate address and data lines. Such an error reporting and isolation mechanism would provide a valuable resource to the machine check mechanism of a data processing system for identifying the cause of errors for system recovery or for system status reports.