1. Technical Field
The present invention relates in general to injecting device errors and in particular to injecting device errors during selected load and store operations. Still more particularly, the present invention relates to preventing a selected load or store operation from reaching a device by detecting which device is the target of the selected load or store operation and injecting specific errors to that particular device and to operating system error recovery code in order to test the device driver path for those errors.
2. Description of the Related Art
Many data processing or computer systems support standard input/output (I/O) systems conforming to the peripheral component interconnect (PCI) Local Bus architecture, an architecture supporting many complex features including I/O expansion through PCI-to-PCI bridges, peer-to-peer (device-to-device) data transfers, multi-function devices, and both integrated and plug-in devices. In setting up I/O operations to I/O devices on a PCI bus, the device driver must perform a series of load and/or store operations to the I/O device. If any of these operations encounters a parity error on the I/O bus, that information must be returned to the device driver so that the device driver can stop before the I/O operation is initiated.
As an example, a first store operation may be employed to set up an address in the I/O device, followed by a second store operation signalling the I/O device to begin the data transfer. If the first store operation gets an error and the second store operation is then received, the I/O device might start the operation to the incorrect location. The PCI architecture includes no provision for designing adapters to prevent load and/or store operations from continuing after an error. Most contemporary systems allow device driver execution to continue after a store operation rather than wait for a "successful" response to the store operation to determine if it completes correctly. This is preferable since the processor stall required to wait for a response to store operations would vastly degrade system performance. Currently, I/O adapters have the capability to detect parity errors on the I/O bus and recover from them.
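The hazard in the two-store example above can be illustrated with a minimal sketch. The `Device` class, the register names `ADDRESS` and `START`, and the stale address value are hypothetical and chosen only for illustration; they do not correspond to any real adapter.

```python
class Device:
    """Hypothetical I/O device with an address register and a start register."""

    def __init__(self):
        # Assumed stale value left over from an earlier operation.
        self.dma_address = 0x0000
        self.transfers = []  # addresses at which transfers were started

    def store(self, register, value, parity_error=False):
        # A store that suffers a parity error is discarded by the device.
        # The driver does not wait for a response, so it does not notice.
        if parity_error:
            return
        if register == "ADDRESS":
            self.dma_address = value
        elif register == "START":
            self.transfers.append(self.dma_address)


dev = Device()
dev.store("ADDRESS", 0x8000, parity_error=True)  # first store is lost
dev.store("START", 1)                            # second store still arrives
```

In this sketch the transfer begins at the stale address rather than at the intended `0x8000`, which is the failure the passage describes.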
One technique allows the device driver to prevent subsequent load and/or store operations from completing after an error without waiting for the response to every load or store operation. In this technique, the device select lines from each I/O device are brought into a PCI host bridge individually so that the device number of a failing device may be logged in an error register when an error is seen on the PCI bus. Until the error register is reset, each subsequent load or store operation is delayed until the device number of its target device has been checked against the error register. If the target device is a previously failing device, the load or store operation to that device is prevented from completing, either by forcing bad parity or by zeroing all byte enables. With bad parity or zeroed byte enables, the I/O device will respond to the load or store request by activating its device select line, but will not accept store data. Operations to devices which are not logged in the error register are permitted to proceed normally, as are all load and store operations when the error register is clear. However, it is one thing to write the device driver code that recovers from errors and quite another to test and debug the code paths that handle those errors.
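The error-register technique described above can be sketched as a small software model. This is an illustrative sketch only: the `HostBridge` class, its method names, and the representation of the error register as a set of device numbers are assumptions, not part of the PCI architecture or of any actual host bridge.

```python
class HostBridge:
    """Hypothetical model of a host bridge that logs failing devices."""

    def __init__(self, num_devices):
        # Assumed representation: the set of device numbers logged as failing.
        self.error_register = set()
        self.device_memory = {n: {} for n in range(num_devices)}

    def store(self, device, address, value, parity_error=False):
        """Return True if the device accepted the store data."""
        if device in self.error_register:
            # Models forcing bad parity or zeroing byte enables: the
            # device is selected but does not accept the store data.
            return False
        if parity_error:
            # Log the failing device number when an error is seen.
            self.error_register.add(device)
            return False
        self.device_memory[device][address] = value
        return True

    def reset_error_register(self):
        self.error_register.clear()


bridge = HostBridge(num_devices=2)
first = bridge.store(0, 0x10, 0xAA, parity_error=True)  # error is logged
second = bridge.store(0, 0x14, 0x01)                    # blocked: device 0 is logged
other = bridge.store(1, 0x10, 0x55)                     # device 1 proceeds normally
```

Once `reset_error_register()` is called, stores to device 0 are accepted again, matching the behavior described for a clear error register.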
In the past, special test I/O adapters have been developed to inject errors onto a bus in order to test device driver error paths in a development environment. However, these special test adapters have the drawback that they are not shipped with the computer system, and therefore are not available to all device driver writers. Additionally, in order to inject an error, these adapters usually compare against the address of the operation and inject an error after the address has been detected. This error injection technique has the disadvantages that randomization of errors is not possible and that the I/O adapter has to be set up with an address corresponding to an address of the device into which the error is to be injected. Lastly, if multiple devices are to be checked out at the same time, a separate special I/O adapter is required for each bus in the system.
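The address-compare behavior of such test adapters can be pictured with a short sketch. The `AddressCompareInjector` class and the addresses used are hypothetical and serve only to show why the trigger must be programmed in advance and why the resulting errors are not random.

```python
class AddressCompareInjector:
    """Hypothetical model of a test adapter that triggers on one address."""

    def __init__(self, trigger_address):
        # The adapter must be set up beforehand with the target address.
        self.trigger_address = trigger_address

    def should_inject(self, address):
        # An error is injected only when the observed address matches.
        return address == self.trigger_address


injector = AddressCompareInjector(trigger_address=0x8000)
results = [injector.should_inject(a) for a in (0x4000, 0x8000, 0x8004)]
```

In this model only the operation at the programmed address is affected, and the outcome is the same on every run.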
It would be desirable, therefore, to provide a method and system for injecting errors to a device during bus operations in a computer system, where the method and system do not require a specific address to be set up to correspond to an address of the device which is to have the error injected. It would also be advantageous for the mechanism to provide randomization of the errors to be injected without requiring a separate I/O adapter for each bus in a computer system when testing multiple devices on different buses.