This invention relates to the field of computer systems. More particularly, an apparatus and methods are provided for recovering from errors occurring during input/output operations within a computer system.
Many computer system devices or components, such as network interface units or adapters, storage devices, peripheral devices, and so on, initiate input/output operations using DMA (Direct Memory Access) over a host computer's system bus. Split transactions are often enabled to give bus clients improved access to the bus.
When split transactions are enabled for read operations, a single read transaction from the device generates two separate system bus transactions: one to issue the read request, and one to return the requested data. In between the two transactions, the system bus is released for use by other devices. When split transactions are disabled, the system bus is not relinquished by a component that issues a read request until the requested data are returned.
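The effect of split transactions on bus availability can be illustrated with a minimal sketch. The function name, its parameters, and the cycle counts are hypothetical and chosen only for illustration; the model is deliberately simplified and ignores arbitration and protocol overhead.

```python
def bus_held_cycles(request_cycles, memory_latency_cycles, data_cycles, split_enabled):
    """Return how many cycles one read keeps the bus unavailable to other clients.

    All arguments are illustrative cycle counts, not values from any bus
    specification.
    """
    if split_enabled:
        # Two separate bus transactions: the request, then the returned data.
        # The bus is released while the requested data is being fetched.
        return request_cycles + data_cycles
    # One long tenure: the requester holds the bus while waiting for the data.
    return request_cycles + memory_latency_cycles + data_cycles
```

Under these assumed numbers, a read with a 2-cycle request, a 50-cycle fetch latency, and an 8-cycle data return occupies the bus for 10 cycles with split transactions enabled and for 60 cycles with them disabled.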
When split transactions are enabled for write operations, the device that issues a non-posted write operation releases the system bus once the operation has been transferred to the DMA bridge. When split transactions are disabled, the system bus is not released until completion of the non-posted write has been acknowledged.
The characteristics of read and non-posted write transactions differ depending on the architecture of the computer system. For example, different types of system buses, such as PCIe (Peripheral Component Interconnect Express) and HT (HyperTransport), allow data transfers of different maximum sizes and may involve different expected or allowable latencies. Some systems do not allow or support non-posted writes at all.
Traditionally, a device configured to generate read or write transactions over a system bus contained built-in logic for detecting and possibly handling errors that occur during the transactions. The device would therefore have to include logic capable of tracking transactions for every type of system bus to which it might be attached. Alternatively, a different version of the device would have to be designed and produced for each type of system bus.
Because each system bus transaction is relatively low-level, usually involving the transfer of a small amount of data, one read operation (e.g., to retrieve data to be transmitted in one packet over a network) or one write operation (e.g., to write the contents of a packet received from a network) may require a number of system bus transactions. If the device is only capable of tracking the statuses of a limited number of system bus transactions, the device may stall whenever the total number of in-flight transactions reaches that number.
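The stall condition described above can be sketched as follows. This is an illustrative model only: the names `transactions_needed` and `TransactionTracker`, the tag scheme, and the sizes used are assumptions, not part of any particular device or bus specification.

```python
def transactions_needed(operation_bytes, max_payload_bytes):
    """Number of bus transactions needed to move one operation's data."""
    # Integer ceiling division: a partial final payload still costs a transaction.
    return (operation_bytes + max_payload_bytes - 1) // max_payload_bytes


class TransactionTracker:
    """Tracks in-flight bus transactions up to a fixed capacity (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.in_flight = set()

    def try_issue(self, tag):
        """Start tracking a transaction; return False if the device must stall."""
        if len(self.in_flight) >= self.capacity:
            return False  # every tracking slot is occupied
        self.in_flight.add(tag)
        return True

    def complete(self, tag):
        """Free the tracking slot once the transaction's status is resolved."""
        self.in_flight.discard(tag)
```

With an assumed 256-byte maximum transfer size, a 1500-byte packet requires `transactions_needed(1500, 256)`, or 6, bus transactions. A tracker built with `TransactionTracker(4)` accepts the first four and refuses the fifth until `complete` frees a slot, which is the stall described above.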