1. Field of Use
The present invention relates to methods and apparatus for improving the resiliency of a data processing system to system errors or faults and, in particular, to a method and apparatus to allow a memory to retry a transmission of requested data when it receives an improper bus transfer operation response upon attempting to initiate a data read.
2. Prior Art
A recurring problem in data processing systems is that of providing the system with the capability, or resiliency, to optimize the operation of the system upon the occurrence of system errors or faults. Such faults are well known and frequent and, while an error or fault may be of a temporary or non-fatal nature, they often cause substantial disruption to system operations. The problem, therefore, is to provide the system with a means for responding to such errors or faults in a manner that allows the system to continue operation as normally as possible without allowing the attempt to continue to lead the system into further faults, for example, by allowing the system to become trapped in attempting to repeat and complete an operation that, because of a fault, cannot be completed.
This problem is particularly acute in the case of memory operations because the memory is most probably the busiest element in the system, being the source and receptacle of all data and programs. Therefore, while it is known to allow other system elements to repeat an attempted operation if the operation has failed on a first attempt, this is not done in memory operations due to the risk of tying up memory or delaying access to memory by other elements of the system. For example, a central processing element may make a request to memory for data and the memory will attempt to provide the data to the central processor, but often will discover that the central processor cannot accept the data. The central processor's inability to accept the data transfer from memory is often of a temporary nature, for example, because it is handling an interrupt for a higher priority operation, but may be of a more serious nature.
The usual response in such cases is that the memory will cancel the memory request and proceed to service other requests. It is then necessary for the system element requesting the data from memory to re-submit the data request at some later time. While this approach is common in the prior art, and optimizes the probability that any given element in the system will be able to gain access to the memory, it may result in a greater loss of system operation in that the operations requiring the data must halt until the data can again be requested.
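The prior art behavior described above may be illustrated by the following minimal sketch, offered only as an aid to understanding. It assumes a simple software model in which a requester either accepts or refuses a data transfer; the names `Requester`, `accept`, and `prior_art_memory` are hypothetical and do not correspond to any actual system, and the re-queuing of a cancelled request merely stands in for the requester re-submitting its request at some later time.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Requester:
    """Hypothetical system element (e.g. a central processor) reading from memory."""
    name: str
    busy_cycles: int = 0    # number of transfers it will refuse before accepting
    resubmissions: int = 0  # how many times it had to re-submit a request

    def accept(self, data):
        """Return True if the data transfer is accepted, False if it is refused."""
        if self.busy_cycles > 0:
            self.busy_cycles -= 1
            return False
        return True


def prior_art_memory(requests, storage):
    """Service read requests; a refused transfer is cancelled, not retried."""
    queue = deque(requests)
    delivered = []
    while queue:
        requester, address = queue.popleft()
        if requester.accept(storage[address]):
            delivered.append((requester.name, address))
        else:
            # The memory cancels the request and moves on to other requests.
            # The requester must re-submit later, modeled here (as an
            # assumption) by placing the request at the back of the queue.
            requester.resubmissions += 1
            queue.append((requester, address))
    return delivered


if __name__ == "__main__":
    cpu = Requester("cpu", busy_cycles=1)  # temporarily unable to accept data
    io = Requester("io")
    order = prior_art_memory([(cpu, 0), (io, 1)], {0: "a", 1: "b"})
    print(order, cpu.resubmissions)
```

In this sketch the temporarily busy requester receives its data only after every other pending request has been serviced, which corresponds to the loss of system operation noted above.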