This invention relates to a multi-processor system in which first and second processing sets (each of which may comprise one or more processors) communicate with an I/O device bus.
The invention finds particular application in fault tolerant computer systems in which two or more processing sets need to communicate with an I/O device bus in lockstep, with provision for identifying lockstep errors in order to detect faulty operation of the system as a whole.
In such a fault tolerant computer system, an aim is not only to identify faults, but also to provide a structure that offers a high degree of system availability. To achieve high levels of system availability, it would be desirable for such systems to attempt recovery from a lockstep error automatically.
Automatic recovery from a lockstep error presents significant technical challenges, in that the system has not only to detect the absence of lockstep, but also to provide an environment in which the system as a whole can continue to operate.
Accordingly, an aim of the present invention is to address these technical problems.