1. Technical Field
The present invention relates in general to data processing and, in particular, to an interconnect of a data processing system. Still more particularly, the present invention relates to data processing systems of processing nodes having recovery methods. The nodes can be arranged to operate either in a multi-node data processing system having a non-hierarchical interconnect topology, or in a different topology, such as one in which the nodes communicate over a common hierarchical bus.
2. Description of the Related Art
It is well-known in the computer arts that greater computer system performance can be achieved by harnessing the processing power of multiple individual processors in tandem. Multi-processor (MP) computer systems can be designed with a number of different architectures, of which various ones may be better suited for particular applications depending upon the intended design point, the system's performance requirements, and the software environment of each application. Known architectures include, for example, the symmetric multiprocessor (SMP) and non-uniform memory access (NUMA) architectures. Until the present invention, it has generally been assumed that greater scalability and hence greater performance is obtained by designing more hierarchical computer systems, that is, computer systems having more layers of interconnects and fewer processor connections per interconnect.
The present invention recognizes, however, that such hierarchical computer systems incur extremely high communication latency for the percentage of data requests and other transactions that must be communicated between processors coupled to different interconnects. For example, even for the relatively simple case of an 8-way SMP system in which four processors present in each of two nodes are coupled by an upper level bus and the two nodes are themselves coupled by a lower level bus, communication of a data request between processors in different nodes will incur bus acquisition and other transaction-related latency at each of three buses. Because such latencies are only compounded by increasing the depth of the interconnect hierarchy, the present invention recognizes that it would be desirable and advantageous to provide an improved data processing system architecture having reduced latency for transactions between physically remote processors.
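The three-bus count in the 8-way example, and the way that count grows with hierarchy depth, can be illustrated with a minimal sketch. The addressing scheme, the function names `buses_traversed` and `request_latency`, and the uniform `per_bus_latency` parameter are all assumptions introduced here for illustration; they do not come from the invention or from any particular system.

```python
def buses_traversed(src, dst):
    """Count the buses a request crosses between two processors.

    Hypothetical model: each processor is addressed by a tuple of indices,
    outermost level first, e.g. (node, cpu) for a two-level hierarchy.
    Every level of the hierarchy is assumed to be a single shared bus.
    """
    if len(src) != len(dst):
        raise ValueError("addresses must have the same depth")
    if src == dst:
        return 0  # a processor needs no bus to reach itself

    depth = len(src)
    shared = 0  # number of leading levels the two addresses have in common
    for a, b in zip(src, dst):
        if a != b:
            break
        shared += 1

    # The request climbs (depth - shared) buses to reach the first bus common
    # to both processors, then descends (depth - shared - 1) further buses.
    return 2 * (depth - shared) - 1


def request_latency(src, dst, per_bus_latency):
    """Total latency, assuming (for illustration) equal cost at every bus."""
    return buses_traversed(src, dst) * per_bus_latency


# Two nodes of four processors each: addresses are (node, cpu).
same_node = buses_traversed((0, 1), (0, 2))       # only the bus within the node
cross_node = buses_traversed((0, 1), (1, 2))      # node bus, inter-node bus, node bus
deeper = buses_traversed((0, 0, 0), (1, 0, 0))    # one more level of hierarchy
```

Under these assumptions, a request between processors in different nodes of the two-level system crosses three buses, and adding one more level of hierarchy raises the worst case to five, which is the compounding effect described above.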
The present invention additionally recognizes that, from time to time, processing errors occur in data processing systems, even those operating in high speed, high bandwidth topologies. Normally, it would be expected that a system processing error in such topologies would cause an overall system failure, requiring a time-consuming effort for system recovery at the high frequency. It would thus be desirable to provide a method and system for more robust recovery in high speed, high bandwidth data processing systems.