The parent application identified above describes a new cluster architecture for high-speed computer processing systems, referred to as supercomputers. For most supercomputer applications, the objective is to provide a computer processing system with the fastest processing speed and the greatest processing flexibility, i.e., the ability to process a large variety of traditional application programs. In an effort to increase the processing speed and flexibility of supercomputers, the cluster architecture for highly parallel multiprocessors described in the previously identified parent application provides an architecture for supercomputers wherein multiple processors and external interface means can make multiple and simultaneous requests to a common set of shared hardware resources, such as main memory, secondary memory, global registers, interrupt mechanisms, or other shared resources present in the system.
One of the important problems in designing such shared-resource multiprocessor systems is providing an effective mechanism for quickly interrupting processors in the event of a hardware exception or processor breakpoint. Parallel processing software running on present multiprocessor systems is sometimes very difficult to debug in the event of a failure. Because present supercomputers lack an effective fast interrupt mechanism, it is difficult to stop all of the processors in the community associated with the parallel process. The result is that one or more of the non-stopped processors may destroy the information necessary to identify the failure because they continue to operate for many hundreds, or even thousands, of clock cycles after the failure is detected by the failing processor.
For example, most prior art massively parallel computer processing systems utilize a wavefront technique for interrupting the processors. The processor that encounters the exception or that generates the interrupt is treated as the center of the wave and the interrupt is propagated out from that processor according to the particular interconnection architecture for the system. In this model, the processors on the outermost edge of the system will not be interrupted until the interrupt wave reaches them, a time period that may be quite variable.
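By way of illustration only, the variable delay of the wavefront technique can be sketched as a breadth-first traversal of the interconnection. The sketch below assumes a two-dimensional mesh in which the interrupt advances one hop per step; the mesh topology, the 8 x 8 size, the one-hop-per-step timing, and the function name `wavefront_arrival` are all assumptions made for the example and are not taken from any particular prior art system.

```python
from collections import deque


def wavefront_arrival(rows, cols, origin):
    """Return {processor: hops} for an interrupt spreading over a 2D mesh.

    Assumption: each processor forwards the interrupt to its four
    nearest neighbors, and every hop costs one propagation step.
    """
    arrival = {origin: 0}
    queue = deque([origin])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            inside = 0 <= nxt[0] < rows and 0 <= nxt[1] < cols
            if inside and nxt not in arrival:
                # The wave reaches this neighbor one step after its sender.
                arrival[nxt] = arrival[(r, c)] + 1
                queue.append(nxt)
    return arrival


# The same mesh, interrupted from a corner and from near the center.
corner = wavefront_arrival(8, 8, (0, 0))
center = wavefront_arrival(8, 8, (4, 4))
print(max(corner.values()), max(center.values()))
```

Under these assumptions the last processor is reached after 14 steps when the wave starts at a corner but after 8 steps when it starts near the center, so the worst-case delay depends on which processor originates the interrupt as well as on the size of the system.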
Another problem with many of the present interrupt mechanisms for multiprocessor systems is that all of the processors in the multiprocessor system are unconditionally interrupted in the event of a fault, not just the processors associated with a process group. The disadvantage of this technique is that all programs executing on the multiprocessor system are halted, not just the processes associated with the program experiencing the fault.
Two associated problems arise out of the inability to adequately direct signal requests to one or more processors. First, many of the present interrupt mechanisms for supercomputers, for example, those of the Cray-1 and Cray X-MP supercomputers available from Cray Research, Inc., do not allow an intelligent peripheral to target a service request to the particular processor that could best field the service request. Second, if a peripheral or processor requires the attention of more than one other processor, multiple sequential signals or interrupts must be sent, one for each of the processors being requested.
Although the prior art interrupt mechanisms for multiprocessor systems are acceptable under certain conditions, it would be desirable to provide a more effective interrupt mechanism for a multiprocessor system that is able to interrupt the execution of all processors in a community of associated processors within a bounded number of clock cycles of an interrupt. In addition, it would be desirable to provide an interrupt mechanism for the cluster architecture for the multiprocessor system described in the parent application that aids in providing a fully distributed, multithreaded software environment capable of implementing parallelism by default.