Various high-speed computer processing systems, sometimes referred to as supercomputers, have been developed to handle a variety of computationally intensive applications, such as weather modeling, structural analysis, fluid dynamics, computational physics, nuclear engineering, real-time simulation, signal processing, etc. The overall designs or architectures of such present supercomputers can generally be classified into one of two broad categories: minimally parallel processing systems and massively parallel processing systems.
The minimally parallel class of supercomputers includes both uniprocessors and shared-memory multiprocessors. A uniprocessor is a very high-speed processor that utilizes multiple functional elements, vector processing, pipelining, and look-ahead techniques to increase the computational speed of the single processor. Shared-memory multiprocessors comprise a small number of high-speed processors (typically two, four or eight) that are tightly-coupled to each other and to a common shared-memory using either a bus-connected or direct-connected architecture.
At the opposite end of the spectrum, the massively parallel class of supercomputers includes both array processors and distributed-memory multicomputers. Array processors generally consist of a very large array of single-bit or small processors that operate in a single-instruction-multiple-data (SIMD) mode, as used for example in signal or image processing. Distributed-memory multicomputers also have a very large number of computers (typically 1024 or more) that are loosely-coupled together using a variety of connection topologies such as hypercube, ring, butterfly switch and hypertrees to pass messages and data between the computers in a multiple-instruction-multiple-data (MIMD) mode.
Because of the inherent limitations of the present architectures for minimally parallel and massively parallel supercomputers, such computer processing systems are unable to achieve significantly increased processing speeds and problem solving spaces over current systems. The related application identified above, entitled CLUSTER ARCHITECTURE FOR A HIGHLY PARALLEL SCALAR/VECTOR MULTIPROCESSOR SYSTEM, sets forth a new cluster architecture for interconnecting parallel processors and associated resources that allows the speed and coordination of current minimally parallel multiprocessor systems to be extended to larger numbers of processors, while also resolving some of the synchronization problems associated with massively parallel multicomputer systems. Systems in this range between minimally parallel and massively parallel systems will be referred to as highly parallel computer processing systems and can include multiprocessor systems having sixteen to 1024 processors. The cluster architecture described in the related application provides for one or more clusters of tightly-coupled, high-speed processors capable of both vector and scalar parallel processing that can symmetrically access shared resources associated with the cluster, as well as shared resources associated with other clusters.
Just as the traditional system architectures were ill-suited for solving the problems associated with highly parallel multiprocessor systems, so too are the traditional control and maintenance architectures. As used within the present specification, the terms control and maintenance refer to any operation by which a system operator can control the operation of the system, such as starting, stopping, or n-stepping the master clock, setting or sensing internal machine states, executing diagnostic routines, and capturing errors at run-time for later display and analysis.
Prior art control and maintenance architectures include the use of scan paths for setting and/or sensing critical internal machine parameters. Control of the scan paths is typically via an external maintenance or diagnostic system. As computer execution speeds increase and systems become more densely packaged, physical access to critical internal machine parameters becomes more difficult, accentuating the need for remote electronic access to these parameters.
In highly parallel multiprocessor systems, the packaging density of the design requires that all internal machine registers be accessible to a control and maintenance subsystem. High performance systems use high clock speeds, requiring an increased packaging density, which in turn renders physical access to the system for sensing with traditional test equipment such as oscilloscopes and logic analyzers very difficult if not impossible. In addition, these traditional diagnostic tools may well be incapable of operating at a high enough speed to be useful.
Furthermore, the complexity of a highly parallel multiprocessor system makes analysis of failing machine sequences extremely difficult unless all internal registers can be sensed by the maintenance subsystem. The amount of information that must be retrieved from a highly parallel multiprocessor system undergoing diagnostic testing is massive, and easily exceeds the capability of traditional scan path architectures. Access to all internal machine registers is also necessary to provide the system with the ability to restart from a specific machine state, such as after stopping the machine in an error situation.
The ability to stop and restart the machine necessarily requires that the maintenance subsystem have the ability to control all processor clocks. In addition, a highly parallel multiprocessor architecture requires that control over processor machine states and clocks be independent. This is necessary for removing a defective processor from operation without halting operation of the entire system. By the same reasoning, it is advantageous for the maintenance system to have control over the power up sequence of each processor, so that a defective processor may be removed from operation, repaired, and restored to operation with minimal impact on the rest of the system.
In the same way that it is undesirable for maintenance work on one processor to halt operation of the entire system, so is it undesirable for maintenance work on one peripheral device to halt operation of the entire system. Thus it is desirable for a control and maintenance subsystem to have independent control over peripheral devices, including their on-line status and power up sequence.
It is clear that there is a need for a control and maintenance architecture specifically designed for the needs of a highly parallel multiprocessor system. Specifically, there is a need for a maintenance subsystem that provides setting and sensing capability for all internal machine registers, the ability to set and sense machine states by managing massive amounts of information, and independent control of processor power-up sequences, processor clocks, processor machine states, and peripheral devices.