A computer system generally includes one or more processors, a memory and an input/output system. The memory stores data and instructions for processing the data. The processor(s) process the data in accordance with the instructions, and store the processed data in the memory. The input/output system facilitates loading of data and instructions into the system, and obtaining processed data from the system.
Most modern computer systems have been designed around a "von Neumann" paradigm, under which each processor has a program counter that identifies the location in the memory which contains its (that is, the processor's) next instruction. During execution of an instruction, the processor increments the program counter to identify the location of the next instruction to be processed. Processors in such a system may share data and instructions; however, to avoid interfering with each other in an undesirable manner, such systems are typically configured so that the processors process separate instruction streams, that is, separate series of instructions, and sometimes complex procedures are provided to ensure that processors' access to the data is orderly.
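By way of illustration only, the role of the program counter described above may be sketched as follows. The three opcodes ("add", "jump" and "halt") and the function name `run` are hypothetical and are not taken from any particular machine; the sketch shows only that the counter identifies the next instruction and is advanced while the current instruction executes.

```python
def run(program, memory):
    """Execute a toy instruction list; the opcodes here are hypothetical."""
    pc = 0  # program counter: index of the next instruction to process
    while pc < len(program):
        op, *args = program[pc]
        pc += 1  # advance to the following instruction during execution
        if op == "add":
            dst, a, b = args
            memory[dst] = memory[a] + memory[b]
        elif op == "jump":
            (pc,) = args  # overwrite the counter to redirect the stream
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return memory


# memory[2] = memory[0] + memory[1]; memory[3] = memory[2] + memory[2]
memory = run([("add", 2, 0, 1), ("add", 3, 2, 2), ("halt",)], [4, 5, 0, 0])
```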
In von Neumann machines, instructions in one instruction stream are used to process data in a single data stream. Such machines are typically referred to as SISD (single instruction/single data) machines if they have one processor, or MIMD (multiple instruction/multiple data) machines if they have multiple processors. In a number of types of computations, such as processing of arrays of data, the same instruction stream may be used to process data in a number of data streams. For these computations, SISD machines would iteratively perform the same operation or series of operations on the data in each data stream. Recently, single instruction/multiple data (SIMD) machines have been developed which process the data in all of the data streams in parallel. Since SIMD machines process all of the data streams in parallel, such problems can be processed much more quickly than in SISD machines, and at lower cost than with MIMD machines providing the same degree of parallelism.
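The difference between the two orders of processing may be sketched, under simplifying assumptions, as follows. The function names `sisd` and `simd` are illustrative only, and the lockstep application is modeled here with an ordinary Python loop rather than with parallel hardware; the sketch shows only the order in which instructions and data streams are visited, not any speed advantage.

```python
def sisd(instructions, streams):
    """One processor: run the whole instruction stream on each data stream in turn."""
    results = []
    for value in streams:
        for instr in instructions:
            value = instr(value)
        results.append(value)
    return results


def simd(instructions, streams):
    """Each instruction is issued once and applied to every data stream in lockstep."""
    values = list(streams)
    for instr in instructions:
        values = [instr(v) for v in values]  # stands in for parallel hardware
    return values


# A hypothetical two-instruction stream: add one, then double.
instructions = [lambda x: x + 1, lambda x: x * 2]
streams = [1, 2, 3]
sisd_result = sisd(instructions, streams)
simd_result = simd(instructions, streams)
```

Both functions produce the same values; they differ only in whether the outer loop runs over the data streams or over the instructions.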
The aforementioned Hillis patents and Hillis, et al., patent application disclose an SIMD machine which includes a host computer, a micro-controller and an array of processing elements, each including a bit-serial processor and a memory. The host computer, inter alia, generates commands which are transmitted to the micro-controller. In response to a command, the micro-controller transmits one or more SIMD instructions to the array, each SIMD instruction enabling all of the processing elements to perform the same operation in connection with data stored in the elements' memories.
The array disclosed in the Hillis patents and Hillis, et al., patent application also includes two communications mechanisms which facilitate transfer of data among the processing elements. One mechanism enables each processing element to selectively transmit data to one of its nearest-neighbor processing elements. The second mechanism, a global router interconnecting integrated circuit chips housing the processing elements in a hypercube, enables any processing element to transmit data to any other processing element in the system. In the first mechanism, termed "NEWS" (for the North, East, West, and South directions in which a processing element may transmit data), the micro-controller enables all of the processing elements to transmit bit-serial data to, and to receive bit-serial data from, the selected neighbor in unison.
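A nearest-neighbor transfer of the NEWS type may be sketched, for illustration only, as a shift of a two-dimensional grid in which every element sends its value in the same selected direction at the same time. Several details below are assumptions not stated above: the name `news_shift`, the mapping of North to a decreasing row index, the wraparound at the grid edges, and the transfer of whole values rather than bit-serial data.

```python
# Assumed mapping of directions to (row, column) offsets.
OFFSETS = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


def news_shift(grid, direction):
    """Every element sends its value to its neighbor in one common direction."""
    rows, cols = len(grid), len(grid[0])
    dr, dc = OFFSETS[direction]
    received = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Wraparound at the edges is an assumption made for this sketch.
            received[(r + dr) % rows][(c + dc) % cols] = grid[r][c]
    return received


grid = [[1, 2, 3], [4, 5, 6]]
shifted = news_shift(grid, "E")
```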
On the other hand, in the global router, the data is transmitted in the form of messages, with each message containing an address that identifies the processing element that is to receive the message. The micro-controller controls all of the processing elements in parallel. In particular, the micro-controller enables the processing elements to transmit messages, in bit-serial format, from particular source locations in their respective memories, for delivery to the destination processing elements at particular destination locations in their respective memories. If multiple messages have the same destination processing element, later-delivered messages will be combined with previously-received messages, and accordingly the message that will be processed by a serial processor after the message transfer operation will be a function of the messages previously received by that processor.
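The combining of messages that share a destination may be sketched, under stated assumptions, as follows. The passage above does not specify the combining operation, so addition is assumed here as a default; the name `deliver`, the tuple layout (destination element, destination location, value) and the in-order delivery are likewise hypothetical.

```python
def deliver(messages, memories, combine=lambda old, new: old + new):
    """Deliver (dest_element, dest_location, value) messages into per-element memories.

    A later message for an already-written location is combined with the
    previously received value; the default combiner (addition) is an assumption.
    """
    written = set()
    for dest, location, value in messages:
        key = (dest, location)
        if key in written:
            memories[dest][location] = combine(memories[dest][location], value)
        else:
            memories[dest][location] = value
            written.add(key)
    return memories


# Three processing elements, each with a two-location memory.
memories = [[0, 0], [0, 0], [0, 0]]
deliver([(1, 0, 5), (2, 1, 7), (1, 0, 3)], memories)
```

In this example two messages are addressed to location 0 of element 1, so that location ends up holding a function of both (here, their sum), while the single message to element 2 is stored unchanged.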