The present invention relates to computers and more particularly to computers providing parallel processing capability through the use of multiple processors which share a single large memory. As is understood by those skilled in the art, certain mathematical problems exhibit a high degree of parallelism, particularly those which involve the manipulation of large arrays or matrices. Such problems can be broken down into computational segments each of which can be performed by a separate processor.
As is also understood, it is becoming increasingly difficult to significantly improve the power or performance of single processor computers. The increase in cost associated with increases in speed is often disproportionate and real physical limits are being strained in terms of the physical size of the apparatus in relation to the speed of propagation of electrical signals. Accordingly, various proposals and developments have been undertaken to implement multiple processor or parallel processing computers in which the total power of the machine is increased by multiplying the number of processors rather than by increasing the power of any single processor. Some of the approaches proposed have utilized a common bus structure through which all processors communicate with a shared memory. Other approaches utilize a shared memory which the various processors communicate with through a switching network. Examples of this latter type of computer include the Butterfly computer manufactured and sold by Bolt Beranek and Newman Inc. of Cambridge, Mass. and the Connection Machine manufactured and sold by Thinking Machines, Inc. of Cambridge, Mass. The architectures of the Butterfly computer and its switching network are described in BBN Report Nos. 3501 and 4098 (Chapter III) to the Defense Advanced Research Projects Agency (DARPA). The Connection Machine architecture is described in U.S. Pat. No. 4,598,400 issued July 1, 1986 to W. Daniel Hillis.
The present invention relates in large part to an improved parallel processing architecture in which a multiplicity of processors are synchronized to issue memory requests only at the same predetermined time within a computational cycle or frame interval, the requests being issued as bit serial messages. The initial data in the bit serial messages define memory addresses and a novel switch network architecture is provided for efficiently communicating requests from any processor to any memory location even though the number of processors and the number of memory locations may be very large.
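By way of illustration only, the address-first, bit serial message format can be sketched as follows. The sketch assumes a hypothetical network built from stages of two-output switch elements, each stage consuming the leading address bit to select its output; the function names and the stage model are assumptions made for this example and are not taken from the architecture described herein.

```python
def encode_request(address: int, address_bits: int, payload: list[int]) -> list[int]:
    """Serialize a request as bits: address first (most significant bit first), then payload."""
    if not 0 <= address < (1 << address_bits):
        raise ValueError("address out of range for the given width")
    header = [(address >> (address_bits - 1 - i)) & 1 for i in range(address_bits)]
    return header + list(payload)


def route(message: list[int], address_bits: int) -> tuple[int, list[int]]:
    """Pass a message through address_bits hypothetical switch stages.

    Each stage strips the leading bit and uses it to choose output 0 or 1.
    Returns the index of the memory port reached and the bits that remain.
    """
    bits = list(message)
    port = 0
    for _ in range(address_bits):
        lead = bits.pop(0)          # bit consumed by this stage
        port = (port << 1) | lead   # output chosen at this stage
    return port, bits


message = encode_request(address=5, address_bits=3, payload=[1, 0])
port, remainder = route(message, address_bits=3)
```

Because the address bits lead the message, each stage in such a model can make its routing decision as soon as the first bit arrives, without buffering the remainder of the request.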
As is understood, the number of inter-element leads and switch elements can grow disproportionately to the number of processors employed until the cost of the switching network exceeds that of the processors or memory. The nature of this problem is explored at some length in the BBN Report identified previously.
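The scale of this growth can be illustrated with a short calculation. The sketch below compares two textbook arrangements that are assumed here only for comparison: a full crossbar, which needs one crosspoint per processor and memory port pair, and a multistage network of two-input, two-output switch elements, which needs (n/2) log2 n elements for n ports. Neither formula is drawn from the present disclosure.

```python
def crossbar_crosspoints(n: int) -> int:
    """Crosspoints in a full n-by-n crossbar: one per input/output pair."""
    return n * n


def multistage_switches(n: int) -> int:
    """2x2 switch elements in a log2(n)-stage network with n ports (n a power of two)."""
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("n must be a power of two and at least 2")
    return (n // 2) * stages


for n in (16, 256, 1024):
    print(n, crossbar_crosspoints(n), multistage_switches(n))
```

Under these assumptions, 1024 ports require over a million crosspoints in a crossbar but only 5120 two-by-two elements in the multistage arrangement, at the price of shared path segments.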
A further difficult consideration which must be addressed in the design of such switching networks is the incidence of contention between requests. With a large number of processors, it is prohibitive to provide a dedicated path from each processor to each memory location. On the other hand, if there is any sharing of paths, there is some statistical chance of contention, i.e., two memory requests trying to utilize the same path segment. The design of the network must allow for resolution of the contention and also assure that all requests are honored within a reasonable period of time. The architecture of the present invention meets these requirements cost-effectively even for systems employing very large numbers of processors and a large, completely shared memory. In addition, the architecture and implementation of the present invention ameliorate the effects of such contention as may occur by allowing memory read requests from multiple processors to be effectively combined when those processors are attempting to read the same memory location.
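The effect of combining read requests can be illustrated with a minimal sketch. It models only the outcome, namely that reads directed to the same location within one frame result in a single memory access whose value is returned to every requesting processor; it does not model the switch elements in which the combining takes place. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def serve_reads_combined(requests, memory):
    """Serve one frame of (processor_id, address) read requests.

    Requests to the same address are merged so that memory is accessed
    once per distinct address. Returns the replies keyed by processor
    and the number of memory accesses actually performed.
    """
    by_address = defaultdict(list)
    for processor_id, address in requests:
        by_address[address].append(processor_id)

    replies = {}
    for address, processor_ids in by_address.items():
        value = memory[address]            # single access for the whole group
        for processor_id in processor_ids:
            replies[processor_id] = value  # value fanned back to each requester
    return replies, len(by_address)


memory = {3: "a", 7: "b"}
requests = [(0, 7), (1, 7), (2, 3), (3, 7)]
replies, accesses = serve_reads_combined(requests, memory)
```

In this example four requests are satisfied with two memory accesses, since three of the processors read the same location.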