1. Cross-Reference to Related Applications
The following co-pending patent applications are assigned to the assignee of the present application and are related to the present application: "Router Chip with Quad-Crossbar and Hyperbar Personalities" by John Zapisek (M-867), filed concurrently herewith and assigned Ser. No. 07/926,138, which is a continuation of Ser. No. 07/461,551, now abandoned; and "Scalable Inter-Processor And Processor To I/O Messaging System For Parallel Processing Arrays" by John Nickolls et al. (M-881), filed concurrently herewith and assigned Ser. No. 07/461,492, now U.S. Pat. No. 5,280,474, issued on Jan. 18, 1994. The disclosures of these concurrently filed applications are incorporated herein by reference.
2. Field of the Invention
The invention relates generally to parallel data processing systems and, more specifically, to a wiring network for interconnecting router chips within a parallel computer system wherein data is routed from source processor elements to destination processor elements.
3. Description of the Relevant Art
Maximizing data processing speed has been a primary goal in the development of computer systems. Extensive effort and resources have been devoted to increasing the speed of conventional, single-processor computer systems, which are referred to as von Neumann machines. Semiconductor processing technology has continuously improved to the point where current microprocessors are approaching theoretical limits in density of features and circuit speed.
As an alternative to conventional, single-processor computer systems, parallel computer systems having multiple processors which simultaneously process data have been proposed. These parallel computer systems comprise several processors or "processor elements" which receive and process data simultaneously. A so-called "massively parallel" computer system may have 1,000 processor elements or more operating simultaneously, and the amount of data which can be processed during a single instruction cycle can be made many times greater than the amount which can be processed by a single-processor computer system.
A problem common to parallel computer systems is the development of a communication scheme which allows data to be transferred quickly between processor elements. Data routing circuitry has been designed for routing data from a selected source processor element to a selected destination processor element. Basic parts of the data routing circuitry of a parallel computer system may be manufactured on a single integrated circuit chip called a router chip. A typical router chip has a multiplicity of input terminals, each of which is connected to a route granting device, and a multiplicity of output terminals, each of which is connected to a destination device.
When a large number of processor elements (i.e., more than 1,000) are to be interconnected within a parallel computer system, it becomes impractical or impossible to provide the circuitry for an entire routing system on one integrated circuit chip. Consequently, the circuit is partitioned, and several router chips or elements are implemented in stages to provide a communications path between a message-originating processor element and a message-receiving processor element.
The stages of router elements are preferably interconnected by a wiring network which allows any processor element to communicate with any other processor element within the parallel computer. DEC (Digital Equipment Corp. of Massachusetts) has developed a multistage crossbar type of network for allowing clusters of processor units to randomly communicate with other clusters of processor units. The DEC crossbar system is described in PCT application WO 88/06764 of Grondalski, which was published Sep. 7, 1988 and is based on U.S. patent application Ser. No. 07/018,937, now abandoned. The disclosures of the Grondalski applications are incorporated herein by reference.
Ideally, messaging should occur in parallel so that multiple processor elements exchange information simultaneously. If, however, sets of data from more than one processor element (PE) are directed to the same input wire or bus of a destination processor element during one data transfer cycle, contention occurs. The data from one of the message-sending processor elements is blocked and must be retransmitted after transmission of the data set from the other message-sending processor element is complete. In addition to this contention mechanism, there is a limited number of wires within the routing network. If the number of processor elements wishing to send messages exceeds the number of router wires, the transmission of data from one processor element may have to be delayed while the transmission of data from another processor element passes through a choke point, even though the data sets are being routed to different destination processor elements. This is known as internal channel "blockage" or internal contention. When channel contention occurs, the data set from one of the processor elements cannot transfer to the destination processor element until after the data from the contending processor element passes through. Channel contention is undesirable because it increases messaging time for the system as a whole.
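By way of illustration only, the two contention mechanisms described above may be sketched with a simplified software model. The sketch below is not taken from any of the referenced applications; the function name `schedule_transfers`, the parameter `channel_count`, and the first-come grant policy are hypothetical and are assumed solely for purposes of the example. Each request is a (source, destination) pair; a request is blocked in a given transfer cycle either because its destination has already been claimed (destination contention) or because all router wires are in use (internal contention), and blocked requests are retried in a later cycle.

```python
def schedule_transfers(requests, channel_count):
    """Group (source, destination) requests into transfer cycles.

    Hypothetical model: at most `channel_count` messages may pass through
    the router per cycle, and each destination accepts one message per cycle.
    Requests are granted in list order; blocked requests are retried later.
    """
    if channel_count < 1:
        raise ValueError("channel_count must be at least 1")

    pending = list(requests)
    cycles = []
    while pending:
        busy_destinations = set()  # destinations already claimed this cycle
        granted, blocked = [], []
        for source, destination in pending:
            destination_contention = destination in busy_destinations
            internal_contention = len(granted) >= channel_count
            if destination_contention or internal_contention:
                blocked.append((source, destination))
            else:
                busy_destinations.add(destination)
                granted.append((source, destination))
        cycles.append(granted)
        pending = blocked  # retransmit blocked messages in the next cycle
    return cycles


if __name__ == "__main__":
    # PEs 0 and 1 both target PE 5, so one of them must wait a cycle.
    print(schedule_transfers([(0, 5), (1, 5), (2, 6)], channel_count=4))
    # Different destinations, but a single router wire forces serialization.
    print(schedule_transfers([(0, 5), (2, 6)], channel_count=1))
```

In this model, the number of cycles returned grows as either form of contention increases, which corresponds to the increase in overall messaging time noted above.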