1. Field of the Invention
The present invention generally relates to digital communications systems and, more particularly, to a high performance crossbar switch which uses contention detection at the destination and reroutes colliding messages over an alternate path provided by a second interconnection network with contention resolution capability.
2. Description of the Prior Art
High performance, multi-processor computer systems are characterized by multiple central processor units (CPUs) operating independently, but occasionally communicating with one another or with memory devices when data needs to be exchanged. The CPUs and the memory devices have input/output (I/O) ports which must be selectively connected to exchange data. The data exchanges occur frequently but at random times and occur between random combinations of CPUs and memory devices. Therefore, some kind of switching network is required to connect the ports for the relatively short period of the data exchange. This switching network must provide a high bandwidth so that the processing is not unduly delayed while the data is being exchanged. Furthermore, the connections are frequently made and broken, and delays that occur while waiting for a connection or delays incurred while the connection is being made can also impact the total capability of the parallel CPUs.
FIG. 1 is an illustration of one type of computer system to which the subject invention is directed. There are a large number of CPUs 10, each operating independently and in parallel with each other. In the past, the number N of parallel CPUs has commonly been in the neighborhood of four. However, newer designs involve larger numbers N of CPUs, from 256 (2.sup.8) to 1,024 (2.sup.10) or even more. Each of the CPUs 10 occasionally requires access to one of the several memory devices 12. For the sake of illustration, the memory devices will be assumed to be equivalent and also of number N. Each CPU 10 has an I/O path 14 and each memory device 12 has an I/O path 16. The paths 14 and 16 can be buses and may be duplicated to provide full-duplex communication. The important consideration, however, is that a CPU 10, requiring access to a particular memory device 12, have its I/O path 14 connected to the I/O path 16 of the required memory device 12. This selective connection is performed by a switching network 18, which is central to the design for the distributed processing of the computer system illustrated in FIG. 1.
The use of a cross-point switch for the switching network 18 provides the required high bandwidth. The important feature of a cross-point switch is that it can simultaneously provide N connections from one side to the other, each selectively made. Although the complexity of a cross-point switch increases in proportion to N.sup.2, the relative simplicity of the actual N.sup.2 cross-points allows its fabrication in a currently available technology.
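The properties just described can be illustrated with a minimal sketch. The class and method names below are hypothetical and do not come from any cited reference; the model assumes only that each of the N.sup.2 cross-points joins one input line to one output line and that a line carries at most one connection at a time.

```python
class Crossbar:
    """Toy N x N cross-point switch (illustrative model, not a cited design)."""

    def __init__(self, n):
        self.n = n
        self.out_of = {}  # input port -> output port it is connected to
        self.in_of = {}   # output port -> input port it is connected to

    def crosspoints(self):
        # One cross-point per (input, output) pair, so complexity grows as N^2.
        return self.n * self.n

    def connect(self, i, j):
        """Close cross-point (i, j) if both lines are free; report success."""
        if i in self.out_of or j in self.in_of:
            return False
        self.out_of[i] = j
        self.in_of[j] = i
        return True

    def disconnect(self, i):
        """Open the cross-point currently used by input i."""
        j = self.out_of.pop(i)
        del self.in_of[j]
```

In this model any one-to-one assignment of inputs to outputs can be connected at the same time, which is the sense in which the switch provides N simultaneous connections.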
Christos J. Georgiou has described in U.S. Pat. No. 4,605,928 a cross-point switch composed of an array of smaller cross-point switches, each on a separate integrated circuit (IC). Although Georgiou describes a single-sided switch, as opposed to the double-sided switch of FIG. 1, Georgiou's switch can be used in the configuration of FIG. 1, or easily adapted thereto. With the cross-point switch of Georgiou, it is easily conceivable that the number N of ports to the switch can be increased to 1,024. Thus, the total bandwidth of the switch 18 would be 1,024 times the bandwidth of the transmission paths 14 and 16. The cross-point switch of Georgiou has the further advantage of being non-blocking. By non-blocking what is meant is that if a CPU 10 requires that its I/O path 14 be connected to the I/O path 16 of a memory 12 not currently connected, the switch 18 can provide that connection. Thus, a CPU 10 is not blocked by the switch 18 when it requires a connection to a memory device 12.
Georgiou has also described, in another U.S. Pat. No. 4,630,045, a controller for his cross-point switch. Georgiou's controller is designed to be very fast but it suffers from the deficiency of most cross-point switches that one controller is used for all N input ports. As a result, the controller must sequentially service multiple ports requesting connection through the cross-point switch. Therefore, once the demanded connection rate exceeds the speed of the controller, the controller becomes a bottleneck. This is because the controller is a shared resource. Even if the controller of Georgiou were redesigned to provide parallel subcontrollers, perhaps attached to each port, this parallel controller would nonetheless be dependent upon a single table, the port connection table, that keeps track of available connections through the switch. Thus, the port connection table is also a shared resource and limits the controller's speed for large values of N.
An alternative to the cross-point switch is the Delta network. Delta networks are defined, with several examples, by Dias et al. in an article entitled "Analysis and Simulation of Buffered Delta Networks", IEEE Transactions on Computers, vol. C-30, no. 4, April 1981, pp. 273-282. Patel also defines a Delta network in "Performance of Processor-Memory Interconnections for Multiprocessors", IEEE Transactions on Computers, vol. C-30, no. 10, October 1981, pp. 771-780. An example of a Delta network for packet switching is described by Szurkowski in an article entitled "The Use of Multi-Stage Switching Networks in the Design of Local Network Packet Switching", 1981 International Conference on Communications, Denver, Col. (June 14-18, 1981). The Delta network will be described here with reference to the Omega switching network, described by Gottlieb et al. in an article entitled "The NYU Ultracomputer--Designing an MIMD Shared Memory Parallel Computer", IEEE Transactions on Computers, vol. C-32, no. 2, February 1983, pp. 175-189. This example is illustrated in FIG. 2.
In FIG. 2, there are eight ports on the left, identified by binary numbers, and eight ports on the right, similarly identified by binary numbers. Connecting the right hand and the left hand ports are three stages of switches 20. Each switch 20 is a 2.times.2 switch that can selectively connect one of the two inputs on one side to one of the two outputs on the other side. The illustrated Delta network can provide a connection from any port on the right hand side to any port on the left hand side. Data is transmitted from one side to another in relatively small packets containing, in addition to the data, control information, including the address of the desired destination. By use of buffers within the switches 20, it is possible to decouple the switches of the different sections so that the control and transmission are pipelined between the stages of the 2.times.2 switches 20. Thus, the control function of the Delta network is potentially very fast and the delay introduced by the stages rises as a function of log N rather than the N dependence of the cross-point switch. However, the Delta network is a blocking network; that is, there is no guarantee that a connection path is available through a switch even if the desired output port is otherwise available. Thus, a Delta network is potentially fast, but as traffic increases, blocking delays can be expected.
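The blocking behavior can be made concrete with a short sketch. It assumes the conventional Omega arrangement, in which a perfect-shuffle wiring precedes each stage and each 2.times.2 switch selects its output from one bit of the destination address, most significant bit first. The exact wiring of FIG. 2 may differ, and the function names are hypothetical.

```python
def omega_path(src, dst, n_bits):
    """Lines occupied after each stage when routing src -> dst.

    Assumed scheme: shuffle (rotate the line label left by one bit), then
    the switch replaces the low bit with the next destination bit.
    """
    mask = (1 << n_bits) - 1
    line = src
    path = []
    for stage in range(n_bits):
        bit = (dst >> (n_bits - 1 - stage)) & 1
        line = ((line << 1) & mask) | bit
        path.append(line)
    return path


def blocked(requests, n_bits):
    """True if two requests need the same switch output line at some stage."""
    used = set()
    for src, dst in requests:
        for stage, line in enumerate(omega_path(src, dst, n_bits)):
            if (stage, line) in used:
                return True
            used.add((stage, line))
    return False
```

Under these assumptions, with eight ports (`n_bits = 3`), the requests 0 to 0 and 4 to 1 both need line 0 after the first stage, so one of them is blocked even though the two destination ports are distinct and idle.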
Peter A. Franaszek discloses in U.S. Pat. No. 4,752,777 a switching system which combines the features of a cross-point switch and a Delta switching network by providing a non-blocking cross-point switch for data transmission and by additionally providing a Delta network switch for switching control information between the input and output ports of the cross-point switch. FIG. 3 illustrates the basic design of the Franaszek switching system for the case where N is four. Each input port is connected to a respective input adaptor 30 and each output port is connected to an output adaptor 32. A cross-point switch 34 has four horizontal lines 36 connected to the input adaptors 30 and four vertical lines 38 connected to the output adaptors 32. At each intersection of a horizontal line 36 and a vertical line 38 is a cross-point that is individually selectable to make the connection between the respective horizontal line 36 and vertical line 38. A cross-point controller 40 is associated with each horizontal line 36 to control the cross-points of that horizontal line 36. This arrangement is horizontally partitioned because the controllers are associated with the input ports rather than the output ports. Each cross-point controller 40 is itself controlled by its associated input adaptor 30.
The cross-point switch 34 is used primarily for the selective transmission of data while a separate Delta network 42 is used primarily for the selective transmission of control information between the input adaptors 30 and the output adaptors 32. For N equal to four, two stages, each with two 2.times.2 switches 44, are required. The Delta network differs from that of FIG. 2 because each switch 44 has its own buffering and the adaptors 30 and 32 also require buffering. The fundamental problem in controlling the cross-point switch 34 is to ascertain whether the desired horizontal line 36 and vertical line 38 are available. The controller 40 of the horizontally-partitioned cross-point switch is easily able to decide if its associated horizontal line 36 is available, but it is more difficult for the controller 40 to know if the desired vertical line 38 is available or whether another controller 40 has connected a different cross-point to the desired vertical line 38. The Delta network 42 provides the means of obtaining this information.
When an input adaptor 30 receives a request from its input port I.sub.0 -I.sub.3 for a connection to a designated output port O.sub.0 -O.sub.3, the input adaptor 30 directs this request through the Delta network 42 to the designated output adaptor 32. The adaptor 32 keeps a record of the use of its associated vertical line 38. The request that the input adaptor 30 transmits to the output adaptor 32 is in the form of a control message S.sup.C.sub.ij, where i is the number of the input adaptor 30 and j is the number of the output adaptor 32. A control message S.sup.R.sub.ij returned to the input adaptor 30 from the output adaptor 32 provides information as to the time at which the input adaptor can initiate the sending of the message to the output adaptor. When that time arrives, the input adaptor instructs its associated controller 40 to make the cross-point connection (ij) in the cross-point switch 34 and the input adaptor 30 then proceeds to send its message. At the same time, the output adaptor 32 has prepared itself to receive the message designated by the senior member of the reservation queue.
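The request and reply exchange described above can be sketched as follows. This is an illustrative model only. The class name, the first-come-first-served ordering, and in particular the `duration` argument are assumptions: the description states only that the reply gives the time at which sending may begin, not how the output adaptor computes it.

```python
class OutputAdaptor:
    """Toy model of an output adaptor tracking use of its vertical line."""

    def __init__(self, j):
        self.j = j
        self.free_at = 0   # earliest time the vertical line is free
        self.queue = []    # reservation queue of (start_time, input_number)

    def request(self, i, now, duration):
        """Handle a control message S^C_ij from input adaptor i.

        Returns the start time that the reply S^R_ij would carry.
        `duration` (assumed message length) is not part of the description.
        """
        start = max(now, self.free_at)
        self.free_at = start + duration
        self.queue.append((start, i))
        return start
```

In this sketch, two input adaptors that request the same output at nearly the same moment receive different start times, so their messages never contend for the vertical line, and the oldest entry in `queue` identifies the message the output adaptor should expect next.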