With the declining cost of computer hardware, such as microprocessors and memory, and the increasing complexity of problems that require solution by a computer, parallel computing is becoming increasingly important. Parallel computers typically use tightly coupled multiprocessors, a collection of microprocessors that are interconnected by cables and run under a single operating system. This is in contrast to loosely coupled multicomputers where several uniprocessor computers, each having its own operating system, are connected in a network (such as Ethernet).
Tightly Coupled Processor
For reasons of efficiency, the hardware of a single microprocessor (hereinafter “processor”) in a tightly coupled multiprocessor is usually divided into the following two parts: (1) a processing unit (hereinafter the “PU”) that is used to execute the operations of a program being run on the parallel computer containing the multiprocessor; and (2) a switch that is used to handle communication between the processor and other processors in the computer. In each processor, the PU and the switch are logically coupled. Typically, the PU and the switch are electrically coupled.
Switch
Each switch has a certain number of external ports and internal ports. FIG. 1A illustrates a prior art processor 110 divided into a PU 112 and a switch 114, where switch 114 includes four external ports labeled E1, E2, E3, and E4, and two internal ports labeled I1 and I2. An external port of one switch can be connected to an external port of another switch by a cable. Only one cable can be connected to each external port. It is possible that an external port has no cable connected to it. The two internal ports, I1 and I2, connect the switch, such as switch 114, to the PU, such as PU 112. A switch has the capability of making internal connections between pairs of its own (internal and external) ports, thus making cable connections between different PUs.
A typical switch 114 includes at least four external ports, at least two internal ports, and the switching capability of a full crossbar, such that given an arbitrary pairing of the ports of the switch, the switch can be set to connect the two ports in each pair.
Switch Connections
Switches may be interconnected. For example, prior art FIG. 1B shows two processors (PU/switch combinations) 110 and 120, where processor 120 includes a PU 122 and a switch 124. Switch 114 is set to connect ports E1 and I1 and to connect ports E4 and I2, as shown in FIG. 1B. Switch 124 is set to connect ports E1 and I1, to connect ports E2 and I2, and to connect ports E3 and E4, as shown in FIG. 1B. Also shown is a cable 130 between port E4 of switch 114 and port E1 of switch 124. As a result, port I2 of switch 114 is connected to port I1 of switch 124, thus connecting the two PUs.
A connection between ports J and K is represented by the pair (J,K). A setting of a switch is a set of connections between its ports, such that each port appears in at most one connection pair. For example, the setting of switch 114 in FIG. 1B is represented by the set {(E1,I1), (E4,I2)}. The setting of switch 124 in this figure is represented by the set {(E1,I1), (E2,I2), (E3,E4)}.
The set of connections may be empty, indicating that no ports of the switch are connected to one another. Connections can be dynamically added to and removed from a switch setting. A connection can be removed at any time.
A connection can be added if and only if it does not use a port of the switch that is already in use by an existing connection. For example, the connection (E2,E3) can be added to the setting {(E1,I1), (E4,I2)} of switch 114 in FIG. 1B. But the connection (E1,E3) cannot be added because port E1 is already in use by the connection (E1,I1) in the setting for switch 114 in FIG. 1B.
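The rule for adding a connection can be illustrated with a minimal Python sketch. The representation of a setting as a set of port-name pairs and the function names `ports_in_use`, `can_add`, and `add_connection` are assumptions made for illustration only; they do not appear in the figures.

```python
from typing import Set, Tuple

Connection = Tuple[str, str]   # a pair (J, K) of port names
Setting = Set[Connection]      # each port appears in at most one pair


def ports_in_use(setting: Setting) -> Set[str]:
    """Return every port that appears in some connection of the setting."""
    return {port for pair in setting for port in pair}


def can_add(setting: Setting, connection: Connection) -> bool:
    """A connection can be added iff neither of its ports is already in use.

    Assumption: a connection joins two distinct ports.
    """
    a, b = connection
    return a != b and not ({a, b} & ports_in_use(setting))


def add_connection(setting: Setting, connection: Connection) -> Setting:
    """Return a new setting with the connection added, or raise ValueError."""
    if not can_add(setting, connection):
        raise ValueError(f"port already in use: {connection}")
    return setting | {connection}


# Setting of switch 114 in FIG. 1B.
switch_114 = {("E1", "I1"), ("E4", "I2")}
print(can_add(switch_114, ("E2", "E3")))  # E2 and E3 are both free
print(can_add(switch_114, ("E1", "E3")))  # E1 is used by (E1, I1)
```

Removing a connection needs no check, so in this representation it is a plain set difference.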
Interconnection Architecture
Due to physical constraints, each switch can have only a small number of ports, so a switch (and therefore its PU) can be directly connected to only a small number of other switches (PUs). It is possible that, due to both physical and electrical constraints, the length of each cable cannot exceed some specified amount. The way that the cables are placed between external ports forms the interconnection architecture of the computer: the placement of these cables is fixed. (Although the cables might be pluggable into the ports, if the placement of the cables is changed then this would constitute another interconnection architecture.)
Cellular Structure
The processors are typically arranged in a regular structure, often called a cellular structure. In one very common cellular structure, the processors are placed at the cells of a 1-, 2-, or 3-dimensional array. An array is defined by specifying the length of the computer in each dimension, where the length is given by the number of processors. In the case of a 2-dimensional array, for example, and naming the two dimensions X and Y, the array is specified by the length LX in the X dimension and the length LY in the Y dimension. The array contains a total of LX×LY processors. For example, FIG. 1C shows a 2-dimensional array 140 with LX=5 and LY=4 containing a total of 20 processors. Each processor in the array is identified by its coordinates in the array, as shown in FIG. 1C. These coordinates also identify the PU and the switch comprising the processor. In a 3-dimensional array, each processor (PU and switch) is identified by a triple (x,y,z) giving the coordinates of the processor in the X-, Y-, and Z-dimension, respectively.
Connecting External Ports of Switches
An interconnection architecture of the computer specifies the way that cables are placed between external ports of switches. Typically the cabling is done for each dimension separately. In the case of a 3-dimensional array, for example, the switch is divided into an X-switch, a Y-switch, and a Z-switch, each having its own four external ports and two internal ports. A cable can connect an external port of one switch to an external port of another switch only if the two switches have the same dimension (e.g. both are X-switches).
Again in keeping with the separation of dimensions, the computer is divided into 1-dimensional “lines” in each dimension. Within a line, all coordinates except one have a constant value, while the non-constant coordinate ranges over all possible values of that coordinate. For example, FIG. 1D shows the X-line 152 where the coordinate y is fixed at 1 and shows the Y-line 154 where the x coordinate is fixed at 4.
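As a small illustration of the definition of a line, the following Python sketch enumerates the processors of an X-line and a Y-line in a 2-dimensional array. The function names and the choice of coordinates starting at 0 are assumptions; the figures may number coordinates differently.

```python
from typing import List, Tuple


def x_line(y: int, lx: int) -> List[Tuple[int, int]]:
    """Processors of the X-line with fixed y: x ranges over all values.

    Assumption: coordinates run from 0 to lx - 1.
    """
    return [(x, y) for x in range(lx)]


def y_line(x: int, ly: int) -> List[Tuple[int, int]]:
    """Processors of the Y-line with fixed x: y ranges over all values."""
    return [(x, y) for y in range(ly)]


# Array dimensions as in FIG. 1C: LX = 5, LY = 4.
print(x_line(1, 5))  # the X-line with y fixed at 1
print(y_line(4, 4))  # the Y-line with x fixed at 4
```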
In order that the computer have a simple and regular structure, and using dimension X as an example, cables are placed only between switches that belong to the same X-line, and all X-lines in the computer typically have the same cabling structure. For example, in a 3-dimensional computer of length LX by LY by LZ, a cabling for the X dimension (the cables to be placed between X-switches belonging to the same X-line) is specified by a cabling of one line having length LX. The cabling of this one line is replicated for all X-lines in the computer. Thus, to specify a cabling architecture for a “regular” computer of this type, it suffices to specify three cablings, one for a line of length LX, one for a line of length LY, and one for a line of length LZ.
Mesh and Torus Interconnection Architectures
Two common prior art interconnection architectures are the mesh architecture and the torus architecture. For example, as shown in FIG. 1E, a prior art mesh architecture 160 includes switches 161, 162, 163, 164, 165, 166, 167, and 168. Also, for example, as shown in FIG. 1F, a prior art torus architecture 170 includes switches 171, 172, 173, 174, 175, 176, 177, and 178.
Again using dimension X as an example, in mesh architecture 160, the X-switches in an X-line are connected in a linear fashion, namely switches 161, 162, 163, 164, 165, 166, 167, and 168. In torus architecture 170, the X-switches are connected in a cyclic fashion, namely 171, 173, 175, 177, 178, 176, 174, 172, and back to 171. Although FIGS. 1E and 1F show a mesh and a torus for a line of length eight, it is clear how these can be extended for a line of arbitrary length.
Torus architecture 170 could be obtained from mesh architecture 160 by adding a cable between switch 161 and switch 168. However, this would likely violate a limitation on the length of a cable. To keep the cables short, the cycle is “folded” as shown in FIG. 1F.
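The mesh cabling and the folded torus cabling of a line can be sketched in Python as follows. Positions 1 through n stand in for the switches of a line (for n = 8, position 1 corresponds to switch 161 or 171, and so on). The function names, and the particular way the folding is extended to lengths other than eight (odd positions ascending, then even positions descending), are assumptions chosen to reproduce the order given for FIG. 1F.

```python
from typing import List, Tuple


def mesh_cables(n: int) -> List[Tuple[int, int]]:
    """Linear cabling: each position is cabled to the next one."""
    return [(i, i + 1) for i in range(1, n)]


def folded_torus_cycle(n: int) -> List[int]:
    """Cyclic order of a folded torus: odd positions up, even positions down."""
    odds = list(range(1, n + 1, 2))
    evens = list(range(2, n + 1, 2))
    return odds + evens[::-1]


def torus_cables(n: int) -> List[Tuple[int, int]]:
    """Cables between consecutive positions of the folded cycle.

    Assumption: n >= 3, so that the cycle has no repeated or self cables.
    """
    cycle = folded_torus_cycle(n)
    return [(cycle[i], cycle[(i + 1) % n]) for i in range(n)]


print(folded_torus_cycle(8))
# Longest cable, measured in positions, for the folded torus:
print(max(abs(a - b) for a, b in torus_cables(8)))
# Longest cable if the mesh were simply closed with a cable from 1 to 8:
print(abs(8 - 1))
```

Under these assumptions no cable of the folded torus spans more than two positions, whereas closing the mesh directly would require one cable spanning the whole line.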
Mesh and torus architectures are defined for 2- and 3-dimensional arrays by replicating the mesh and torus cabling of a line in FIGS. 1E and 1F to all the X-lines, Y-lines, and Z-lines in the computer, respectively.
Partitioning
One important factor in the usefulness of an interconnection architecture is the flexibility it has to partition the computer into several independent pieces. Partitioning is important to allow several programs, or “jobs”, to run on the computer simultaneously. When initiating the running of a job, a user specifies a “partition”, the part of the computer that will be dedicated to this job. A “user” can be either a human user or a part of the system software such as a job scheduler. A partition of a computer is a set of PUs that are being used by one job.
Specifying PUs
A partition P is specified by giving, for each dimension, a set PX of coordinates in the X dimension, a set PY of coordinates in the Y-dimension, and a set PZ of coordinates in the Z-dimension. Then the PU with coordinates (x,y,z) belongs to partition P if and only if x belongs to PX and y belongs to PY and z belongs to PZ. In other words, the set of coordinates of the PUs is the Cartesian product of PX, PY, and PZ. For example, in an 8-by-8-by-8 3-dimensional computer, a user might specify a partition by the set PX={3,4} in the X-dimension, the set PY={3,5} in the Y-dimension, and the set PZ={1} in the Z-dimension. The PUs that belong to this partition are the PUs with coordinates (3,3,1), (3,5,1), (4,3,1) and (4,5,1).
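The Cartesian-product rule can be sketched in a few lines of Python. The function name `partition_pus` is an assumption used only for illustration.

```python
from itertools import product
from typing import Iterable, List, Tuple


def partition_pus(px: Iterable[int], py: Iterable[int],
                  pz: Iterable[int]) -> List[Tuple[int, int, int]]:
    """Coordinates (x, y, z) of the PUs in the partition PX x PY x PZ."""
    return sorted(product(set(px), set(py), set(pz)))


# The example partition of the 8-by-8-by-8 computer.
print(partition_pus({3, 4}, {3, 5}, {1}))
```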
Partitions are formed and released dynamically as jobs start and finish, respectively. To prevent one job from interfering with another job, different jobs cannot use the same PU and different jobs cannot use the same cable. Different jobs can use the same switch, but the use of a switch is restricted by the requirement that different jobs cannot use the same PU or cable.
Specifying Connection Type
In addition to specifying the PUs in the partition, the user also specifies a connection type, or architecture, for the partition. Two very common connection types are the mesh architecture and the torus architecture. Specifying a connection type reflects the fact that if the user has obtained a partition of a computer, the user would like his or her partition to “look like” a smaller version of the entire computer.
Mesh Architecture
The mesh architecture, such as mesh architecture 160, has the desirable property that every partition can be interconnected as a (in general, smaller) mesh by setting the switches properly. For example, FIG. 1G shows how the switches would be set so that the partition {163,164,166,167} is interconnected as a prior art mesh via connections 180 through 190, namely internal couplings 180, 182, 183, 185, 187, 188, and 190 and external connections 181, 184, 186, and 189. FIG. 1G also illustrates that a connection between two PUs, such as PUs 164 and 166, can be made by two or more external connections in series, such as external connections 184 and 186. PU 165 may be “skipped” in this way if PU 165 (a) belongs to another existing partition or (b) is faulty. An external connection may be implemented with a cable, an optical fiber, or another type of electromagnetic coupling.
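One way such switch settings could be derived for a partition of a single mesh line is sketched below in Python. The port labels `E_low` and `E_high` (the external ports cabled toward the lower and higher neighbor) and the assignment of internal ports I1 and I2 are hypothetical; FIG. 1G may use different ports. Positions 3, 4, 6, and 7 stand in for 163, 164, 166, and 167.

```python
from typing import Dict, Iterable, List, Set, Tuple

EXT_LOW, EXT_HIGH = "E_low", "E_high"   # hypothetical external port labels


def mesh_partition_settings(
    partition: Iterable[int],
) -> Tuple[Dict[int, Set[Tuple[str, str]]], List[Tuple[int, int]]]:
    """Return (switch settings, cables used) for a mesh-connected partition.

    Members connect an internal port to each external port that leads to
    another member; non-members inside the span are passed through.
    """
    members = set(partition)
    if not members:
        return {}, []
    lo, hi = min(members), max(members)
    settings: Dict[int, Set[Tuple[str, str]]] = {}
    for pos in range(lo, hi + 1):
        if pos in members:
            setting = set()
            if pos > lo:
                setting.add((EXT_LOW, "I1"))    # link toward lower member
            if pos < hi:
                setting.add((EXT_HIGH, "I2"))   # link toward higher member
            settings[pos] = setting
        else:
            settings[pos] = {(EXT_LOW, EXT_HIGH)}  # skipped PU: pass-through
    cables = [(pos, pos + 1) for pos in range(lo, hi)]
    return settings, cables


settings, cables = mesh_partition_settings({3, 4, 6, 7})
print(settings[5])                                  # pass-through at the skipped position
print(sum(len(s) for s in settings.values()))       # number of internal couplings
print(len(cables))                                  # number of external connections
```

For this partition the sketch produces seven internal couplings and four external connections, the same counts as the couplings and connections listed for FIG. 1G.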
In greater generality, a multiplicity of partitions can exist simultaneously, with each one interconnected as a mesh, provided that two different partitions do not “overlap”. More precisely, define the span of a 1-dimensional partition (a set of coordinates) to be the set of coordinates lying between and including the smallest coordinate in the partition and the largest coordinate in the partition. For example, the span of the partition {163,164,166,167} is {163,164,165,166,167}. The requirement that two partitions do not overlap is that their spans do not contain a coordinate in common.
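The 1-dimensional span and the overlap test can be sketched in Python as follows; the function names `span` and `overlap` are assumptions for illustration.

```python
from typing import Iterable, Set


def span(partition: Iterable[int]) -> Set[int]:
    """All coordinates from the smallest to the largest member, inclusive."""
    members = set(partition)
    if not members:
        return set()
    return set(range(min(members), max(members) + 1))


def overlap(p: Iterable[int], q: Iterable[int]) -> bool:
    """Two 1-dimensional partitions overlap iff their spans share a coordinate."""
    return bool(span(p) & span(q))


print(sorted(span({163, 164, 166, 167})))
print(overlap({161, 162}, {163, 164, 166, 167}))   # disjoint spans
print(overlap({163, 167}, {165}))                  # 165 lies inside the span of {163, 167}
```

The last call shows that two partitions can overlap without sharing any PU, because the span also covers the coordinates that a mesh connection would pass through.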
The notion of overlapping partitions generalizes from the 1-dimensional case to multiple dimensions. In the case of three dimensions, for example, if a 3-dimensional partition P is defined by the Cartesian product of the coordinate sets PX, PY, and PZ, then the span of P is the Cartesian product of the span of PX, the span of PY, and the span of PZ. Two 3-dimensional partitions P and Q overlap if the span of P and the span of Q contain a coordinate in common. Because each span is a Cartesian product, the two spans share a coordinate triple exactly when they share a value in every dimension. Thus, if P is defined by PX, PY, and PZ, and if Q is defined by QX, QY, and QZ, then P and Q overlap if and only if PX and QX overlap, and PY and QY overlap, and PZ and QZ overlap.
Torus Architecture
The torus architecture, such as torus architecture 170 in FIG. 1F, does not have the desirable property that every partition in a multiplicity of non-overlapping partitions can be made to have the interconnection structure of a (smaller) torus. In fact, the property fails for any two partitions of size two or more. For example, the partition {171,172} can be interconnected as a torus, but only by using all of the cables in the line. Therefore, this partition cannot exist simultaneously with any other torus-interconnected partition of size at least two, for example, {176,177}.
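The cable usage behind this limitation can be illustrated with a hedged Python sketch. It assumes a simple model in which the members of a partition are linked, in cyclic order, along the arcs of the single ring of FIG. 1F; the function name `torus_partition_cables` and this routing model are assumptions, not a description of any particular switch implementation.

```python
from typing import Iterable, List, Set, Tuple


def torus_partition_cables(cycle: List[int],
                           partition: Iterable[int]) -> Set[Tuple[int, int]]:
    """Cables used when a partition is ring-connected along the cycle.

    Each member is linked to the next member in cyclic order by walking
    the ring forward; every cable walked over is recorded.
    """
    wanted = set(partition)
    members = [s for s in cycle if s in wanted]   # members in cyclic order
    if len(members) < 2:
        return set()
    index = {s: i for i, s in enumerate(cycle)}
    n = len(cycle)
    used: Set[Tuple[int, int]] = set()
    for k, start in enumerate(members):
        target = members[(k + 1) % len(members)]
        i = index[start]
        while True:
            j = (i + 1) % n
            used.add((cycle[i], cycle[j]))        # cable from cycle[i] to cycle[j]
            i = j
            if cycle[i] == target:
                break
    return used


ring = [171, 173, 175, 177, 178, 176, 174, 172]   # cyclic order of FIG. 1F
a = torus_partition_cables(ring, {171, 172})
b = torus_partition_cables(ring, {176, 177})
print(len(a), len(b))   # cables used by each partition
print(len(a & b))       # cables that both partitions would need
```

In this model each of the two partitions needs every cable of the ring, so the two cannot be torus-interconnected at the same time.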
Number of Connections Used
With PX being a partition of an X-line, and with NX being the number of coordinates in PX, if NX≧2, any torus interconnection of PX uses at least NX external connections, or cables, in the X-line. The same fact holds for the Y and Z dimensions.
Interval Partition
A 1-dimensional partition P of a line is an interval partition if P is a set of consecutive coordinates, such as {173,174,175,176,177}. A 3-dimensional partition P of an array is an interval partition if PX, PY, and PZ are all 1-dimensional interval partitions. P is an interval partition if and only if the span of P is the same as P itself.
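The interval test follows directly from the span characterization and can be sketched in Python as follows. The function names are assumptions, as is the choice to treat an empty set as not being an interval partition.

```python
from typing import Iterable


def is_interval_1d(partition: Iterable[int]) -> bool:
    """True iff the partition equals its own span (consecutive coordinates).

    Assumption: an empty set is not regarded as an interval partition.
    """
    members = set(partition)
    if not members:
        return False
    return members == set(range(min(members), max(members) + 1))


def is_interval_3d(px: Iterable[int], py: Iterable[int],
                   pz: Iterable[int]) -> bool:
    """A 3-dimensional partition is an interval iff PX, PY, and PZ all are."""
    return all(is_interval_1d(p) for p in (px, py, pz))


print(is_interval_1d({173, 174, 175, 176, 177}))   # consecutive coordinates
print(is_interval_1d({163, 164, 166, 167}))        # 165 is missing
print(is_interval_3d({3, 4}, {3, 5}, {1}))         # PY = {3, 5} is not an interval
```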
Therefore, a method and system of interconnecting processors of a parallel computer to facilitate torus partitioning is needed.