This application relates to the field of parallel computing and networking, and more specifically to all-to-all personalized exchange in multistage networks.
Communication among processors may be a design issue when, for example, a parallel processing system is built, or a parallel procedure is designed. With advances in silicon and Ga-As technologies, processor speed may soon reach the gigahertz (GHz) range. Traditional metal-based communication technology used in parallel computing systems is becoming a potential bottleneck. Progress in traditional interconnects, or in new interconnect technologies such as optics, would be well received in the parallel computing systems community.
Advances in electro-optic technologies have made optical communication a promising networking choice to meet the increasing demands for high channel bandwidth and low communication latency of high-performance computing/communication applications. Fiber optic communications offer a combination of high bandwidth, low error probability, and gigabit transmission capacity. They have been extensively used in wide-area networks and have received much attention in the parallel processing community as well. In fact, many commercial, massively parallel computers, such as the Cray C90, use optical technology in their communication subsystems.
Using new optical technologies in parallel computers may require us to reexamine the design of interconnection networks, and use of parallel processing procedures. Exploring the capabilities of the optical technology may involve a careful analysis of the properties of optics, a proposal of new performance measures, a design of new interconnection networks and routing procedures, and new parallel processing application procedures.
Multistage interconnection networks (MINs), hereafter also referred to as multistage networks, have been used for interconnecting purposes in parallel computing systems. A MIN can be blocking, such as a Banyan network; rearrangeably non-blocking, such as a Benes network; or non-blocking, such as a crossbar. Additionally, a MIN may have variable connecting capabilities, from rearrangeable for permutation to non-blocking for multicast, such as in a Clos network, depending on the number of stages, the number of switches, the switch capability, and the interconnection patterns used between stages.
As optical technology advances, there has been a growing interest in using optical technology for implementing interconnection networks and switches. Although electronic MINs and optical MINs have many similarities, there are some fundamental differences between them. Because of some unique properties of optics, traditional routing procedures and results may not be applicable to optical MINs. New research addressing optical MINs may be useful.
In the communication pattern known as all-to-all personalized exchange, every processor in a processor group sends a distinct message to every other processor in the group. All-to-all personalized exchange occurs in many important parallel computing/networking applications, such as matrix transposition and fast Fourier transform (FFT).
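As a minimal illustration of the pattern itself (independent of any network), all-to-all personalized exchange can be viewed as transposing a table of messages. The names `outbox`, `inbox`, and `all_to_all_exchange` are hypothetical, and the diagonal entry (a processor's message to itself) is kept only so that the table stays square.

```python
def all_to_all_exchange(outbox):
    """outbox[i][j] is the message processor i holds for processor j.

    Returns inbox, where inbox[j][i] is the message processor j received
    from processor i. The exchange is a transposition of the message table.
    """
    n = len(outbox)
    return [[outbox[i][j] for i in range(n)] for j in range(n)]


n = 4
# Each message is distinct ("personalized"): it names its sender and receiver.
outbox = [[f"m{i}->{j}" for j in range(n)] for i in range(n)]
inbox = all_to_all_exchange(outbox)
```

Matrix transposition, mentioned above as an application, is the case where `outbox[i][j]` is the block of the matrix that processor i must hand to processor j.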
The issue of realizing all-to-all personalized exchange in optical multistage networks is examined below. A basic element of optical switching networks is a directional coupler with two inputs and two outputs (hereafter referred to simply as switches). Depending on the control voltage applied to it, an input optical signal is coupled to either of the two outputs, setting the switch to either the parallel or the crossing state. A class of topologies that can be used to construct optical networks is multistage interconnection networks (MINs), which interconnect their inputs and outputs via several stages of switches.
Although optical multistage networks hold great promise and have demonstrated advantages over their electronic counterparts, they also introduce new challenges, such as how to deal with the unique problem of avoiding crosstalk in the optical switches, which may occur when two signal channels in a switch interact with each other.
There are two ways in which optical signals can interact in a planar switching network. The channels carrying the signals may cross each other in order to embed a particular topology. Alternatively, two paths sharing a switch may experience some undesired coupling from one path to another within a switch. It would be desirable to achieve optical exchange in multistage networks in a manner that reduces crosstalk.
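The switch behavior and the coupling hazard described above can be sketched as follows. This is an illustrative model only: the state names `"parallel"` and `"cross"`, the use of `None` for an idle port, and both function names are assumptions, and the model ignores channel crossings between switches.

```python
from typing import Optional, Tuple

Signal = Optional[str]  # None means the port is idle


def switch_2x2(state: str, in0: Signal, in1: Signal) -> Tuple[Signal, Signal]:
    """Model a 2x2 switch: 'parallel' passes straight through, 'cross' swaps."""
    if state == "parallel":
        return in0, in1
    if state == "cross":
        return in1, in0
    raise ValueError(f"unknown switch state: {state!r}")


def crosstalk_possible(in0: Signal, in1: Signal) -> bool:
    """Two signals sharing one switch may couple, so flag that case."""
    return in0 is not None and in1 is not None
```

Under this model, a pass avoids switch crosstalk when `crosstalk_possible` is false for every switch the pass uses.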
Methods and systems are presented below involving optical multistage interconnection networks (MINs). Although optical MINs hold great promise and have demonstrated advantages over their electronic counterparts, they also present their own problems. Due to certain optical properties, crosstalk in optical switches should be avoided if they are to work efficiently. The concept of a semi-permutation is introduced to analyze the permutation capability of optical MINs under the constraint of avoiding crosstalk in several types of MINs. In particular, an optimal scheme for realizing crosstalk-free all-to-all personalized exchange in a class of unique-path, self-routing optical multistage networks is presented.
The basic idea of realizing all-to-all personalized exchange in such a multistage network is to transform it to a collection of the above-mentioned semi-permutations. Each of the semi-permutations can be realized crosstalk-free in a single pass (i.e., in a single, concurrent exchange of signals) and can take advantage of pipelined message transmission in consecutive passes.
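One way to make the crosstalk-free condition concrete is to simulate routing and check that no switch carries two messages in the same pass. The sketch below assumes an omega network (a perfect shuffle ahead of each of m stages) with destination-tag routing, and assumes, purely for illustration, the permutation k -> k XOR 5 and a split of senders by the parity of their address bits. None of these choices is stated in the text above, and all function names are hypothetical.

```python
def rotl(x, m):
    """Perfect shuffle on m-bit addresses: rotate the bits left by one."""
    return ((x << 1) | (x >> (m - 1))) & ((1 << m) - 1)


def switch_path(src, dst, m):
    """Switch index visited at each stage under destination-tag routing."""
    pos, path = src, []
    for stage in range(m):
        pos = rotl(pos, m)                   # shuffle link ahead of the stage
        path.append(pos >> 1)                # ports 2k and 2k+1 share switch k
        bit = (dst >> (m - 1 - stage)) & 1   # destination bit, most significant first
        pos = (pos & ~1) | bit               # leave on output 0 or output 1
    assert pos == dst
    return path


def is_crosstalk_free(mapping, m):
    """True if no switch in any stage carries more than one message."""
    seen = set()
    for src, dst in mapping.items():
        for stage, sw in enumerate(switch_path(src, dst, m)):
            if (stage, sw) in seen:
                return False
            seen.add((stage, sw))
    return True


m = 3
n = 1 << m
full = {k: k ^ 5 for k in range(n)}   # assumed full permutation: n messages at once
# Assumed split: keep only senders whose address has an even number of 1 bits.
half = {k: d for k, d in full.items() if bin(k).count("1") % 2 == 0}
```

In this model the full permutation necessarily shares switches, since n messages pass through n/2 switches per stage, while the half-size mapping can be checked with `is_crosstalk_free(half, m)`.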
More specifically, a method for crosstalk-free all-to-all exchange in an optical multistage network having inputs and outputs coupled to processors is presented. The method comprises sending messages between the processors in multiple passes, wherein, in each of the multiple passes, each of the processors transmits, in one-to-one fashion, a message to one of the processors by way of the inputs and outputs, in accord with semi-permutations decomposed from permutations corresponding to rows of a matrix. The optical multistage network may be one of a baseline network, omega network, Banyan network, and their reverse networks.
In each of the multiple passes, each of the processors may transmit, in one-to-one fashion, a message to one of the processors by way of the inputs and outputs, in accord with semi-permutations decomposed from permutations corresponding to rows of a Latin square. The semi-permutations may be obtained by computing two input sets.
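The reason a Latin square suits this purpose can be checked directly: if row i is read as the permutation k -> a[i][k], then the n rows together cover every (sender, receiver) pair exactly once. The sketch below assumes one particular Latin square, a[i][j] = i XOR j, which is valid when n is a power of two; it is an illustrative choice, not necessarily the square the method computes, and the function names are hypothetical.

```python
def xor_latin_square(n):
    """a[i][j] = i XOR j; a Latin square when n is a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return [[i ^ j for j in range(n)] for i in range(n)]


def is_latin_square(a):
    """Every row and every column must contain each of 0..n-1 exactly once."""
    n = len(a)
    symbols = set(range(n))
    rows_ok = all(len(row) == n and set(row) == symbols for row in a)
    cols_ok = rows_ok and all(
        {a[i][j] for i in range(n)} == symbols for j in range(n)
    )
    return rows_ok and cols_ok


def pairs_covered(a):
    """(sender, receiver) pairs when row i is the permutation k -> a[i][k]."""
    return {(k, row[k]) for row in a for k in range(len(a))}


square = xor_latin_square(8)
```

The column property is what guarantees coverage: sender k meets a different receiver in every row.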
A method is also presented for crosstalk-free all-to-all exchange in an optical multistage network having n inputs and n outputs coupled to n processors, where $n \in \{2, 4, 8, 16, \ldots\}$ and wherein each of the n processors is connected to one of the n inputs and one of the n outputs, comprising computing a Latin square having n rows and n columns; associating the n rows with n admissible permutations, each of the n admissible permutations being a one-to-one mapping from $N=\{0, 1, \ldots, n-1\}$ to itself; for j a member of the set $\{1, 2, \ldots, n\}$, decomposing a jth permutation, from among the n admissible permutations, into two semi-permutations, one of the two semi-permutations, $s^{(j)}$, being a restriction of said mapping to a subset, $S^{(j)}$, of $N$ having $n/2$ elements, and another of the two semi-permutations, $t^{(j)}$, being a restriction of said mapping to a subset, $T^{(j)}$, of $N$, where $T^{(j)}$ is the complement $N \setminus S^{(j)}$; sending $n/2$ messages in a $(2j-1)$th pass, and $n/2$ messages in a $(2j)$th pass, each of the messages departing from one of the n processors, traveling through one of the n inputs and one of the n outputs, and arriving at one of the n processors, wherein a kth processor from among the n processors sends a message to an lth processor from among the n processors in the $(2j-1)$th pass if, and only if, $s^{(j)}(k)=l$, and a qth processor from among the n processors sends a message to an rth processor from among the n processors in the $(2j)$th pass if, and only if, $t^{(j)}(q)=r$; and for different $j=1, \ldots, n$, repeating said step of decomposing a jth permutation from among the n admissible permutations until all of the n admissible permutations have been decomposed; for different $j=1, \ldots, n$, repeating the step of sending $n/2$ messages in a $(2j-1)$th pass, and $n/2$ messages in a $(2j)$th pass until $n^2$ messages have been sent, and $2n$ passes have occurred, corresponding to $2n$ decompositions of the n admissible permutations.
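The pass structure of this method can be sketched as follows. The sketch assumes the XOR Latin square and, as a simplification, uses the same input set S (addresses with an even number of 1 bits) for every j, whereas the text allows S(j) to depend on j. It only demonstrates the counting (2n passes, n/2 messages each, n² messages in total) and makes no claim about crosstalk on any particular network; `build_passes` and `popcount_parity` are hypothetical names.

```python
def popcount_parity(k):
    """Parity (0 or 1) of the number of 1 bits in k."""
    return bin(k).count("1") % 2


def build_passes(latin_square):
    """Return 2n passes; each pass is a dict {sender: receiver} with n/2 entries."""
    n = len(latin_square)
    S = [k for k in range(n) if popcount_parity(k) == 0]   # assumed input set
    T = [k for k in range(n) if popcount_parity(k) == 1]   # its complement N \ S
    passes = []
    for row in latin_square:                     # row j is the permutation k -> row[k]
        passes.append({k: row[k] for k in S})    # pass 2j-1: semi-permutation s(j)
        passes.append({k: row[k] for k in T})    # pass 2j:   semi-permutation t(j)
    return passes


n = 8
square = [[i ^ j for j in range(n)] for i in range(n)]     # assumed Latin square
passes = build_passes(square)
delivered = [(k, r) for p in passes for k, r in p.items()]
```

Because each row is a permutation, the receivers within any one pass are distinct, so every pass is one-to-one.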
Computing a Latin square having n rows and n columns may include computing a Latin square off-line. The optical multistage network may be one of a baseline network, omega network, Banyan network, and their reverse networks. The optical multistage network may include $m=\log_2 n$ stages, each having one switch with two switch settings, interspersed by $m-1$ interstage links, wherein computing a Latin square may include computing a Latin square so that each of the n rows corresponds, in one-to-one fashion, with a configuration of switch settings. The method may further comprise composing each of the n admissible permutations as a composition of $2m-1$ permutations, a jth permutation from among the n admissible permutations composed as
\[
\sigma^{(j)}_{m-1}\,\pi^{(j)}_{m-2}\,\sigma^{(j)}_{m-2} \cdots \pi^{(j)}_{0}\,\sigma^{(j)}_{0},
\]
such that to each of the m stages there is associated a stage permutation and to each of the $m-1$ interstage links there is associated an interstage link permutation, where the stage permutation corresponding to the jth permutation from among the n admissible permutations, and associated with an ith stage, is denoted by $\sigma^{(j)}_{i}$, with $i=0, \ldots, m-1$, and the interstage link permutation corresponding to the jth permutation from among the n admissible permutations, and associated with an ith interstage link, is denoted by $\pi^{(j)}_{i}$, with $i=0, \ldots, m-2$.
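The composition of stage and interstage link permutations can be sketched as below. The sketch assumes that switch k of a stage joins ports 2k and 2k+1 and that the rightmost factor of the product acts first; the link pattern in the example (a perfect shuffle on four ports) is an arbitrary illustrative choice, since the actual link permutations depend on the network type. The function names are hypothetical.

```python
def stage_permutation(settings):
    """Stage permutation: switch k joins ports 2k and 2k+1; True means crossing."""
    perm = []
    for k, crossing in enumerate(settings):
        perm += [2 * k + 1, 2 * k] if crossing else [2 * k, 2 * k + 1]
    return perm


def compose(stage_settings, link_perms):
    """Overall permutation built from m stages and m-1 interstage links.

    The rightmost factor acts first, so a signal passes stage 0, link 0,
    stage 1, and so on, for 2m-1 factors in total.
    """
    m = len(stage_settings)
    assert len(link_perms) == m - 1
    n = 2 * len(stage_settings[0])
    result = list(range(n))            # result[x] = current port of input x
    for i in range(m):
        sigma = stage_permutation(stage_settings[i])
        result = [sigma[p] for p in result]
        if i < m - 1:
            result = [link_perms[i][p] for p in result]
    return result


shuffle = [0, 2, 1, 3]                 # assumed link pattern for n = 4
rho = compose([[True, False], [False, True]], [shuffle])
```

Each distinct configuration of switch settings yields one overall permutation, which is the correspondence the text draws between rows of the Latin square and switch configurations.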
Also presented below is a method of achieving all-to-all crosstalk-free exchange in an optical multistage network, said network having an even number, $n \geq 2$, of processors, comprising computing an $n \times n$ matrix
\[
\begin{bmatrix}
a_{0,0} & a_{0,1} & \cdots & a_{0,n-1} \\
a_{1,0} & a_{1,1} & \cdots & a_{1,n-1} \\
\vdots & \vdots & \vdots & \vdots \\
a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,n-1}
\end{bmatrix}
\]
such that each entry of the matrix is chosen from the set $N=\{0, 1, \ldots, n-1\}$, and such that the members of each row equal the set $N$, and the members of each column equal the set $N$; mapping the matrix to a column vector of permutations,
\[
\begin{bmatrix}
\rho^{(0)} \\
\rho^{(1)} \\
\vdots \\
\rho^{(n-1)}
\end{bmatrix}
\]
where a jth entry of the column vector of permutations, $\rho^{(j)}$, is given by
\[
\begin{pmatrix}
0 & 1 & 2 & \cdots & n-1 \\
a_{j,0} & a_{j,1} & a_{j,2} & \cdots & a_{j,n-1}
\end{pmatrix};
\]
decomposing each permutation, $\rho^{(j)}$, into two semi-permutations, $s^{(j)}$ and $t^{(j)}$, each of which can be realized crosstalk-free, given by
\[
s^{(j)} =
\begin{pmatrix}
b_{j,0} & b_{j,1} & \cdots & b_{j,n/2-1} \\
c_{j,0} & c_{j,1} & \cdots & c_{j,n/2-1}
\end{pmatrix}
\quad \text{and} \quad
t^{(j)} =
\begin{pmatrix}
d_{j,0} & d_{j,1} & \cdots & d_{j,n/2-1} \\
e_{j,0} & e_{j,1} & \cdots & e_{j,n/2-1}
\end{pmatrix},
\]
where $s^{(j)}$ is a restriction of the permutation $\rho^{(j)}$ to a subset, $S^{(j)}$, of $N$ having $n/2$ elements, and $t^{(j)}$ is a restriction of the permutation $\rho^{(j)}$ to a subset, $T^{(j)}$, of $N$, where $T^{(j)}$ is the complement $N \setminus S^{(j)}$; in a first pass, sending messages from processor $b_{0,j}$ to processor $c_{0,j}$ for $j=0, \ldots, n/2-1$; in a second pass, sending messages from processor $d_{0,j}$ to processor $e_{0,j}$ for $j=0, \ldots, n/2-1$; in a third pass, sending messages from processor $b_{1,j}$ to processor $c_{1,j}$ for $j=0, \ldots, n/2-1$; and in a fourth pass, sending messages from processor $d_{1,j}$ to processor $e_{1,j}$ for $j=0, \ldots, n/2-1$.
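As a small worked illustration for n = 4, consider the following matrix and subsets. Both are assumptions chosen for the example rather than values given above, and whether a particular choice of subsets can be realized crosstalk-free depends on the network topology. Take

\[
\begin{bmatrix}
0 & 1 & 2 & 3 \\
1 & 0 & 3 & 2 \\
2 & 3 & 0 & 1 \\
3 & 2 & 1 & 0
\end{bmatrix},
\qquad
\rho^{(1)} =
\begin{pmatrix}
0 & 1 & 2 & 3 \\
1 & 0 & 3 & 2
\end{pmatrix}.
\]

Every row and every column of the matrix contains each member of $N=\{0,1,2,3\}$ exactly once. Choosing $S^{(1)}=\{0,3\}$, and hence $T^{(1)}=N \setminus S^{(1)}=\{1,2\}$, gives

\[
s^{(1)} =
\begin{pmatrix}
0 & 3 \\
1 & 2
\end{pmatrix}
\quad \text{and} \quad
t^{(1)} =
\begin{pmatrix}
1 & 2 \\
0 & 3
\end{pmatrix},
\]

so that $b_{1,0}=0$, $b_{1,1}=3$, $c_{1,0}=1$, $c_{1,1}=2$, $d_{1,0}=1$, $d_{1,1}=2$, $e_{1,0}=0$, and $e_{1,1}=3$. In the third pass, processors 0 and 3 send to processors 1 and 2, respectively; in the fourth pass, processors 1 and 2 send to processors 0 and 3, respectively.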
Also presented below is a method of achieving all-to-all crosstalk-free exchange in an optical multistage network, said network having an even number, $n \geq 2$, of processors, comprising computing an $n \times n$ matrix
\[
\begin{bmatrix}
a_{0,0} & a_{0,1} & \cdots & a_{0,n-1} \\
a_{1,0} & a_{1,1} & \cdots & a_{1,n-1} \\
\vdots & \vdots & \vdots & \vdots \\
a_{n-1,0} & a_{n-1,1} & \cdots & a_{n-1,n-1}
\end{bmatrix}
\]
such that each entry of the matrix is chosen from the set $N=\{0, 1, \ldots, n-1\}$, and such that the members of each row equal the set $N$, and the members of each column equal the set $N$; mapping the matrix to a column vector of permutations
\[
\begin{bmatrix}
\rho^{(0)} \\
\rho^{(1)} \\
\vdots \\
\rho^{(n-1)}
\end{bmatrix}
\]
where a jth entry of the column vector of permutations, $\rho^{(j)}$, is given by
\[
\begin{pmatrix}
0 & 1 & 2 & \cdots & n-1 \\
a_{j,0} & a_{j,1} & a_{j,2} & \cdots & a_{j,n-1}
\end{pmatrix};
\]
decomposing each permutation, $\rho^{(j)}$, into two semi-permutations, $s^{(j)}$ and $t^{(j)}$, each of which can be realized crosstalk-free, given by
\[
s^{(j)} =
\begin{pmatrix}
b_{j,0} & b_{j,1} & \cdots & b_{j,n/2-1} \\
c_{j,0} & c_{j,1} & \cdots & c_{j,n/2-1}
\end{pmatrix}
\quad \text{and} \quad
t^{(j)} =
\begin{pmatrix}
d_{j,0} & d_{j,1} & \cdots & d_{j,n/2-1} \\
e_{j,0} & e_{j,1} & \cdots & e_{j,n/2-1}
\end{pmatrix},
\]
where $s^{(j)}$ is a restriction of the permutation $\rho^{(j)}$ to a subset, $S^{(j)}$, of $N$ having $n/2$ elements, and $t^{(j)}$ is a restriction of the permutation $\rho^{(j)}$ to a subset, $T^{(j)}$, of $N$, where $T^{(j)}$ is the complement $N \setminus S^{(j)}$; in a $(2i-1)$th pass, sending messages from processor $b_{i-1,j}$ to processor $c_{i-1,j}$ for $j=0, \ldots, n/2-1$, for $i=1, \ldots, n$; in a $(2i)$th pass, sending messages from processor $d_{i-1,j}$ to processor $e_{i-1,j}$ for $j=0, \ldots, n/2-1$, for $i=1, \ldots, n$.
A system for all-to-all crosstalk-free exchange is also presented below that includes an optical multistage network associated with a Latin square having n columns and n rows, wherein the optical multistage network includes an even number, $n \geq 2$, of processors coupled to said processors; instructions for said processors to associate the n rows with n admissible permutations, each of the n admissible permutations being a one-to-one mapping from $N=\{0, 1, \ldots, n-1\}$ to itself; for j a member of the set $\{1, 2, \ldots, n\}$, instructions for said processors to decompose a jth permutation from among the n admissible permutations into two semi-permutations, one of the two semi-permutations, $s^{(j)}$, being a restriction of said mapping to a subset, $S^{(j)}$, of $N$ having $n/2$ elements, and another of the two semi-permutations, $t^{(j)}$, being a restriction of said mapping to a subset, $T^{(j)}$, of $N$, where $T^{(j)}$ is the complement $N \setminus S^{(j)}$; instructions for said processors to initiate the transmittal of $n/2$ messages in a $(2j-1)$th pass, and $n/2$ messages in a $(2j)$th pass, each of the messages departing from one of the n processors, traveling through one of the n inputs and one of the n outputs, and arriving at one of the n processors, wherein a kth processor from among the n processors sends a message to an lth processor from among the n processors in the $(2j-1)$th pass if, and only if, $s^{(j)}(k)=l$, and a qth processor from among the n processors sends a message to an rth processor from among the n processors in the $(2j)$th pass if, and only if, $t^{(j)}(q)=r$; and for different $j=1, \ldots, n$, instructions for said processors to continue to decompose a jth permutation from among the n admissible permutations until all of the n admissible permutations have been decomposed; and for different $j=1, \ldots, n$, instructions for said processors to continue to send $n/2$ messages in a $(2j-1)$th pass, and $n/2$ messages in a $(2j)$th pass until $n^2$ messages have been sent, and $2n$ passes have occurred, corresponding to $2n$ decompositions of the n admissible permutations.
The optical multistage network may include one of a Banyan network, an omega network, a baseline network, and their reverse networks. The instructions for the processors to decompose a jth permutation from among the n admissible permutations into two semi-permutations may include computing two input sets in a time of the order O(n).
A system for crosstalk-free all-to-all exchange in an optical multistage network is also presented that comprises processors; and instructions for said processors to send messages between the processors in multiple passes, wherein, in each of the multiple passes, each of the processors transmits, in one-to-one fashion, a message to one of the processors in accord with semi-permutations decomposed from permutations corresponding to rows of a matrix. The matrix may be a Latin square.
The instructions for said processors to send messages between the processors may include instructions to compute input sets from which the semi-permutations may be obtained in a time on the order of O(n), where the Latin square is an n×n matrix. Instructions may be included to compute two input sets from which the semi-permutations may be obtained.