There are many computing problems that are amenable to parallel processing. In parallel processing on a parallel computing system, an overall problem is typically divided into multiple sub-problems, or processes, each of which is then assigned to run on a particular processor of a number of processors. Since the processors can then execute in parallel, rather than in serial, the overall problem is solved more quickly than with a single processor. Applications that have been run on parallel processing computer systems include cryptanalysis, weather forecasting, and many kinds of simulations.
In many parallel processing applications, a process running on one processor will need results from another process running on another processor. For example, a logic simulation of a microprocessor system may be divided into a process simulating a RALU, another simulating a control unit, a third simulating a first level cache of a memory system, and a fourth simulating upper level memory. From time to time, the process simulating the RALU may need to receive data from, and send data to, the process simulating the first level cache, and, when the simulated cache scores a “miss”, the cache process will need to communicate with the process simulating the upper level memory.
It has been found that rapid, reliable communication between processors within a parallel computing system is essential to successful execution.
A massively-parallel computer system is one in which there are large numbers of processors, each of which has at least some program and data memory associated with it, typically operating in a multiple-instruction, multiple-data (MIMD) processing model.
In provisional patent application Ser. No. 13/425,136, a parallel computing system is described that is adapted to communicate with “scatter-gather” and “all-to-all” operations. In the scatter-gather operation, a message is sent from a first processor of the system to other processors of the system; the message either includes data being sent to those processors, or includes a request for the other processors to return specific data to the first processor.
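The two kinds of scatter-gather message described above can be illustrated with a minimal Python sketch. It is not the implementation of the referenced application: the `Processor` class, the `"data"` and `"request"` message tags, and the `scatter` and `gather` helpers are hypothetical names introduced here, and an in-process function call stands in for the communication network.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical processor with a rank and a small local data memory."""
    rank: int
    memory: dict = field(default_factory=dict)

    def handle(self, message):
        """Process one message; return a reply only for a data request."""
        kind, body = message
        if kind == "data":
            # The message carries data being sent to this processor.
            self.memory.update(body)
            return None
        if kind == "request":
            # The message asks this processor to return specific data.
            return {key: self.memory.get(key) for key in body}
        raise ValueError("unknown message kind: %r" % (kind,))


def scatter(others, data_for):
    """First processor sends each other processor its share of the data."""
    for proc in others:
        proc.handle(("data", data_for(proc.rank)))


def gather(others, keys):
    """First processor requests `keys` from every other processor."""
    return {proc.rank: proc.handle(("request", keys)) for proc in others}
```

For example, after `scatter(procs, lambda rank: {"x": rank * 10})`, a call to `gather(procs, ["x"])` returns the stored value of `x` from each processor, keyed by rank.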
Traditionally, the all-to-all communication exchange uses the binomial-tree multicast model. FIG. 69 illustrates an example of a binomial-tree multicast all-to-all exchange, showing the first of four tree broadcast interactions. Each processing element performs an iteration of the exchange. If “n” is the total number of processing units, then the exchange takes “n log2 n” (see: FIG. 1) communication steps to complete, where log2 is the base-2 logarithm. As can be seen in FIG. 69, each communication step consists of a pair of processing units in communication.
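The step count can be checked with a short Python sketch of a binomial-tree broadcast schedule. The function names are hypothetical, and the sketch rests on several assumptions: n is a power of two, the n tree broadcasts run one after another, and each round of a tree broadcast is counted as one communication step.

```python
def binomial_broadcast_rounds(root, n):
    """Return the rounds of a binomial-tree broadcast from `root` among n units.

    Each round is a list of (sender, receiver) pairs. Assumes n is a
    power of two; ranks are taken relative to the root, modulo n.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    rounds = []
    have = 1  # number of units (by relative rank) already holding the message
    while have < n:
        # Every unit that holds the message forwards it to one new partner,
        # so each round is made up of independent sender/receiver pairs.
        pairs = [((root + r) % n, (root + r + have) % n) for r in range(have)]
        rounds.append(pairs)
        have *= 2
    return rounds


def all_to_all_steps(n):
    """Count rounds when every unit in turn acts as the broadcast root."""
    return sum(len(binomial_broadcast_rounds(root, n)) for root in range(n))
```

With four units, `binomial_broadcast_rounds(0, 4)` gives two rounds, `[(0, 1)]` and then `[(0, 2), (1, 3)]`, and `all_to_all_steps(4)` gives 8, which matches n log2 n for n = 4.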
It is noted that pair-wise communication is considered safe, as loop-back and other checks, including parity or other checksum checks combined with acknowledgment packets, can be performed on the data to ensure that it arrived unchanged, and a retry can be initiated if corruption occurs. Unacknowledged broadcast communications are considered unsafe, since the sending processor may not recognize and correct communication errors.
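A minimal Python sketch of such a checked pair-wise transfer follows. It assumes a CRC-32 checksum appended to each packet, treats a successful check at the receiver as the acknowledgment, and uses a hypothetical `flaky_channel_factory` to simulate corruption; none of these details are taken from the source.

```python
import zlib


def make_packet(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def receive(packet: bytes):
    """Return the payload if the checksum matches (acknowledge), else None."""
    payload, tail = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == tail:
        return payload
    return None


def send_with_retry(payload: bytes, channel, max_tries: int = 3):
    """Send over `channel` (a callable bytes -> bytes) until acknowledged.

    Returns (delivered_payload, attempts_used).
    """
    for attempt in range(1, max_tries + 1):
        delivered = receive(channel(make_packet(payload)))
        if delivered is not None:
            return delivered, attempt
    raise IOError("no acknowledgment after %d tries" % max_tries)


def flaky_channel_factory(bad_sends: int = 1):
    """Hypothetical channel that flips one bit in the first `bad_sends` packets."""
    state = {"left": bad_sends}

    def channel(packet: bytes) -> bytes:
        if state["left"] > 0:
            state["left"] -= 1
            return bytes([packet[0] ^ 0x01]) + packet[1:]
        return packet

    return channel
```

Here the sender learns of the corruption because no acknowledgment comes back, and it retries. An unacknowledged broadcast has no equivalent return path, which is why the sender cannot detect the error.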