Three-dimensional Fast Fourier Transforms (FFTs) are critical to a number of numerical algorithms, in particular the group of methods used in N-body simulations of systems with electrostatic forces, termed “Particle-Mesh” or “Particle-Particle-Particle-Mesh” methods. Because multidimensional FFTs are computationally intensive, they are often calculated on large, massively parallel networks, such as in a distributed computing environment. The implementation of an FFT on a network having a distributed memory, however, raises certain problems. First, a distributed computing network requires communication of instructions and data between nodes, which is computationally costly and time-consuming. Second, a network having a distributed memory requires management of memory access across that memory. Third, the computation of an FFT on such a network requires appropriate distribution of the work associated with calculating the FFT among the multiple nodes comprising the network.
One approach to this problem is the “slab” decomposition, which allows scaling (or distribution of work) among N nodes for a three-dimensional N×N×N matrix of input data. This approach, however, does not allow scaling beyond N nodes, because each node must hold at least one complete two-dimensional slab of the data. Therefore, a need exists to overcome the problems with the prior art as discussed above, and particularly for a way to make the computation of an FFT on a distributed memory network more efficient.
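The slab decomposition discussed above, and its limit of N nodes, can be illustrated with a minimal single-process sketch. It assumes NumPy is available and only simulates the nodes as list entries; no real message passing takes place. The function name `slab_fft3d` and the parameter `num_nodes` are hypothetical names chosen for this illustration, not part of any described implementation.

```python
import numpy as np


def slab_fft3d(data, num_nodes):
    """Simulate a slab-decomposed 3D FFT of an N x N x N array.

    Hypothetical illustration: each "node" is just one entry in a list.
    """
    n = data.shape[0]
    if num_nodes > n:
        # Each node needs at least one full 2D slab, so N is the ceiling.
        raise ValueError("slab decomposition cannot use more than N nodes")

    # Step 1: cut the array into slabs along axis 0, one group per node.
    # Each node transforms its own slabs along axes 1 and 2 locally.
    slabs = np.array_split(data, num_nodes, axis=0)
    stage1 = [np.fft.fft2(slab, axes=(1, 2)) for slab in slabs]

    # Step 2: global transpose. On a real network this is the costly
    # all-to-all exchange; here it is a concatenate and a re-split so
    # that every node now owns complete lines along axis 0.
    full = np.concatenate(stage1, axis=0)
    columns = np.array_split(full, num_nodes, axis=1)

    # Step 3: each node finishes with 1D FFTs along the remaining axis.
    stage2 = [np.fft.fft(col, axis=0) for col in columns]
    return np.concatenate(stage2, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 8, 8))
    # Compare the slab result with NumPy's direct 3D transform.
    print(np.allclose(slab_fft3d(x, 4), np.fft.fftn(x)))
```

In this sketch, requesting more nodes than N raises an error, which mirrors the scaling ceiling described above: with only N slabs available, any additional nodes would have no work assigned to them.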