1. Field of the Invention
The present invention relates in general to electromagnetic scattering calculations. More specifically, it relates to an efficient and time-saving apparatus and method for using parallel multi-processors and the Fast Multipole Method (FMM) to perform electromagnetic scattering calculations.
2. Description of Related Art
Much of the computing world is moving from sequential, single processor configurations to parallel processor configurations. One example of a parallel processor configuration 10 is shown in FIG. 1. As illustrated, the configuration 10 includes a plurality of asynchronously executing processors 11 in communication with one another via a communications network 12. Each processor 11 generally includes at least a CPU 14 and a memory 16. In performing "parallel processing", the software functions or processes that are associated with a particular task and/or computation are divided and distributed among the various processors 11 connected via the network 12. Additionally, the particular task or computation may require that the processors 11 share information with each other via the network 12.
Parallel multi-processor networks may be embodied in a variety of other configurations. For example, FIGS. 2a and 2b illustrate a "mesh" configuration 17 and a "fully connected" configuration 18, respectively. In a six-processor mesh configuration 17 any processor 11 is at most three "hops" from any other processor 11. The fully connected network 18 shown in FIG. 2b places every processor 11 in direct communication with every other processor 11, and within one "hop" of every other processor 11. The number of hops that separates the processors is directly related to the ease with which information is shared among the processors. Reducing the number of hops in a given communication path generally shortens the communication time for that path.
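By way of illustration only, the hop counts stated above can be checked with a short breadth-first search. The sketch assumes that the six-processor mesh of FIG. 2a is laid out as two rows of three processors; the function names `mesh_links`, `fully_connected_links` and `max_hops` are hypothetical and are not taken from the figures.

```python
from collections import deque


def mesh_links(rows, cols):
    """Adjacency map for a rows x cols mesh; processors are numbered row by row."""
    links = {r * cols + c: set() for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:  # link to the right-hand neighbour
                links[node].add(node + 1)
                links[node + 1].add(node)
            if r + 1 < rows:  # link to the neighbour in the next row
                links[node].add(node + cols)
                links[node + cols].add(node)
    return links


def fully_connected_links(count):
    """Every processor is linked directly to every other processor."""
    return {p: set(range(count)) - {p} for p in range(count)}


def max_hops(links):
    """Largest shortest-path length, in hops, between any two processors."""
    worst = 0
    for start in links:
        hops = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search outward from `start`
            node = queue.popleft()
            for neighbour in links[node]:
                if neighbour not in hops:
                    hops[neighbour] = hops[node] + 1
                    queue.append(neighbour)
        worst = max(worst, max(hops.values()))
    return worst
```

Under the assumed two-by-three layout, `max_hops(mesh_links(2, 3))` evaluates to 3 and `max_hops(fully_connected_links(6))` evaluates to 1, in agreement with the hop counts given for configurations 17 and 18.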
The general goal of parallel multi-processor networks is to require less processing time than a single processor or non-parallel processors. Adding more processors to a parallel multi-processor network adds computing power, which, in theory, should reduce the overall execution time for completing tasks. However, for computations that require the processors to share information, the time spent on distributing shared information to the processors can reduce, and in many cases cancel, whatever time savings may have been achieved by adding processors to the network.
The "speedup" value is a measurement of the effectiveness of adding processors to a parallel multi-processor network. As shown below in Equation 1, speedup is the ratio of the time it takes to execute an algorithm for a problem of size N on one processor to the time it takes to execute the same algorithm for a problem of size N on P processors, where T(N, P) denotes the execution time for a problem of size N on P processors.
\[\text{speedup}(N, P) = \frac{T(N, 1)}{T(N, P)} \qquad \text{(Equation 1)}\]
If the above ratio is approximately P, the algorithm is referred to as scalable. In other words, the execution time falls in inverse proportion to the number of processors. FIGS. 3a and 3b illustrate scalable and non-scalable systems, respectively. However, for computations that require the parallel processors to share information, the time spent on distributing the shared information to the processors makes it difficult to achieve the scalability illustrated in FIG. 3a.
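As a minimal sketch of the speedup ratio and of how communication time can erode it, the following example uses a purely hypothetical cost model in which the work divides evenly among the processors and the communication time grows linearly with the number of processors. The function `modelled_time` and its parameter `comm_per_processor` are assumptions made for illustration and do not describe any particular network.

```python
def speedup(time_one_processor, time_p_processors):
    """Ratio of single-processor time to P-processor time for the same problem."""
    return time_one_processor / time_p_processors


def modelled_time(work, processors, comm_per_processor=0.0):
    """Hypothetical model: evenly divided work plus communication that grows with P."""
    return work / processors + comm_per_processor * (processors - 1)


# Without communication cost the model is scalable: ten processors give a speedup of 10.
ideal = speedup(modelled_time(1000, 1), modelled_time(1000, 10))

# With a communication cost of 10 time units per additional processor,
# one hundred processors take as long as one, so the speedup falls to 1.
eroded = speedup(modelled_time(1000, 1, 10), modelled_time(1000, 100, 10))
```

In this model the second case shows the time spent sharing information cancelling the benefit of the added processors entirely.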
When electromagnetic energy is incident on a conductive material, surface currents are induced on the object being irradiated. These surface currents radiate electromagnetic energy which can be detected and measured. This is the basic principle by which radar detection systems work.
Referring to FIG. 4, an object, referred to as a scatterer S, is shown. The scatterer S represents any conductive object that can be irradiated with plane waves of electromagnetic energy. \(P_1\) to \(P_N\) represent N different plane waves that irradiate the scatterer. \(Q_1\) to \(Q_N\) represent radiated energy from the scatterer resulting from the surface currents induced on the object by the irradiating plane waves. This energy from the scatterer S is also known as scatter or scattering amplitude. The scattering amplitude is the ratio of the reflected (or scattered) energy to the incident energy. For many applications it is necessary to irradiate the scatterer S from many different angles on many different sides, often referred to as look angles or right hand sides, when measuring scattering amplitude.
Scattering amplitude measurements are useful in a variety of applications such as circuit and antenna modeling. However, the most common use of scatter is in radar systems. Radar systems rely on the principle that all conductive objects re-radiate incident electromagnetic energy in patterns that are unique for that object's particular size, shape and other physical characteristics. In other words, all unique objects possess unique "electromagnetic fingerprints". Once the scatter information of an object has been thoroughly determined in all directions, this information can be used to identify that object. Many defense-related aircraft are designed to produce as little scatter as possible in order to minimize the likelihood of the aircraft being detected by an enemy radar system.
The electromagnetic scatter of an object can be determined by empirical methods or by simulation/computation methods. Empirical methods can be costly and time consuming to implement. For example, making scatter measurements is time consuming because the scatterer must be characterized in both the horizontal and the vertical planes. In other words, the scattering characteristics of the scatterer must be known from any point in space surrounding the object. Additionally, empirical scatter measurements are typically made from an actual scatterer by placing the scatterer on a table that rotates during the measurements. When the scatterer is a large and/or expensive object such as an aircraft, it is very expensive to conduct such measurements for every form of scatterer under consideration.
Computational methods use mathematical models to determine the scattering characteristics of a given object, thus eliminating the need to actually construct the object. In general, computational methods resolve an irradiating source into surface currents on the scatterer. Once the surface currents are known, the re-radiated field from the scatterer can be computed. Conventional computational methods break the scatterer into N small pieces (unknowns) for computational analysis. Note that a higher number of unknowns results in a more precise simulation. FIG. 5 illustrates a scatterer \(S_0\) broken into N=14 different components. Equation 2 forms the fundamental basis for known computational methods.
\[ZI = V \qquad \text{(Equation 2)}\]
The variable V is given by the characteristics of the incident electromagnetic plane wave.
Equation 2 also relies on the impedance matrix representation Z of the scatterer \(S_0\). The impedance matrix Z is determined, through known methods, by the geometry and the material composition of the scatterer. Note that the size of Z is directly related to the number of sections N that the scatterer has been broken into. Specifically, Z is an N by N matrix and therefore has \(N^2\) entries. Based on the above development, the only tasks that remain are the determination of the current vector I and of the re-radiated energy, which follows directly from V and I.
A first method of determining I is to simply solve Equation 2 for I. This requires the calculation of \(Z^{-1}\) as shown in Equation 3.
\[I = Z^{-1} V \qquad \text{(Equation 3)}\]
The calculation of \(Z^{-1}\) requires on the order of \(N^3\) calculations. Because the number of calculations directly determines the time it takes to make them, \(N^3\) calculations can be referred to as taking \(N^3\) time. This method works for sufficiently small N values. However, most practical simulations require N values on the order of 100,000 or larger. With N=100,000 unknowns, massive amounts of computing power and memory are necessary. Thus, larger N values often require the use of an expensive super-computer to calculate \(Z^{-1}\). One of the largest N values ever used to calculate \(Z^{-1}\) was N=225,000. The computation resources used for this operation cost about $20 million to $30 million.
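The cubic cost and the memory demand of the direct approach can be illustrated with a small sketch. It solves Equation 2 by Gaussian elimination with partial pivoting, which is a substitute technique chosen here for brevity: it does not form the inverse matrix explicitly but has the same cubic scaling. The function names and the figure of 16 bytes per matrix entry (a double-precision complex number) are assumptions made for illustration.

```python
def solve_direct(z, v):
    """Solve Z I = V by Gaussian elimination with partial pivoting (about N**3 steps)."""
    n = len(v)
    # Work on an augmented copy [Z | V] so the caller's data is left untouched.
    a = [list(row) + [rhs] for row, rhs in zip(z, v)]
    for col in range(n):
        # Choose the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            raise ValueError("impedance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        # Eliminate the column from every row below the pivot row.
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    # Back-substitution recovers the current vector I from the triangular system.
    current = [0j] * n
    for row in range(n - 1, -1, -1):
        known = sum(a[row][k] * current[k] for k in range(row + 1, n))
        current[row] = (a[row][n] - known) / a[row][row]
    return current


def dense_storage_bytes(n, bytes_per_entry=16):
    """Memory needed to hold a dense N x N matrix (assumed 16 bytes per entry)."""
    return n * n * bytes_per_entry
```

Under the stated assumption, `dense_storage_bytes(100_000)` returns 160,000,000,000, that is, about 160 gigabytes for the impedance matrix alone when N=100,000.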
An alternative to calculating the inverted Z matrix is to iteratively solve Equation 4 by successively adjusting a trial current vector I'. An initial value for I' is chosen, and, if the results of the calculation do not satisfy Equation 4, the next value of I' is determined by known methods. This iterative solution continues until \(V - ZI'\) is acceptably close to 0.
\[V - ZI' = 0 \qquad \text{(Equation 4)}\]
This method requires \(N^2\) calculations per iteration in I'. Theoretically, N iterations in I' are required to make \(V - ZI'\) exactly 0. Thus, this method yields an exact result only after \(N^3\) calculations (\(N^2\) calculations per iteration and N iterations). In practice, however, fewer than N iterations in I' are needed in order to obtain reasonable accuracy. Accordingly, this alternative method, in practice, provides some computational advantage over the method that relies on inverting the Z matrix.
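The iterative approach can be sketched as follows. The text above leaves the update rule to known methods, so the conjugate gradient method is used here purely as a stand-in; it assumes a real, symmetric, positive-definite matrix, which an actual impedance matrix generally is not. The function names are hypothetical. The sketch shows the two properties described above: one matrix-vector product of \(N^2\) multiplications per iteration, and at most N iterations in exact arithmetic.

```python
def mat_vec(z, x):
    """Dense matrix-vector product: N**2 multiplications, the main cost per iteration."""
    return [sum(z_ij * x_j for z_ij, x_j in zip(row, x)) for row in z]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve_iterative(z, v, tolerance=1e-12, max_iterations=None):
    """Conjugate gradient stand-in; assumes Z is real, symmetric and positive definite."""
    n = len(v)
    if max_iterations is None:
        max_iterations = n              # exact arithmetic needs at most N iterations
    current = [0.0] * n                 # initial trial vector I'
    residual = list(v)                  # V - Z I' for I' = 0
    direction = list(residual)
    rr = dot(residual, residual)
    iterations = 0
    while iterations < max_iterations and rr ** 0.5 > tolerance:
        z_dir = mat_vec(z, direction)   # the single N**2 step of this iteration
        step = rr / dot(direction, z_dir)
        current = [c + step * d for c, d in zip(current, direction)]
        residual = [r - step * zd for r, zd in zip(residual, z_dir)]
        rr_next = dot(residual, residual)
        direction = [r + (rr_next / rr) * d for r, d in zip(residual, direction)]
        rr = rr_next
        iterations += 1
    return current, iterations
```

Raising `tolerance` stops the loop earlier, which corresponds to accepting a residual \(V - ZI'\) that is only acceptably close to 0.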
A further reduction in time can be achieved by using the Fast Multipole Method (FMM), which requires \(N^{3/2}\) time per iteration instead of \(N^2\) time. FMM uses a scatterer segmentation method similar to that shown in FIG. 5. However, as shown in FIG. 6, FMM places the N small pieces (unknowns) into M groups, preferably with \(M \propto N^{1/2}\). For notational purposes, a particular section (element) of the scatterer which was referred to as "n" in FIG. 5 is now referred to by a group number and an element number within that group, (m, a). For example, n=9 in FIG. 5 corresponds to (2, 3) in FIG. 6.
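To give a sense of scale, the per-iteration operation counts quoted above can be compared directly. The sketch below ignores all constant factors and takes the proportionality constant between M and the square root of N to be one; both are simplifying assumptions, and the function name is hypothetical.

```python
import math


def per_iteration_costs(n):
    """Leading-order operation counts per iteration, with constant factors ignored."""
    return {
        "groups": round(math.sqrt(n)),  # M taken equal to N**(1/2) for illustration
        "dense": n ** 2,                # ordinary matrix-vector product
        "fmm": round(n ** 1.5),         # Fast Multipole Method
    }


costs = per_iteration_costs(10_000)
ratio = costs["dense"] / costs["fmm"]   # equals N**(1/2), here 100
```

The ratio of the two counts is the square root of N, so for N=100,000 the FMM count is smaller by a factor of roughly 316 under these assumptions.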
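Collecting the scatterer elements into groups allows for the decomposition of the impedance matrix Z into sparse components Z', V, and T such that Equation 5 holds. In Equation 5, V denotes a matrix with elements \(V_{nk}\) and is distinct from the excitation vector V of Equation 2.
\[Z \approx Z' + V T V^{\dagger} \qquad \text{(Equation 5)}\]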
Fields due to currents within a group m are tabulated at K points, or far-field directions, on a unit sphere. In Equation 5, T is the translation operator, which serves to translate tabulated fields between groups that are far away. At a given group, the translated fields from all far away groups are summed and used to compute the interactions on the current elements within the given group. For interactions due to groups nearby a given group, a sparse portion Z' of the regular impedance matrix is used. The variables used in Equation 5 can all be calculated from Equations 6-8.
##EQU2##
In Equation 6, the number inserted for the variable L depends on the desired accuracy, k is the unit vector in the kth far-field direction, \(X_{mm'}\) is the unit vector from the center of the group m to group m', and \(h_l^{(1)}\) and \(P_l\) are the Hankel function and the Legendre polynomial, respectively, of order l.
\[V_{nk} = e^{i k \cdot (x_n - X_m)} \qquad \text{(Equation 7)}\]
\[Z'_{nn'} = G(r) \qquad \text{(Equation 8)}\]
In Equations 7 and 8, \(x_n\) is the location of the unknown n (see FIG. 6), \(X_m\) is the center of the group m that contains n, \(r = |x_n - x_{n'}|\) for unknowns n, n' in nearby groups, and G(r) is the Green function.
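A minimal sketch of how the entries of Equations 7 and 8 might be evaluated is given below. Two points are assumptions rather than statements of the text: the `wavenumber` parameter, which defaults to 1 so that the exponent matches Equation 7 as written, and the form chosen for G(r), namely the free-space Helmholtz kernel, since the text names the Green function without stating it. The function names are hypothetical, and the case r=0 is not handled.

```python
import cmath
import math


def far_field_factor(direction, x_n, group_center, wavenumber=1.0):
    """V_nk = exp(i k . (x_n - X_m)) for a unit far-field direction k (Equation 7)."""
    offset = [a - b for a, b in zip(x_n, group_center)]
    phase = wavenumber * sum(d * o for d, o in zip(direction, offset))
    return cmath.exp(1j * phase)


def green_function(x_n, x_n_prime, wavenumber=1.0):
    """Assumed kernel exp(i k r) / (4 pi r); the text only names G(r) (Equation 8)."""
    r = math.dist(x_n, x_n_prime)  # r = |x_n - x_n'|, must be non-zero
    return cmath.exp(1j * wavenumber * r) / (4.0 * math.pi * r)
```

Each value of `far_field_factor` is a pure phase of unit magnitude, which is why the exponential evaluation, rather than the subsequent multiplications, dominates the cost of forming V.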
Additional details regarding FMM can be found in the following documents: R. F. Harrington, Field Computation By Moment Methods, Macmillan, New York, N.Y., 1968; R. Coifman, V. Rokhlin, and S. Wandzura, The Fast Multipole Method: A Pedestrian Prescription, IEEE Antennas and Propagation Society Magazine, June 1993, pages 7-12; V. Rokhlin, Rapid Solution Of Integral Equations Of Classical Potential Theory, J. Comp. Phys., 60 (1985), pages 187-207; M. A. Stalzer, Parallelizing The Fast Multipole Method For The Helmholtz Equation, Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, February 1995, pages 325-330; and M. A. Stalzer, A Parallel Fast Multipole Method For The Helmholtz Equation, Parallel Processing Letters, 5 (1995), pages 263-274. The entire disclosure of each of the above-listed documents is incorporated herein by reference.
A substantial amount of memory can be saved by calculating the components of \(Z' + V T V^{\dagger}\) as needed to form the product ZI'. Computing an exponential function uses more computer resources than performing a simple multiplication. It is therefore desirable to compute a small piece of each component and then use it many times. This fits well with the solution of scattering problems because multiple look angles, or right hand sides, must often be calculated in order to provide a sufficiently thorough representation of the scatterer.
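The staged evaluation of the product can be sketched as follows. For brevity the sketch stores small dense matrices as nested lists and takes V to have N rows and K columns, whereas the components described above are sparse and are computed piece by piece as needed. The function names are hypothetical.

```python
def mat_vec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def conjugate_transpose(matrix):
    return [[entry.conjugate() for entry in column] for column in zip(*matrix)]


def apply_fmm_product(z_near, v, t, current):
    """Return Z'I + V (T (V^dagger I)) without ever forming the N x N matrix Z."""
    outgoing = mat_vec(conjugate_transpose(v), current)  # fields tabulated in K directions
    incoming = mat_vec(t, outgoing)                      # translated between far groups
    far = mat_vec(v, incoming)                           # evaluated at each unknown
    near = mat_vec(z_near, current)                      # nearby-group interactions
    return [a + b for a, b in zip(near, far)]
```

Because `z_near`, `v` and `t` do not depend on the vector being multiplied, the same pieces can be reused for every look angle, so the cost of each exponential evaluation is spread over many right hand sides.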
Because scattering calculations are desired for very large scatterers, there is a need to couple the FMM for electromagnetic calculations with the enhanced speed of scalable parallel processing. Providing a scalable parallel processing algorithm for the FMM would allow scattering calculations to be executed on a large number of available processors. Accordingly, the present invention is directed at providing a method and apparatus that performs FMM on a scalable multi-processor network.