A. Field of the Invention
The present invention relates generally to hardware accelerators, and, more particularly, to components and methods for facilitating implementation of a finite-difference time-domain (FDTD) hardware accelerator.
B. Description of the Related Art
No longer relegated to radio-frequency (RF) engineers, antenna designers, and military applications, electromagnetic analysis has become a key factor in many areas of advanced technology. From personal computers (PCs) with processor speeds approaching three (3) gigahertz (GHz) and wireless computer networks, to personal digital assistants (PDAs) with Internet capabilities and the seemingly ubiquitous cell phone, it seems that almost every electronic design now requires electromagnetic characterization. To facilitate this analysis, numerical techniques have been developed that allow computers to easily solve Maxwell's equations.
Maxwell's equations are a system of coupled, differential equations:

$$\nabla \cdot D = q_{ev}, \qquad \nabla \cdot B = q_{mv},$$

$$\nabla \times H = J_i + \sigma E + \frac{\partial D}{\partial t},$$

$$\nabla \times E = M_i - \frac{\partial B}{\partial t}.$$

As such, they can be represented in difference form, thus allowing their numerical solution. To see this, recall that the definition of the derivative is:

$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$
Implementing both temporal and spatial derivatives of Maxwell's equations in difference form produces the numerical technique known as the finite-difference time-domain (FDTD) method. In this approach, a region of interest is sampled to generate a grid of points, hereinafter referred to as a “mesh.” The discretized forms of Maxwell's equations are then solved at each point in the mesh to determine the associated electromagnetic fields.
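As an illustration of the difference form, consider a simplified one-dimensional case. This is a sketch under assumptions not stated above: a source-free, lossless medium (J_i = 0, M_i = 0, σ = 0), D = εE and B = μH with constant ε and μ, fields E_x and H_y that vary only along z, and a staggered (leapfrog) sampling in which k indexes space with step Δz and n indexes time with step Δt. Replacing each derivative with a difference quotient over a finite step, rather than taking the limit, gives update expressions of the following form:

```latex
% Continuous one-dimensional curl equations (assumed simplification)
\frac{\partial E_x}{\partial t} = -\frac{1}{\varepsilon}\,\frac{\partial H_y}{\partial z},
\qquad
\frac{\partial H_y}{\partial t} = -\frac{1}{\mu}\,\frac{\partial E_x}{\partial z}

% Difference form on a staggered mesh (assumed sampling)
H_y^{\,n+1/2}\!\left(k+\tfrac{1}{2}\right)
  = H_y^{\,n-1/2}\!\left(k+\tfrac{1}{2}\right)
  - \frac{\Delta t}{\mu\,\Delta z}\left[E_x^{\,n}(k+1) - E_x^{\,n}(k)\right]

E_x^{\,n+1}(k)
  = E_x^{\,n}(k)
  - \frac{\Delta t}{\varepsilon\,\Delta z}\left[H_y^{\,n+1/2}\!\left(k+\tfrac{1}{2}\right) - H_y^{\,n+1/2}\!\left(k-\tfrac{1}{2}\right)\right]
```

Each new field value depends only on stored values at neighboring mesh points, which is why the method reduces to repeated arithmetic over the mesh.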
Although FDTD methods are accurate and well defined, current computer-system technology limits the speed at which these operations can be performed. Run times on the order of hours, weeks, months, or longer are common when solving problems of realistic size. Some problems are even too large to be effectively solved due to practical time and memory constraints. The slow nature of the algorithm primarily results from the nested for-loops that are required to iterate over the three spatial dimensions and time.
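The loop structure described above can be sketched in software as follows. This is a minimal illustration, not a complete solver: it updates only one field component (E_x) in a source-free, lossless region, and the names `make_field`, `update_ex`, `run`, `coeff_y`, and `coeff_z` are hypothetical, introduced here for illustration only.

```python
def make_field(nx, ny, nz, value=0.0):
    """Allocate an nx-by-ny-by-nz field as nested lists (hypothetical helper)."""
    return [[[value for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]


def update_ex(ex, hy, hz, coeff_y, coeff_z):
    """Advance E_x by one time step over the whole mesh, in place.

    coeff_y and coeff_z stand for dt/(eps*dy) and dt/(eps*dz),
    assumed uniform across the mesh for this sketch.
    """
    nx, ny, nz = len(ex), len(ex[0]), len(ex[0][0])
    for i in range(nx):              # x dimension
        for j in range(1, ny):       # y dimension (needs neighbor j - 1)
            for k in range(1, nz):   # z dimension (needs neighbor k - 1)
                ex[i][j][k] += (coeff_y * (hz[i][j][k] - hz[i][j - 1][k])
                                - coeff_z * (hy[i][j][k] - hy[i][j][k - 1]))


def run(ex, hy, hz, coeff_y, coeff_z, n_steps):
    """Outer time loop; returns the number of node updates performed."""
    nx, ny, nz = len(ex), len(ex[0]), len(ex[0][0])
    for _ in range(n_steps):         # time
        update_ex(ex, hy, hz, coeff_y, coeff_z)
        # A full solver would also update E_y, E_z, H_x, H_y, and H_z here.
    return n_steps * nx * (ny - 1) * (nz - 1)
```

The work grows with the product of the three mesh dimensions and the number of time steps, and a full solver repeats a similar sweep for each of the six field components.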
To shorten the computational time, people acquire faster computers, lease time on supercomputers, or build clusters of computers to gain a parallel processing speedup. These solutions can be prohibitively expensive and frequently impractical. As a result, there is a need in the art to increase the speed of the FDTD method in a relatively inexpensive and practical way. To this end, people have suggested that an FDTD accelerator, i.e., special-purpose hardware that implements the FDTD method, be used to speed up the computations. (See, e.g., J. R. Marek, An Investigation of a Design for a Finite-Difference Time Domain (FDTD) Hardware Accelerator, Air Force Inst. of Tech., Wright-Patterson AFB, M.S. Thesis (1991); J. R. Marek et al., A Dedicated VLSI Architecture for Finite-Difference Time Domain Calculations, presented at The 8th Annual Review of Progress in Applied Computational Electromagnetics, Naval Postgraduate School (Monterey, Calif. 1992); R. N. Schneider et al., Application of FPGA Technology to Accelerate the Finite-Difference Time-Domain (FDTD) Method, presented at The 10th ACM Int'l Symposium on Field-Programmable Gate Arrays (Monterey, Calif. 2002); and P. Placidi et al., A Custom VLSI Architecture for the Solution of FDTD Equations, IEICE Trans. Electron., vol. E85-C, No. 3, pp. 572–577 (March 2002).) Although limited success in developing hardware-based FDTD solvers has been shown, the related art still needs a practical, hardware-based solver. There are several reasons for this.
First, the conventional FDTD algorithm contains several distinct regions that require different mathematical expressions. These include the normal FDTD space, the absorbing boundary region, and the incident source condition. For software implementations, these regions are relatively simple to incorporate into a solver. Unfortunately, hardware designs are most efficient when they are asked to perform only one task. Incorporating functionality within a hardware accelerator to detect special regions and handle them differently increases hardware logic and slows the overall hardware design.
Second, a hardware-based FDTD solver puts great demands on the underlying memory architecture. Every node in the mesh requires storage of at least three electric fields and at least three magnetic fields, with each field being thirty-two (32) or sixty-four (64) bits. Also, every node has associated material parameters, including three permittivities, three permeabilities, and three conductivities. Again, these are likely to be thirty-two (32)- or sixty-four (64)-bit numbers. For a ten (10) million-node mesh, this requires several gigabytes of memory. Storage of the fields, however, is not the only memory concern. Just as important is the time to retrieve or fetch data from memory and to write updated results back to memory. This latency presents serious problems to practical hardware implementations.
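The per-node storage arithmetic can be sketched as follows. The function name `mesh_storage_bytes` and its parameters are hypothetical, and the sketch counts only the minimum quantities listed above (six field values and nine material parameters per node), so the result is a floor rather than a full estimate.

```python
def mesh_storage_bytes(n_nodes, bytes_per_value,
                       fields_per_node=6, materials_per_node=9):
    """Minimum storage for a mesh, counting only the values listed above."""
    return n_nodes * (fields_per_node + materials_per_node) * bytes_per_value


# Ten-million-node mesh with 32-bit and 64-bit values.
for bits in (32, 64):
    total = mesh_storage_bytes(10_000_000, bits // 8)
    print(f"{bits}-bit values: {total / 1e9:.1f} GB")
```

Under these minimum counts the floor is roughly 0.6 GB for 32-bit values and 1.2 GB for 64-bit values. Any per-node quantities beyond the minimum counts listed above raise the total accordingly.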
Third, floating-point operations are desirable to maximize precision and to minimize numerical dispersion. However, floating-point operations tend to be slow and require specialized hardware.
Finally, hardware system control can be a daunting task. Millions of field components are constantly being retrieved or fetched from memory and updated field values are continuously being written back. For maximum throughput, it is desired to have a finely-tuned system in which all components are working together quickly and efficiently. However, the hardware control architecture necessary to oversee this can be very complex.
Thus, there is a need in the art to overcome these limitations, and to provide for practical hardware-based FDTD solvers.