Molecular dynamics simulations of biological macromolecules 1, 2! are computationally demanding, owing to the large number of particles, as well as the complex nature of their associated interactions. There are generally two approaches to simplify this enormous computational problem: one is to use simpler physical models for the protein, the other is to develop more efficient theories or numerical methods without modification of the physical model. Focusing on the second approach, applicants have developed numerically more efficient methods to reduce the computational burden.
The most time-consuming part of the simulation is the calculation of the long-range pairwise Coulombic forces. The computational complexity for these Coulombic interactions scales as O(N.sup.2) , where N is the number of particles, and O is the order. Since the interaction decays with distance as O(r.sup.-1), it is normally not appropriate to use short cutoffs for electrostatic interactions, especially for large proteins where the long-range effects of inhomogeneous charge distributions are thought to be important 3, 4!.
The second most time-consuming part of the simulation arises from the small time-steps required to accurately solve the equation of motion for the stiff bond stretch and bond bending vibrations, even though one may be primarily interested in events that occur over a much longer time scale. In standard numerical integration methods, such as those of the Stormer-Verlet variety 5, 6!, one is generally required to use a time-step smaller than one femtosecond in order to maintain an acceptable level of accuracy in the integration of the equations of motion.
There have been a number of efforts to solve the first problem, i.e., to reduce the computational complexity for the electrostatic interactions. Hockney et al. 7! proposed a particle/mesh method to address this problem by using a mesh over the computational domain. The source density is interpolated at the mesh points, then the Poisson equation is solved to obtain the potential values on the mesh, and the forces are computed from the potential. The computational complexity is thus reduced from O(N.sup.2) to O(NlogN). The mesh provides only limited resolution and highly nonuniform source distributions cause significant errors. To improve accuracy, a particle-particle/particle-mesh method was developed 8!, in which the potential arising from short-range interactions is calculated directly. Appel 9! and Barnes et al. 10! developed a "tree code" method in place of the grid methods, in which the computational region is organized into a tree structure. The tree structure makes it possible to develop a rapid systematic procedure to determine which particles are "distant" from each other. By exploiting the fact that a particle interacts with a distant group of particles much as if it were interacting with a single particle at the center of mass (monopole) of the distant group, the complexity is also reduced to O(NlogN), although there is again some loss of accuracy by using the monopole approximation.
Greengard and Rokhlin 11! developed the Fast Multipole Method (FMM) based on the "tree code" idea, but with higher order multipoles in addition to the simple monopole approximation. The FMM method first organizes multipole representations of charge distributions in hierarchically structured boxes, then transforms those multipoles into local field expansions, so that each particle interacts with the local field generated from distant particles. The multipoles are generated by direct calculation in the lowest level and successive shifting from lower levels to upper levels. This invention describes a top-down recursive method to generate the multipoles in the hierarchical box tree, in which the multipoles are calculated recursively from the top of the tree instead of from the bottom as in Greengard's FMM method. At each level of the tree, the method first looks for charged particles in every box. If there is no charged particles in a particular box, then the multipoles and local field expansions for that box and all its subdivided boxes are assigned automatically to be zero without any further calculations. This is more efficient for nonuniform or noncubic systems, such as proteins.
A variety of techniques have also been introduced to address the second problem, i.e. to increase the time-step in molecular dynamics (MD) simulations. One common approach is to constrain bond lengths using either the SHAKE or RATTLE methods 12, 13!. Although application of these methods allow for a modest increase in time-step, time dependent quantities may be affected 14, 15!. Additionally, the constraint methods have been shown to work poorly for bond angle degrees of freedom when applied to macromolecules 14!.
Another approach to increase the time-step in MD simulations is that of the multiple time-step methods 16!. These methods are based upon integration schemes that allow for different time-steps according to how rapidly a given type of interaction is evolving in time. Teleman and Jonsson 17! introduced an method whereby the slower degrees of freedom are held constant for a number of smaller time-steps, which is usually called the long-range constant force approximation. This method has, however, been shown to lead to the accumulation of numerical error in calculated quantities 18, 19!. Alternatively, Swindoll and Haile 20! introduced a procedure which uses a Taylor series approximation for the less rapidly evolving forces. Although this method has been shown to give some improvement in central processing unit (CPU) times for simple systems such as alkane chain liquids 20!, it is not yet evident whether it would be computationally advantageous in the case of macromolecules. This invention describes a multiple time-step method specifically for macromolecular simulation which uses a combination of time-steps of different lengths to integrate interactions which evolve on different time scales. The method is essentially a generalization of the reversible Reference System Propagation Approach (r-RESPA) 21!, which employs a Trotter factorization 22! of the classical Liouville operator as a means to derive a numerical propagation scheme for the system. This r-RESPA scheme is a time reversible, symplectic (measure conserving in phase space), and highly stable integrator. This approach has been shown to be considerably more efficient than standard techniques when applied to simple systems or small proteins 19, 21, 23, 24!.