1. Field of the Invention
This invention relates generally to providing a method and system to facilitate molecular modeling, and more specifically to providing a method and system to approximate the electrostatics and electrodynamics functions used in molecular modeling software.
2. Description of the Prior Art
Significant advances have been made in predicting, studying, and quantifying the nature of interatomic and intermolecular interactions with the use of computer models. Although such computer models can be useful in testing and validating a theory concerning the nature of these interactions, computer models find especially useful application in reducing the time required to develop new materials with desirable properties. Materials that may be developed with additional efficiency through the use of computer models include polymers, pesticides, herbicides, pharmaceuticals, semiconductor materials for integrated circuits, and petrochemicals.
Several modeling techniques are used to study chemical systems. These techniques are referred to as simulations, models, methods, approximations, and so forth. Here all techniques are referred to as models. Typically, the model selected provides a user of the model with a particular compromise between physical accuracy and the computing resources required to run the model. Quantum mechanical calculations can be performed with a high degree of accuracy, but are very expensive in terms of the computer time required to perform them. In those fields described above, where new polymers, drugs, and other products are being developed, it is more useful to use modeling techniques that requires less investment in computing resources, so that more candidate materials or molecules can be analyzed in a shorter time period for the properties desired.
Force Fields
A general class of models called force fields is frequently used. A force field uses analytical classical equations to describe molecules, e.g., the potential energy of a diatomic molecule can be modeled with a harmonic equation. Occasionally non-analytic terms are included, e.g., induced polarization terms. These functions are significantly simpler computationally than performing the more correct and mathematically complex quantum mechanical calculations.
Force fields are developed to reproduce experimental data or calculation results, generally quantum mechanics calculations. The selection of appropriate parameters and force field functional dependencies is a significant factor in the success of the model. Furthermore, the number of fitted parameters used in the model relative to the number of measurable or calculable values relevant to the system being modeled should be as low as possible to minimize the possibility of chance correlations.
New force field functions are continually under development to increase agreement with experiment and quantum results, to maximize the number of observables relative to the number of parameters, and to limit the necessity of performing computationally expensive calculations.
The potential energy function is the energy of a set of atoms as a function of atomic coordinates. One class of potential energy functions is the force field function. For any chosen force field function and a given set of atoms, the location of the atoms, the types of atoms, and the bonds between the atoms define a potential energy field. For a given set of atomic coordinates (configuration) the energy is well defined. On rare occasions there is bond breaking and bond making, generally explicitly defined by the user. The potential energy field is usually expressed as a sum of terms, with each term representing a particular class of interaction.
Most force fields include terms that represent bonded interactions such as bond stretching and contraction, angle stretching and contraction, rotation about a dihedral bond, out-of-plane deformation, cross terms of the above and any other term or parameter that the method developer considers to be significant. These interactions are sometimes referred to as intramolecular interactions.
Non-bonded, or intermolecular interaction potential energy generally includes a van der Waals interaction and a model using electrostatic interactions. The van der Waals (vdW) term may take a variety of forms. In general, vdW terms are mildly attractive at large separations and become very repulsive at small atomic separations. Different known parameterizations of the van der Waals term, and intramolecular force parameters, may be used.
A general expression for a potential energy field generally used for molecular modeling may thus be described as follows:VE=intramolecular terms+van der Waals+electrostatic terms   (1)The energy, VE, is the value of the potential energy field for a particular configuration of atoms. Forces acting on the atoms are determined by calculating the gradient of the potential energy field at the atomic coordinates determined by the configuration.Charge Model
The electrostatic term is calculated using charges from a charge model. The charge model is generally developed for a set of atoms such as a molecule (composed of electrons and nuclei), an amino acid, a nucleic acid, a cell in a regular array, an aggregate, or other group of atoms (henceforth any said set of atoms is called a substructure).
The probability of electron position is provided by the quantum mechanics wavefunction, which is the solution to the Shroedinger Equation. This probability is generally considered to describe a charge distribution. Chemists frequently approximate this charge distribution with a classical description composed of point charges, dipoles, quadrupoles, and higher multipoles. The choice of charges and the methodology used to determine the charges is called a charge model.
Different techniques have been used to develop charge models. For example, the method pioneered by Kollmann et al. is disclosed in “Force Fields for Protein Simulations,” by J. W. Ponder and D. A. Case in Adv. Prot. Chem., vol. 66, pp. 27-85 (2003); and in “Molecular Dynamics Simulation of Nucleic Acids: Successes, Limitations, and Promise,” by T. E. Cheatham, III and M. A. Young in Biopolymers, vol. 56, pp. 232-256 (2001); which are hereby incorporated by reference. This method uses point charges placed at atom centers to reproduce the electrostatic field around the molecule as calculated by placing test charges in the molecule's periphery in a quantum mechanics calculation. In other methods the charge model reproduces other properties, such as hydrogen bond strength. In addition there are programs that assign charges based on the bonding structure of the molecule. For the purposes of this patent an atomistic charge model is a charge model that is any combination of the following: (1) assigns fractional electron charges to the vast majority of atoms in a chemical system, (2) assigns dipoles to the vast majority of the atoms in a chemical system, (3) assigns charges between the vast majority of atoms that are bonded, and (4) assigns dipoles between the vast majority of atoms that are bonded. Molecular modeling frequently uses this charge model to calculate the electrostatic potential between molecules and between different parts of the same molecule.
Dielectric Model
Molecular modeling simultaneously uses multiple independent models where the dielectric terms do not necessarily correspond with each other. The dielectric of a material is a macroscopic property. Poisson's equation describing electrostatics was developed to describe electrostatic interactions on a macroscopic scale, and the dielectric concept was incorporated into the equations to accommodate the dielectric properties of matter. However, there is no term for a dielectric in the Schroedinger Equation, which applies to a molecular scale. In the work of Kollman et al. noted above, the dielectric was not included in the classical equations. This has the effect of setting the classical dielectric to one, i.e., D=1. Alternately, a dielectric of 2 or 4 is frequently used for protein calculations.
There are various approaches to handle a dielectric when using classical electrostatic (macroscopic) equations to model molecules. The dielectric properties of the molecule are implicitly or explicitly incorporated into the model. When developing a charge model, the chosen dielectric is incorporated into the choice of partial charge, i.e., a charge is chosen which produces the desired properties given the dielectric used. In the charge model of Kollman et al., the charges create an electrostatic field about the molecule that approximates the quantum result given a dielectric of one. When there are two molecules, Coulomb's equation describes the interaction energy between them.
As an example of the prior art approach, consider the electrostatic potential between two charges qi and qj, at points i and j respectively, shown in FIG. 1 where qi is embedded in dielectric di 104, qj is embedded in dielectric dj, 108, and there is dielectric dbulk 106 in between di and dj. The electrostatic potential between qi and qj can be calculated in several steps. The electric field intensity, E, at point j due to qi is determined by calculating the electric field at point a given dielectric di, 104, calculating the change in E in traversing the dielectric dbulk 106 in going from point a to point b, and then calculating E in traversing the dielectric dj 108 from point b to point j. Electrostatic potential can then be calculated from E and the value of the charge qj.
Composite Charge Model
In the prior art, algorithms for calculation of intermolecular electrostatic potential were modeled by detailed calculation of charge-charge interactions.E=Σi>j(qiqj)/(Drij)   (2)In some situations a group of charges can be handled together. When the distance between the test point and the set of charges is sufficiently large, then the electrostatic field created by a set of charges is approximated with a function of the total charge, the total dipole, total quadrupole, and higher total charge multipoles. For example, the textbook “Electromagnetic Fields and Waves,” by Paul Lorrain and Dale Corson, 1970, (pp. 62, 65, and 70), which is hereby incorporated by reference, explains how to approximate an electrostatic potential. An excerpt of the approximation method follows.
“The potential due to an arbitrary charge distribution at a point outside the distribution such that r>r1 is the same as: (a) the potential of a point charge equal to the net charge of the distribution, plus (b) that of a point dipole with a dipole moment equal to the dipole moment of the distribution, plus (c) that of a point quadrupole with a quadrupole moment equal to that of the distribution, and so on, all located at the origin. where r1 is the maximal distance between the origin and a charge (distribution). The potential due to a dipole moment varies as 1/r2 and the potential due to a quadrupole moment varies as 1/r3.” The impact of this result is evident in the charges found in charge models.
The point charge model has produced individual atom-atom interactions that are large. For example, consider the potential calculated using the Amber 1998 [reference] parameter set. The potential between the two aliphatic terminal carbons in two amino acid Leucine molecules that are 45 Angstroms (A) apart is 10 kcal/mole. (i.e., QiQj/rij=300(0.3)(0.3)/45=10 kcal), which is large. From an experimental perspective, the electrostatic potential between the side chains in Leucine molecules that are 45A apart is negligible. The apparent contradiction is resolved by the composite charge result noted above; in a calculation the large 10 kcal/mole term is counterbalanced by the similar terms of opposite sign in neighboring atoms and the result is small.
The large number of charges generated by the charge model introduces further problems. There may be very large number of electrostatic interactions. For example, in a protein with 3000 atoms, there are ˜n2/2 interactions or ˜4,500,000 terms. In many applications, the electrostatic potential is sampled repeatedly and the computer time required becomes quite large. Significant effort has been expended to decrease the computer time. Three techniques are noted below.
Cutoff distances have been explored where all interactions from charges greater than an “R” distance apart are set to zero. “R” is called the cutoff distance. This produces a discontinuous function at the cutoff distance, and various means have been developed to smooth this function. While smoothing makes the function differentially continuous, the cutoff approach introduces large error into the calculations.
In U.S. Pat. No. 5,612,894, which is hereby incorporated by reference, Wertz discloses an alternative procedure to reduce the number of electrostatic calculations. This procedure determines the sensitivity (importance) of a desired quantity to a given interaction and restricts the calculation of interactions with low sensitivity. This method requires techniques to recalculate whether a given interaction is important and techniques to “remember” the important ones. The interactions are separated into classes of varying importance. In some cases terms that could be in the same category to provide the counterbalancing charge terms noted above will be in different classes.
In U.S. Pat. No. 5,915,230, which is hereby incorporated by reference, Berne et al. disclose an approximation for electrostatic potential at a distance from a set of charges. A three-dimensional space is divided into cubic volumes (boxes) and the charges inside a box are handled as a set. Unfortunately the division into boxes is artificial and counterbalancing charges can be in different boxes. Charge models depend on the canceling effects at large distances as noted above. Separating the opposite, canceling charges into different boxes creates computational artifacts in the functions and makes them less stable.
FIG. 2 illustrates a flowchart of a prior art method to calculated electrostatic energy. The method starts in operation 202. Operation 204 is next and includes determining all sets of two charges i and j. Operation 206 is next and includes calculating the distance rij between each set of charges i and j. Operation 208 is next and includes calculating the electrostatic energy between charge i and charge j. Operation 210 is next and includes adding the electrostatic energies from all sets of charges i and j. The method ends in operation 212.
As can be seen from the preceding discussion, the prior art methods for approximating potential energy functions are frequently not practical for complex molecular modeling involving many atoms. What is needed is a more accurate approximation of potential energy functions for molecular modeling, which will provide sufficient accuracy for the modeling of even complex groups of atoms.