The present invention relates to a method for calculating the binding free energy for interactions between biomolecules. More particularly, the present invention relates to a method that employs computational neural networks to discover quantum mechanical features of enzyme active site transition states, as well as quantum mechanical features required for binding of putative enzyme inhibitors. The present method is also applicable to discovering quantum mechanical features required for the binding of a potential ligand to a biological receptor. Computer-readable media may be encoded with information enabling the method of the present invention to be performed on a general-purpose computer.
Enzymatically catalyzed reactions are characterized by geometric and electrostatic distortions of a substrate molecule into a transition state. The formation and stabilization of these transition states by enzymes are accompanied by rate increases on the order of $10^{10}$ to $10^{15}$ times over the uncatalyzed reaction. It is thought that an enzyme binds the transition state of a substrate molecule more tightly than either the substrate or the product. As a result, chemically stable molecules that mimic the substrate transition state should be potent inhibitors of enzyme activity.
The de novo design of transition state inhibitors requires accurate models of the enzyme-stabilized transition state. Advances in theory and computational chemistry have produced good models of stable molecules and enzymatic transition states from kinetic isotope effect experiments. However, the development of computational and theoretical methods for predicting the binding constant of a putative inhibitor prior to synthesis would facilitate the discovery of novel inhibitors. These methods could be used to search chemical libraries for transition state mimics.
There is a long history of the development of methods to predict the binding of biological agents to enzymes or receptors. The methods generally fall into one of two categories. The first is the use of docking or molecular dynamics studies to investigate the interactions of substrates with a variety of biological molecules, such as enzymes or receptor sites. The second is the use of Quantitative Structure Activity Relations (hereinafter "QSARs"), which usually investigate the properties of potential therapeutic agents in the absence of their biological target. Each of these methods has advantages and disadvantages.
The concept behind docking or molecular dynamics studies is that physical laws of motion or static interaction can be applied to biological systems to predict the strength of interaction of a substrate with a complex biological molecule. In general, biological macromolecules and their substrates form a system too large for ab initio quantum chemical methods to be used to generate electronic potential energies, and so parametrized classical force fields are employed, on which classical mechanics simulations can be run. This is a massive technology with an equally huge literature. A variety of algorithms have been developed that allow efficient integration of Newton's equations. In addition, Monte Carlo methods are of critical importance in this field. There have also been recent advances in mixing quantum and classical mechanics, such as the surface hopping methods in chemical physics. Because even this calculation is challenging for complex systems, static methods, which treat parts of a system as dielectric continua, have been employed. These approaches have been employed in docking studies, in which substrates are virtually oriented and bound to an active site of an enzyme or other biomolecule.
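As an illustration of what integrating Newton's equations involves, the following is a minimal sketch of the velocity Verlet scheme, one widely used integrator in classical simulations. It is not drawn from the works discussed here. The function name, the one-dimensional setting, and the harmonic restoring force used in the usage note are assumptions chosen for brevity; a real simulation would evaluate a parametrized force field over many atoms.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of size dt.

    `force` is any callable returning the force at position x.
    This is a one-dimensional, single-particle sketch (an assumption
    made for illustration), not a molecular dynamics engine.
    """
    a = force(x) / mass
    for _ in range(n_steps):
        # Position update uses the current velocity and acceleration.
        x = x + v * dt + 0.5 * a * dt * dt
        # Recompute the acceleration at the new position.
        a_new = force(x) / mass
        # Velocity update averages the old and new accelerations.
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
    return x, v
```

For example, calling `velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.01, 1000)` propagates a unit-mass harmonic oscillator for ten time units. The total energy stays close to its initial value, which is the property that makes this family of integrators attractive for long simulations.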
As important as these approaches are, there are still difficulties in the application of these approaches to drug design and analysis. First, there are a great many approximations inherent in the development of force field and dielectric continua models. Though these methods have been studied for many years, it is still difficult to know which approximations are appropriate. Second, these calculations are difficult to perform, even when the approximations are appropriately employed, and require significant computer resources. As a result, the use of such docking studies in searches of libraries of potential candidate drugs for interactions with a particular target enzyme is impractical. Third, a structure for the biomolecule of interest is required for these docking studies. If, for example, only a DNA sequence is available, these methods cannot be employed.
The second method for predicting biological activity, QSARs, focuses on the substrate molecule. The concept assumes that there is a database of experimental evidence from which inferences can be drawn as to the effectiveness of other molecules. Specific properties of substrate molecules, such as hydrophobicity, the presence of certain groups, steric parameters, etc., are empirically fit to experimentally determined biological activities. The assumption is that once this fitting is appropriately performed, an accurate prediction of biological activity of a previously untested molecule can be made by examining the molecule for the same properties. These predictions have become quite sophisticated, and the structural parameters used have been expanded to include quantum mechanical features of substrates such as electrostatic potential surfaces. For example, point-by-point comparison of quantum mechanical electrostatic potentials on molecular van der Waals surfaces has been used to predict inhibition constants for transition state inhibitors for the reactions catalyzed by AMP deaminase and AMP nucleosidase. While these approaches are powerful, there are a number of difficulties. First, one must be able to identify a specific feature that determines bioactivity. When multiple features are involved, it is difficult to determine how the interactions of these features affect bioactivity. Second, a single QSAR may not be able to identify bioactivity trends when a number of different mechanisms are present (as where, for example, an enzyme protonates some substrates but not others). Finally, in addition to the practical prediction of bioactivity among libraries of potentially bioactive compounds, it is desirable to develop a theoretical approach that can help identify the features of the candidate molecules that are important, and thereby help to elucidate unknown mechanisms in the bioactivity.
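The empirical fitting step of a QSAR can be sketched in its simplest form as a least-squares line relating one molecular descriptor to measured activity. The function names and the choice of a single descriptor with a linear model are assumptions made for illustration; published QSARs typically combine several descriptors and may use other functional forms.

```python
def fit_linear_qsar(descriptor_values, activities):
    """Least-squares fit of activity = slope * descriptor + intercept.

    Both arguments are equal-length sequences of numbers, one entry
    per molecule in the (hypothetical) training set.
    """
    n = len(descriptor_values)
    mean_x = sum(descriptor_values) / n
    mean_y = sum(activities) / n
    sxx = sum((x - mean_x) ** 2 for x in descriptor_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(descriptor_values, activities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_activity(model, descriptor_value):
    """Predict the activity of an untested molecule from its descriptor."""
    slope, intercept = model
    return slope * descriptor_value + intercept
```

A fit of this kind is only as good as the descriptor chosen, which is the first difficulty noted above: if the selected property does not govern bioactivity, or if different molecules act through different mechanisms, a single fitted relation cannot capture the trend.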
The present invention is based on prior work on molecular similarity measures which compare electrostatic potentials on the van der Waals surfaces of two different molecules. Two different molecules having similar electrostatic potentials have been found to have similar binding properties. Therefore, strong electrostatic similarity to an experimentally determined transition state should result in strong binding, and hence a powerful transition state inhibitor. The similarity measure is defined as follows: ##EQU1##
wherein $\epsilon_i^A$ is an electrostatic potential on molecule A at position i, $\epsilon_i^B$ is an electrostatic potential on molecule B at position i, $r_{ij}$ is a distance between points i and j on a surface, and $\alpha$ is a decay constant employed so that points that are very far apart do not strongly affect the similarity measure. The similarity measures were first applied to transition state inhibitors for the reactions catalyzed by AMP deaminase, adenosine deaminase, and AMP nucleosidase. Transition state structures for each enzyme were obtained by kinetic isotope effect experiments (Kline et al., J. Biol. Chem. 269:22385-22390 (1994); Ehrlich et al., Biochem. 33:8890-8896 (1994)). Electrostatic potentials were then calculated for the transition states, the substrates, and the putative inhibitors. These were obtained using the GAUSSIAN 94 quantum chemistry package (from Gaussian, Inc., Pittsburgh, Pa.). Minimal basis sets (STO-3G) were used in the initial studies, and the results were thereafter confirmed using higher order basis sets. The molecules were oriented with respect to each other to maximize geometric overlap, and the electrostatic similarities were calculated.
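The equation itself is not reproduced in this text, so the following is only a sketch of one measure consistent with the quantities just defined. It assumes a double sum over surface points of the two molecules, a Gaussian damping factor exp(-alpha * r_ij^2), and normalization by the two self-overlaps so that identical molecules score 1. The damping form, the normalization, and all function names are assumptions, not the definition given in the original equation.

```python
import math


def weighted_overlap(points_a, pots_a, points_b, pots_b, alpha):
    """Sum of potential products over all point pairs, damped by distance.

    points_* are sequences of (x, y, z) surface coordinates and pots_*
    the electrostatic potentials at those points. The Gaussian damping
    exp(-alpha * r**2) is an assumed form.
    """
    total = 0.0
    for pa, ea in zip(points_a, pots_a):
        for pb, eb in zip(points_b, pots_b):
            r2 = sum((x - y) ** 2 for x, y in zip(pa, pb))
            total += ea * eb * math.exp(-alpha * r2)
    return total


def electrostatic_similarity(points_a, pots_a, points_b, pots_b, alpha=1.0):
    """Normalized similarity: 1 for identical surfaces, -1 for sign-reversed."""
    cross = weighted_overlap(points_a, pots_a, points_b, pots_b, alpha)
    self_a = weighted_overlap(points_a, pots_a, points_a, pots_a, alpha)
    self_b = weighted_overlap(points_b, pots_b, points_b, pots_b, alpha)
    return cross / math.sqrt(self_a * self_b)
```

The molecules are assumed to have been aligned beforehand, as described above, since the result depends on the relative placement of the two sets of surface points.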
For these simple reactions, the calculated numerical similarity was a reasonable predictor of binding free energy. FIG. 1 shows a plot of experimentally determined binding free energy versus electrostatic similarity (Se) for the AMP nucleosidase reaction and for three transition state inhibitors. The three inhibitors fall reasonably close to the line connecting the points (Se, binding free energy) for the substrate AMP and for the transition state. Similarly, FIG. 2 shows a plot of experimentally determined binding free energy versus electrostatic similarity for adenosine deaminase. One transition state inhibitor, 1,6-dihydropurine ribonucleoside, is a significant outlier, but the remaining results are quite good.
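The linear relationship described for FIG. 1 can be sketched as a two-point interpolation: the substrate and the transition state fix a line, and the similarity of an inhibitor is read off against it. The function name and argument layout are hypothetical, and no values from the figures are reproduced here.

```python
def predict_binding_free_energy(se, substrate, transition_state):
    """Estimate binding free energy from similarity via a two-point line.

    `substrate` and `transition_state` are (similarity, free_energy)
    pairs that define the line; `se` is the similarity of the inhibitor
    being assessed. Units follow whatever the two reference points use.
    """
    (se_s, dg_s), (se_t, dg_t) = substrate, transition_state
    slope = (dg_t - dg_s) / (se_t - se_s)
    return dg_s + slope * (se - se_s)
```

With purely illustrative reference points of (0.5, -5.0) for a substrate and (1.0, -15.0) for a transition state, an inhibitor of similarity 0.75 would be assigned a free energy halfway between the two.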
Despite these encouraging initial results, it soon became apparent that the similarity measure does not accurately predict all cases of inhibitor-enzyme binding. The reason is that the Se treats all points on the van der Waals surface equivalently. It is thus quite possible to have a perfect configuration for binding in the region of an inhibitor molecule that interacts with the active site, but to have significant differences remote from this site. As a result, the similarity measure would predict weaker binding than the actual binding free energy indicates. Similarly, a molecule that initially looks very different from a transition state inhibitor could be changed electrostatically by its interaction with the enzyme, e.g., by protonation, to a form that might have a high binding free energy. Again, the similarity measure would predict weaker binding than would be measured experimentally. Furthermore, there are many reasonable algebraic similarity measures that may be applied in each case, and choosing the most appropriate measure would require extensive computational resources.
An artificial neural network is a computer algorithm which, during a training process, can learn features of input patterns and associate these with an output. After the learning phase is completed, the trained network enables the computer to predict an output for a pattern not included in the training process. Neural networks have been used in a small number of cases to study biological activity prediction. For example, Kohonen self-organizing maps have been used to transform the three-dimensional surface of biomolecules to a two-dimensional projection (Gasteiger et al., J. Am. Chem. Soc. 116:4608-4620 (1994)). Similarly, the molecular electrostatic potential at the van der Waals surface has been collapsed onto a series of 12 autocorrelation coefficients, and these were used in a neural network (Wagener et al., J. Am. Chem. Soc. 117:7769-7775 (1995)). In both these cases, potentially useful data were discarded as the three-dimensional surface information was converted to a two-dimensional representation. Neural networks have also been used to predict the mode of action of chemotherapeutic agents (Weinstein et al., Stem Cells 12:13-22 (1994)). Finally, neural networks have been used to predict biological activity from discrete QSAR descriptions of molecular structure (So and Richards, J. Med. Chem. 35:3201-3207 (1992)). However, this approach fails if the correct QSAR is not selected.
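The training and prediction phases described above can be sketched with a very small feedforward network trained by backpropagation. This is a generic illustration, not the architecture of any of the cited works: the single tanh hidden layer, the linear output, the learning rate, and all names are assumptions, and the input patterns stand in for whatever molecular descriptors are supplied.

```python
import math
import random


def init_network(n_in, n_hidden, seed=0):
    """Random small weights; the last entry of each row is a bias term."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def forward(net, x):
    """Return the hidden activations and the scalar output for pattern x."""
    w1, w2 = net
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + row[-1])
              for row in w1]
    output = sum(w * h for w, h in zip(w2, hidden)) + w2[-1]
    return hidden, output


def mean_squared_error(net, patterns, targets):
    return sum((forward(net, x)[1] - t) ** 2
               for x, t in zip(patterns, targets)) / len(patterns)


def train(net, patterns, targets, rate=0.05, epochs=2000):
    """Per-pattern gradient descent on squared error (weights updated in place)."""
    w1, w2 = net
    for _ in range(epochs):
        for x, t in zip(patterns, targets):
            hidden, out = forward(net, x)
            err = out - t
            # Hidden-layer error terms, computed before w2 is changed.
            deltas = [err * w2[j] * (1.0 - h * h)
                      for j, h in enumerate(hidden)]
            for j, h in enumerate(hidden):
                w2[j] -= rate * err * h
            w2[-1] -= rate * err
            for j, delta in enumerate(deltas):
                for i, xi in enumerate(x):
                    w1[j][i] -= rate * delta * xi
                w1[j][-1] -= rate * delta
    return net
```

After `train` has been run on a set of patterns with known outputs, `forward(net, x)[1]` gives the prediction of the network for a pattern `x` that was not part of the training set.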
It is therefore desirable to have a method that can accurately predict binding free energy for a wide variety of potential inhibitors. It is also desirable to have a method for determination of binding free energy that identifies those regions of a potential inhibitor or other bioactive molecule that are especially important in binding, and thereby helps elucidate unknown binding features. Furthermore, it is desirable to have a method for determining binding free energy that would adjust itself in each case to the form most suited to that particular case.